dc.contributor.author: Bhattacharyya, Krittika
dc.description.abstract: Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation, and pattern classification. The growing volume of information produced by advances in technology makes clustering very large datasets a challenging task. To deal with this problem, many researchers have tried to design efficient parallel clustering algorithms. In this paper, we propose a parallel k-means++ clustering algorithm based on MapReduce, which is as simple as traditional k-means, yet more powerful because the initial centroid selection process is not random. It uses a formula to place the initial centroids at equal distances and then iterates like k-means until it converges and produces the final clusters. This initialization makes the algorithm faster, and parallelization makes it more scalable. The experimental results demonstrate that the proposed algorithm scales well and processes large datasets efficiently. [en_US]
dc.publisher: North Dakota State University [en_US]
dc.rights: NDSU Policy 190.6.2
dc.title: A Map Reduce Approach of K-Means++ Algorithm with Initial Equidistant Centers [en_US]
dc.type: Master's paper [en_US]
dc.date.accessioned: 2015-11-09T15:23:18Z
dc.date.available: 2015-11-09T15:23:18Z
dc.date.issued: 2015
dc.identifier.uri: http://hdl.handle.net/10365/25350
dc.subject.lcsh: Cluster analysis. [en_US]
dc.subject.lcsh: Big data. [en_US]
dc.subject.lcsh: Computer algorithms. [en_US]
dc.rights.uri: https://www.ndsu.edu/fileadmin/policy/190.pdf
ndsu.degree: Master of Science (MS) [en_US]
ndsu.college: Engineering [en_US]
ndsu.department: Computer Science [en_US]
ndsu.program: Computer Science [en_US]
ndsu.advisor: Ludwig, Simone
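
The abstract describes k-means iterations driven by map and reduce steps, started from equally spaced centroids rather than random ones. The following is only a rough single-process sketch of that idea and not the paper's implementation. The record does not state the placement formula, so `equidistant_centers` (evenly spaced points along the bounding-box diagonal) is an assumption, and all function names are hypothetical.

```python
from collections import defaultdict


def equidistant_centers(points, k):
    """ASSUMPTION: the paper's formula is not given in this record.
    Here k centers are spaced evenly along the bounding-box diagonal."""
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    # (i + 0.5) / k puts center i in the middle of the i-th of k equal segments.
    return [
        tuple(lo[d] + (i + 0.5) / k * (hi[d] - lo[d]) for d in range(dims))
        for i in range(k)
    ]


def nearest(point, centers):
    """Index of the center with the smallest squared Euclidean distance."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centers[i])),
    )


def map_phase(chunk, centers):
    """Mapper: emit (center index, point) for each point in one data chunk."""
    for p in chunk:
        yield nearest(p, centers), p


def reduce_phase(pairs, centers):
    """Reducer: average the points assigned to each center index."""
    sums = {}
    counts = defaultdict(int)
    for idx, p in pairs:
        sums[idx] = tuple(a + b for a, b in zip(sums[idx], p)) if idx in sums else p
        counts[idx] += 1
    new_centers = list(centers)  # a center with no points keeps its position
    for idx, total in sums.items():
        new_centers[idx] = tuple(s / counts[idx] for s in total)
    return new_centers


def kmeans(chunks, k, max_iter=100):
    """Repeat map and reduce until the centers stop changing."""
    points = [p for chunk in chunks for p in chunk]
    centers = equidistant_centers(points, k)
    for _ in range(max_iter):
        pairs = [kv for chunk in chunks for kv in map_phase(chunk, centers)]
        new_centers = reduce_phase(pairs, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    # Two chunks stand in for two input splits.
    chunks = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 10.0), (10.0, 11.0)]]
    print(kmeans(chunks, k=2))
```

In a real MapReduce job each chunk would be an input split processed on a separate node, and a combiner would typically pre-aggregate partial sums before the reduce step; the sketch runs everything in one process for readability.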

