A Map Reduce Approach of K-Means++ Algorithm with Initial Equidistant Centers

Bhattacharyya, Krittika

Abstract

Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation and pattern classification. The growing volumes of information produced by technological progress make clustering very large-scale data a challenging task. To deal with this problem, many researchers have tried to design efficient parallel clustering algorithms. In this paper, we propose a parallel k-means++ clustering algorithm based on MapReduce, which is as simple as traditional k-means, yet more powerful because the initial centroid selection process is not random. It follows a formula to place the initial centroids at equal distances and then iterates like k-means until it converges and produces the final clusters. This makes the algorithm faster, and parallelizing it makes it more scalable. The experimental results demonstrate that the proposed algorithm scales well and processes large datasets efficiently.
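The general shape of such an algorithm can be illustrated with a minimal single-process sketch in Python. It is not the implementation from the paper: the abstract does not state the formula for the equidistant centers, so `equidistant_centers` below is a hypothetical stand-in that spaces centers evenly along the diagonal of the data's bounding box. The function names `map_phase`, `reduce_phase` and `kmeans` are likewise illustrative, and the map and reduce steps run in memory rather than on a MapReduce cluster.

```python
from collections import defaultdict


def equidistant_centers(points, k):
    """Hypothetical initializer (an assumption, not the paper's formula):
    k centers evenly spaced along the diagonal of the bounding box."""
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    return [
        tuple(lo[d] + (i + 0.5) / k * (hi[d] - lo[d]) for d in range(dims))
        for i in range(k)
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_phase(points, centers):
    """Map: emit (index of nearest center, (point, 1)) for every point."""
    for p in points:
        idx = min(range(len(centers)),
                  key=lambda i: squared_distance(p, centers[i]))
        yield idx, (p, 1)


def reduce_phase(pairs, centers):
    """Reduce: average the points emitted for each center index.
    A center that received no points keeps its previous position."""
    sums = {}
    counts = defaultdict(int)
    for idx, (p, n) in pairs:
        if idx in sums:
            sums[idx] = [s + x for s, x in zip(sums[idx], p)]
        else:
            sums[idx] = list(p)
        counts[idx] += n
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centers[i]
        for i in range(len(centers))
    ]


def kmeans(points, k, max_iter=100, tol=1e-9):
    """Repeat map and reduce until no center moves by more than tol
    (measured as squared distance) or max_iter is reached."""
    centers = equidistant_centers(points, k)
    for _ in range(max_iter):
        new_centers = reduce_phase(map_phase(points, centers), centers)
        shift = max(squared_distance(a, b)
                    for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tol:
            break
    return centers


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(kmeans(sample, 2))
```

In a real MapReduce job the map step would run on separate partitions of the data and the framework would group the emitted pairs by center index before the reduce step. Because the initializer is deterministic, repeated runs on the same data give the same clusters, unlike randomly seeded k-means.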

URI

http://hdl.handle.net/10365/25350

Collections

Computer Science Masters Papers