Optimisation techniques for parallel K-Means on MapReduce

Al Ghamdi, Sami; Di Fatta, Giuseppe; Stahl, Frederic

Al Ghamdi, S., Di Fatta, G. and Stahl, F. ORCID: https://orcid.org/0000-0002-4860-0203 (2015) Optimisation techniques for parallel K-Means on MapReduce. In: Proceedings of the 8th International Conference on Internet and Distributed Computing Systems - Volume 9258, pp. 193-200.

Abstract/Summary

The K-Means algorithm is one of the most efficient and widely used algorithms for clustering data. However, K-Means slows down as datasets grow. Moreover, the rapid increase in the size of data has motivated the scientific and industrial communities to develop novel technologies that meet the needs of storing, managing, and analysing large-scale datasets known as Big Data. This paper describes the implementation of parallel K-Means on the MapReduce framework, a distributed framework best known for its reliability in processing large-scale datasets. It also presents a detailed analysis of the effect of distance computations on the performance of K-Means on MapReduce. Finally, two optimisation techniques are suggested that accelerate K-Means on MapReduce by reducing the number of distance computations per iteration while still producing the same deterministic results.
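
To make the role of distance computations concrete, the following is a minimal single-process sketch of one K-Means iteration expressed as a map step and a reduce step. It is not the authors' implementation, and it does not include the paper's two optimisation techniques, which this record does not describe. The function names (`map_assign`, `reduce_update`, `kmeans_iteration`) and the in-memory "shuffle" are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance; one call counts as one distance computation."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_assign(point: Point, centroids: List[Point]) -> Tuple[int, Tuple[Point, int]]:
    """Map step (hypothetical name): emit (nearest centroid index, (point, 1)).

    Without any optimisation this compares the point against all k centroids.
    """
    nearest = min(
        range(len(centroids)),
        key=lambda i: squared_distance(point, centroids[i]),
    )
    return nearest, (point, 1)


def reduce_update(values: Iterable[Tuple[Point, int]]) -> Point:
    """Reduce step (hypothetical name): average the points assigned to one centroid."""
    total: List[float] = []
    count = 0
    for point, n in values:
        if not total:
            total = [0.0] * len(point)
        for d, coordinate in enumerate(point):
            total[d] += coordinate
        count += n
    return tuple(s / count for s in total)


def kmeans_iteration(points: List[Point], centroids: List[Point]) -> List[Point]:
    """One iteration: map every point, group by key, then reduce each group."""
    groups: Dict[int, List[Tuple[Point, int]]] = defaultdict(list)
    for point in points:  # stands in for the distributed map phase
        key, value = map_assign(point, centroids)
        groups[key].append(value)  # stands in for the shuffle phase
    new_centroids = list(centroids)  # centroids with no points stay unchanged
    for key, values in groups.items():
        new_centroids[key] = reduce_update(values)
    return new_centroids
```

In this sketch each call to `map_assign` performs k distance computations, so a full iteration over n points costs n × k of them. That per-iteration cost is what the abstract identifies as the target of the optimisation.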

Item Type	Conference or Workshop Item (Paper)
URI	https://centaur.reading.ac.uk/id/eprint/68356
Official URL	http://dx.doi.org/10.1007/978-3-319-23237-9_17
Refereed	Yes
Divisions	Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Uncontrolled Keywords	Clustering, K-Means, MapReduce, Parallel K-Means
Publisher	Springer-Verlag New York, Inc.

Deposit Details

Date Deposited:	19 Dec 2016 14:13
Last Modified:	23 Jan 2024 19:55