Epidemic K-Means clustering

Di Fatta, Giuseppe; Blasa, Francesco; Cafiero, Simone; Fortino, Giancarlo

Download

Full text not archived in this repository.

Di Fatta, G., Blasa, F., Cafiero, S. and Fortino, G. (2011) Epidemic K-Means clustering. In: 2011 IEEE 11th International Conference on Data Mining Workshops (ICDMW), 2011, Vancouver, BC, pp. 151-158. doi: 10.1109/ICDMW.2011.76

Abstract/Summary

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. This work proposes a fully decentralised algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state-of-the-art distributed K-Means algorithms based on sampling methods. The experimental analysis confirms that the proposed algorithm is a practical and accurate distributed K-Means implementation for networked systems of very large and extreme scale.
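
This record carries only the abstract, so the paper's actual protocol is not reproduced here. As a rough illustration of the general idea (K-Means driven by gossip-based aggregation instead of a global reduction), the Python sketch below simulates nodes that each hold local points, compute local per-cluster sums and counts, and then repeatedly average those statistics with random peers. Everything in it is an assumption made for illustration: the function names (local_stats, gossip_average, update_centroids, epidemic_kmeans_sketch), the push-pull averaging scheme, the synchronous single-process simulation, and the absence of node or link failures. It should not be read as the authors' algorithm.

```python
import random


def local_stats(points, centroids):
    """Per-cluster coordinate sums and counts for one node's local points."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0.0] * k
    for p in points:
        # Assign the point to its nearest centroid (squared Euclidean distance).
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
        counts[j] += 1.0
        for d in range(dim):
            sums[j][d] += p[d]
    # Flatten to one vector so gossip can average it component-wise.
    return [x for row in sums for x in row] + counts


def gossip_average(vectors, rounds, rng):
    """Assumed push-pull averaging: each node averages its state with a random peer.

    Requires at least two nodes. Pairwise averaging keeps the network-wide sum
    of the states unchanged, so every state drifts towards the global mean.
    """
    state = [list(v) for v in vectors]
    n = len(state)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # pick a peer different from i
            avg = [(a + b) / 2.0 for a, b in zip(state[i], state[j])]
            state[i], state[j] = avg, list(avg)
    return state


def update_centroids(vec, old, k, dim):
    """Turn averaged sums and counts into centroids; keep the old one if empty."""
    counts = vec[k * dim:]
    new = []
    for c in range(k):
        if counts[c] > 0:
            # mean(sums) / mean(counts) equals global sum / global count.
            new.append([vec[c * dim + d] / counts[c] for d in range(dim)])
        else:
            new.append(list(old[c]))
    return new


def epidemic_kmeans_sketch(node_data, init_centroids, iterations=10, rounds=30, seed=0):
    """Simulate decentralised K-Means; returns each node's own centroid estimate."""
    rng = random.Random(seed)
    k, dim = len(init_centroids), len(init_centroids[0])
    node_centroids = [[list(c) for c in init_centroids] for _ in node_data]
    for _ in range(iterations):
        stats = [local_stats(pts, cents)
                 for pts, cents in zip(node_data, node_centroids)]
        estimates = gossip_average(stats, rounds, rng)
        node_centroids = [update_centroids(est, cents, k, dim)
                          for est, cents in zip(estimates, node_centroids)]
    return node_centroids
```

Calling epidemic_kmeans_sketch with a list of per-node point lists and a shared initial centroid list returns one centroid set per node. In this simulation, raising rounds brings each node's estimate closer to what a centralised K-Means step would compute over the pooled data, which is the accuracy trade-off the abstract describes.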

Additional Information	Print ISBN: 9781467300056; Issue Date: 11-11 Dec. 2011; On page(s): 151-158
Item Type	Conference or Workshop Item (Paper)
URI	https://centaur.reading.ac.uk/id/eprint/27053
Identification Number/DOI	10.1109/ICDMW.2011.76
Refereed	Yes
Divisions	Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Uncontrolled Keywords	Distributed clustering, K-Means, epidemic protocols, gossip-based aggregation, peer-to-peer data mining
Publisher	IEEE

Deposit Details

Date Deposited:	14 Mar 2012 10:34
Last Modified:	20 Jan 2026 12:41