Epidemic K-Means clustering

Di Fatta, G., Blasa, F., Cafiero, S. and Fortino, G. (2011) Epidemic K-Means clustering. In: 2011 IEEE 11th International Conference on Data Mining Workshops (ICDMW), 2011, Vancouver, BC, pp. 151-158.

Full text not archived in this repository.

To link to this article DOI: 10.1109/ICDMW.2011.76

Abstract/Summary

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. This work proposes a fully decentralised algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state-of-the-art distributed K-Means algorithms based on sampling methods. The experimental analysis confirms that the proposed algorithm is a practical and accurate distributed K-Means implementation for networked systems of very large and extreme scale.
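
The idea of replacing global communication with gossip-based aggregation (listed in the keywords below) can be illustrated with a minimal sketch. This is not the authors' algorithm, which is not reproduced in this record. It assumes a generic push-pull averaging protocol over a fully connected set of at least two nodes with no failures, and every name in it (`local_stats`, `gossip_average`, `gossip_kmeans_step`) is hypothetical. Each node computes per-cluster sums and counts from its own points, the nodes average these vectors pairwise with random peers, and each node then divides averaged sums by averaged counts to obtain its own estimate of the centroids that a centralised K-Means update would produce.

```python
import random


def local_stats(points, centroids):
    """Per-cluster coordinate sums and counts for one node's local points."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0.0] * k
    for p in points:
        # Assign the point to its nearest centroid (squared Euclidean distance).
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        counts[j] += 1.0
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def gossip_average(states, rounds, rng):
    """Push-pull averaging: every node averages its vector with a random peer.

    Assumes at least two nodes. Pairwise averaging conserves the total, so
    all vectors drift towards the global mean of the initial vectors.
    """
    n = len(states)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # skip i so a node never gossips with itself
            avg = [(a + b) / 2.0 for a, b in zip(states[i], states[j])]
            states[i] = avg
            states[j] = list(avg)
    return states


def gossip_kmeans_step(node_points, centroids, rounds=30, seed=0):
    """One K-Means update in which each node estimates the global centroids."""
    rng = random.Random(seed)
    k, dim = len(centroids), len(centroids[0])
    states = []
    for pts in node_points:
        sums, counts = local_stats(pts, centroids)
        # Flatten into one vector: k*dim coordinate sums, then k counts.
        states.append([x for s in sums for x in s] + counts)
    gossip_average(states, rounds, rng)
    estimates = []
    for st in states:
        cents = []
        for c in range(k):
            cnt = st[k * dim + c]
            if cnt > 0:
                # Averaged sum / averaged count equals global sum / global count.
                cents.append([st[c * dim + d] / cnt for d in range(dim)])
            else:
                cents.append(list(centroids[c]))  # empty cluster: keep old centroid
        estimates.append(cents)
    return estimates


if __name__ == "__main__":
    # Four nodes holding one-dimensional points, two initial centroids.
    data = [[[1.0], [2.0]], [[9.0]], [[11.0], [0.0]], [[12.0]]]
    for node_id, cents in enumerate(gossip_kmeans_step(data, [[0.0], [10.0]])):
        print(node_id, cents)
```

In this sketch the number of gossip rounds controls how closely each node's estimate matches the centralised update, which mirrors the abstract's claim that the centralised solution can be approximated as closely as desired. A full clustering run would repeat the step until the centroids stop moving, and a realistic deployment would also need to handle node failures and restricted peer sampling, both of which are omitted here.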

Item Type: Conference or Workshop Item (Paper)
Refereed: Yes
Divisions: Faculty of Science > School of Systems Engineering
ID Code: 27053
Uncontrolled Keywords: Distributed clustering, K-Means, epidemic protocols, gossip-based aggregation, peer-to-peer data mining
Additional Information: Print ISBN: 9781467300056; Issue Date: 11-11 Dec. 2011; On page(s): 151-158
Publisher: IEEE
