Efficient clustering techniques on Hadoop and Spark

Al Ghamdi, S. and Di Fatta, G. (2019) Efficient clustering techniques on Hadoop and Spark. International Journal of Big Data Intelligence, 6 (3/4). pp. 269-290. ISSN 2053-1389

Text - Accepted Version
· Restricted to Repository staff only until 5 June 2020.

617kB

It is advisable to refer to the publisher's version if you intend to cite from this work. See Guidance on citing.

To link to this item DOI: 10.1504/IJBDI.2019.10018592

Abstract/Summary

Software services based on large-scale distributed systems demand continuous and decentralised solutions for achieving system consistency and providing operational monitoring. Epidemic data aggregation algorithms provide decentralised, scalable and fault-tolerant solutions that can be used for system-wide tasks such as global state determination, monitoring and consensus. Existing continuous epidemic algorithms either restart periodically at fixed epochs or apply changes in the system state instantly, producing a less accurate approximation. This work introduces an innovative mechanism without fixed epochs that monitors the system state and restarts upon detecting system convergence or divergence. The mechanism achieves correct aggregation with an approximation error as small as desired. The proposed solution is validated and analysed by means of simulations under static and dynamic network conditions.

Item Type: Article
Refereed: Yes
Divisions: Faculty of Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 86456
