PMCRI: a parallel modular classification rule induction framework

Stahl, F., Bramer, M. and Adda, M. (2009) PMCRI: a parallel modular classification rule induction framework. In: Machine Learning and Data Mining in Pattern Recognition. Lecture Notes in Computer Science (5632). Springer, pp. 148-162. ISBN 9783642030697

Full text not archived in this repository.

It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.1007/978-3-642-03070-3_12

Abstract/Summary

In a world where massive amounts of data are recorded on a large scale, we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However, alternative technologies have been derived that often produce better rules but do not scale well on large datasets. One such alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale-up results of a first prototype implementation.

Item Type: Book or Report Section
Refereed: Yes
Divisions: Faculty of Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 30146
Additional Information: Proceedings of the 6th International Conference, MLDM 2009, Leipzig, Germany, July 23-25, 2009.
Publisher: Springer
