Real-time feature selection technique with concept drift detection using adaptive micro-clusters for data stream mining

Hammoodi, M. S., Stahl, F. ORCID: https://orcid.org/0000-0002-4860-0203 and Badii, A. (2018) Real-time feature selection technique with concept drift detection using adaptive micro-clusters for data stream mining. Knowledge-Based Systems, 161. pp. 205-239. ISSN 0950-7051

Text - Accepted Version, available under a Creative Commons Attribution Non-commercial No Derivatives licence.

It is advisable to refer to the publisher's version if you intend to cite from this work.

DOI: 10.1016/j.knosys.2018.08.007

Abstract/Summary

Data streams are unbounded, sequential data instances that are generated at high velocity. Classifying sequential data instances is a very challenging problem in machine learning, with applications in network intrusion detection, financial markets and real-time situation assessment based on sensor networks. Data stream classification is concerned with the automatic labelling of unseen instances from the stream in real time. For this, the classifier needs to adapt to concept drift and can make only a single pass through the data if the stream is fast moving. This paper presents work on a real-time pre-processing technique, in particular feature tracking. The feature tracking technique is designed to improve Data Stream Mining (DSM) classification algorithms by enabling and optimising real-time feature selection. The technique is based on tracking adaptive statistical summaries of the data and class label distributions, known as Micro-Clusters. Currently, the technique is able to detect concept drifts and identify which features have been influential in the drift.
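
To make the idea of a statistical summary concrete, the following is a minimal sketch of a micro-cluster-style summary, not the method published in the paper. It assumes a summary that keeps a count, a per-feature linear sum and a per-feature squared sum for one class label, and it ranks features by how far the centroid moved between two summaries. The names `MicroCluster`, `absorb` and `rank_features_by_shift` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class MicroCluster:
    """Hypothetical summary of the instances seen for one class label."""
    label: str
    n_features: int
    n: int = 0
    linear_sum: List[float] = field(default_factory=list)
    squared_sum: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start with zeroed per-feature sums.
        self.linear_sum = [0.0] * self.n_features
        self.squared_sum = [0.0] * self.n_features

    def absorb(self, x: Sequence[float]) -> None:
        """Update the summary with one instance in a single pass."""
        self.n += 1
        for i, value in enumerate(x):
            self.linear_sum[i] += value
            self.squared_sum[i] += value * value

    def centroid(self) -> List[float]:
        """Per-feature mean of the absorbed instances."""
        return [s / self.n for s in self.linear_sum]

    def variance(self) -> List[float]:
        """Per-feature variance, computed from the sums alone."""
        return [
            max(ss / self.n - (ls / self.n) ** 2, 0.0)
            for ls, ss in zip(self.linear_sum, self.squared_sum)
        ]


def rank_features_by_shift(old: MicroCluster, new: MicroCluster) -> List[int]:
    """Order feature indices by absolute centroid shift, largest first.

    This is an illustrative assumption, not the drift measure from the paper.
    """
    shifts = [abs(b - a) for a, b in zip(old.centroid(), new.centroid())]
    return sorted(range(len(shifts)), key=lambda i: shifts[i], reverse=True)


# Usage: feature 1 moves between the two windows, feature 0 does not.
earlier = MicroCluster(label="a", n_features=2)
for instance in ([1.0, 5.0], [3.0, 5.0]):
    earlier.absorb(instance)

later = MicroCluster(label="a", n_features=2)
for instance in ([2.0, 9.0], [2.0, 11.0]):
    later.absorb(instance)

ranking = rank_features_by_shift(earlier, later)
```

Because the summary stores only sums, each instance is processed once and then discarded, which matches the single-pass constraint described in the abstract.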

Item Type: Article
Refereed: Yes
Divisions: Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 78678
Uncontrolled Keywords: Data Stream Mining, real-time Feature Selection, Concept Drift Detection
Publisher: Elsevier
