A text mining framework for Big Data

Pavlopoulou, N., Abushwashi, A., Stahl, F. and Scibetta, V. (2017) A text mining framework for Big Data. Expert Update, 17 (1). ISSN 1465-4091 (Special Issue on the 1st BCS SGAI Workshop on Data Stream Mining Techniques and Applications)

Official URL: http://www.expertupdate.org/

Abstract/Summary

Text Mining is the ability to generate knowledge (insight) from text. This is a challenging task, especially when the target text databases are very large. Big Data has attracted much attention lately, both from academia and industry. A number of distributed databases, search engines and frameworks have been developed to handle the memory and time constraints, which are required to process a large amount of data. However, there is no open-source end-to-end framework that can combine near real-time and batch processing of ingested big textual data along with user-defined options and provision of specific, reliable insight from the data. This is important as this way new unstructured information is made accessible in near real-time, more personalised customer products can be created and novel unusual patterns can be found and actioned on quickly. This work focuses on a proprietary complete near real-time automated classification framework for unstructured data with the use of Natural Language Processing and Machine Learning algorithms on Apache Spark. The evaluation of our framework shows that it achieves a comparable accuracy with respect to some of the best approaches presented in the literature.

Item Type: Article
Refereed: Yes
Divisions: Faculty of Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
ID Code: 70108
Publisher: BCS Specialist Group on Artificial Intelligence
