A novel cloud based elastic framework for big data preprocessing

Dawelbeit, O. and McCrindle, R. (2014) A novel cloud based elastic framework for big data preprocessing. In: 6th Computer Science and Electronic Engineering Conference (CEEC), 2014, September 25-26, Essex, UK.

Text - Accepted Version

It is advisable to refer to the publisher's version if you intend to cite from this work.

To link to this item DOI: 10.1109/CEEC.2014.6958549


A number of analytical big data services based on the cloud computing paradigm, such as Amazon Redshift and Google BigQuery, have recently emerged. These services are based on columnar databases rather than traditional Relational Database Management Systems (RDBMS) and are able to analyse massive datasets in mere seconds. This has led many organisations to retain and analyse their massive logs, sensory or marketing datasets, which were previously discarded because they could be neither stored nor analysed. Although these big data services have addressed the issue of big data analysis, efficiently de-normalising and preparing this data in a format that can be imported into these services remains a challenge. This paper describes and implements a novel, generic and scalable cloud-based elastic framework for Big Data Preprocessing (BDP). Since the approach described in this paper is based entirely on cloud computing, it is also possible to measure the overall cost incurred by these preprocessing activities.

Item Type: Conference or Workshop Item (Paper)
Divisions: Life Sciences > School of Biological Sciences > Department of Bio-Engineering
ID Code: 69738
Uncontrolled Keywords: Big data, Cloud computing, Program processors, Google, Runtime, Educational institutions, Computer science

