Efficient dictionary compression for processing RDF big data using Google BigQuery

Dawelbeit, O. and McCrindle, R. (2017) Efficient dictionary compression for processing RDF big data using Google BigQuery. In: IEEE GLOBECOM 2016, December 4-8th 2016, Washington DC.

It is advisable to refer to the publisher's version if you intend to cite from this work.

Official URL: http://doi.org/10.1109/GLOCOM.2016.7841775

Abstract/Summary

The Resource Description Framework (RDF) data model is used on the Web to express billions of structured statements on a wide range of topics, including government, publications, and life sciences. Consequently, processing and storing this data requires high-specification systems, in terms of both storage and computational capabilities. On the other hand, cloud-based big data services such as Google BigQuery can be used to store and query this data without any upfront investment. Google BigQuery pricing is based on the size of the data being stored or queried, but because RDF statements contain long Uniform Resource Identifiers (URIs), the cost of querying and storing RDF big data can increase rapidly. In this paper we present and evaluate a novel and efficient dictionary compression algorithm that is faster, generates small dictionaries that can fit in memory, and achieves a better compression rate than other large-scale RDF dictionary compression approaches. Consequently, our algorithm also reduces the BigQuery storage and query cost.
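To illustrate the general idea behind dictionary compression of RDF data, the following is a minimal sketch of plain dictionary encoding, in which each distinct term is replaced by a small integer identifier. It is not the algorithm presented in the paper, whose details are not given in this record; the function names `encode_triples` and `decode_triples` and the in-memory Python dictionary are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


def encode_triples(triples: List[Triple]) -> Tuple[List[EncodedTriple], Dict[str, int]]:
    """Replace every RDF term with an integer ID (hypothetical helper).

    IDs are assigned in order of first appearance, so a long URI that
    occurs many times is stored once in the dictionary and referenced
    by a short integer everywhere else.
    """
    dictionary: Dict[str, int] = {}

    def term_id(term: str) -> int:
        # Assign the next free integer the first time a term is seen.
        if term not in dictionary:
            dictionary[term] = len(dictionary)
        return dictionary[term]

    encoded = [(term_id(s), term_id(p), term_id(o)) for s, p, o in triples]
    return encoded, dictionary


def decode_triples(encoded: List[EncodedTriple], dictionary: Dict[str, int]) -> List[Triple]:
    """Invert the encoding by looking IDs up in the reversed dictionary."""
    reverse = {term_id: term for term, term_id in dictionary.items()}
    return [(reverse[s], reverse[p], reverse[o]) for s, p, o in encoded]
```

Because the encoded statements hold integers rather than full URIs, the volume of data stored and scanned per query shrinks, which is the property that matters under size-based pricing.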

Item Type: Conference or Workshop Item (Paper)
Refereed: Yes
Divisions: Life Sciences > School of Biological Sciences > Department of Bio-Engineering
ID Code: 69737
