2013 IEEE 29th International Conference on Data Engineering Workshops (ICDEW 2013)

Abstract

MapReduce has emerged as a key component of large-scale data analysis in the cloud. However, it presents challenges for SPARQL query processing because it lacks traditional join optimization machinery such as statistics and indexes, as well as techniques for translating join-intensive workloads into efficient MapReduce workflows. Further, MapReduce is primarily a batch processing paradigm. It is therefore plausible that many workloads will include a batch of queries, or that new queries will be generated from given ones, e.g., through rewriting of inferencing queries. Consequently, multi-query optimization deserves attention, and this paper lays out a vision for rule-based multi-query optimization built on a recently proposed data model and algebra, the Nested TripleGroup Data Model and Algebra, for efficient SPARQL query processing on MapReduce.