H-SPOOL: A SPARQL-based ETL framework for OLAP over linked data with dimension hierarchy extraction

Takahiro Komamizu (University of Tsukuba, Tsukuba, Japan)
Toshiyuki Amagasa (University of Tsukuba, Tsukuba, Japan)
Hiroyuki Kitagawa (University of Tsukuba, Tsukuba, Japan)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 15 August 2016

Abstract

Purpose

Linked data (LD) has promoted both the publishing of information and the linking of published information. A growing number of LD data sets contain numerical data such as statistics. For this reason, analyzing numerical facts in LD has attracted attention from diverse domains. This paper aims to support analytical processing of LD.

Design/methodology/approach

This paper proposes a framework called H-SPOOL, which provides a series of SPARQL (SPARQL Protocol and RDF Query Language) queries that extract objects and attributes from LD data sets, converts them into star/snowflake schemas and materializes the relevant triples as fact and dimension tables for online analytical processing (OLAP).
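The pipeline described above can be pictured with a small sketch. This is not the H-SPOOL implementation: it is a toy illustration that assumes the triples have already been fetched (for example, as results of SPARQL queries) and sit in an in-memory list. All identifiers (`ex:Observation`, `ex:region`, `ex:partOf`, `build_star_schema`, `extract_hierarchy`) and all values are hypothetical. Numeric objects are treated as measures, non-numeric objects as dimension members, and a parent predicate supplies one level of dimension hierarchy, which is what turns a star schema into a snowflake schema.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Toy triples (subject, predicate, object); names and values are hypothetical.
TRIPLES = [
    ("obs1", RDF_TYPE, "ex:Observation"),
    ("obs1", "ex:region", "ex:Tsukuba"),
    ("obs1", "ex:year", "ex:Y2015"),
    ("obs1", "ex:population", 100),
    ("obs2", RDF_TYPE, "ex:Observation"),
    ("obs2", "ex:region", "ex:Tokyo"),
    ("obs2", "ex:year", "ex:Y2015"),
    ("obs2", "ex:population", 250),
    ("ex:Tsukuba", "ex:partOf", "ex:Ibaraki"),
    ("ex:Tokyo", "ex:partOf", "ex:Kanto"),
]


def is_measure(value):
    """Assumption: numeric objects are measures, everything else is a dimension."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def build_star_schema(triples, fact_class):
    """Return (fact_table, dimensions) for subjects typed as fact_class."""
    fact_ids = {s for s, p, o in triples if p == RDF_TYPE and o == fact_class}
    fact_table = {fact_id: {} for fact_id in fact_ids}
    dimensions = defaultdict(set)
    for s, p, o in triples:
        if s not in fact_ids or p == RDF_TYPE:
            continue
        fact_table[s][p] = o           # one column per predicate
        if not is_measure(o):
            dimensions[p].add(o)       # collect dimension members
    return fact_table, dict(dimensions)


def extract_hierarchy(triples, members, parent_predicate):
    """Map each dimension member to its parent level, if one is stated."""
    return {s: o for s, p, o in triples
            if p == parent_predicate and s in members}


fact_table, dimensions = build_star_schema(TRIPLES, "ex:Observation")
region_parents = extract_hierarchy(TRIPLES, dimensions["ex:region"], "ex:partOf")

print(fact_table["obs1"])
print(sorted(dimensions["ex:region"]))
print(region_parents)
```

In this sketch the fact table holds one row per observation, `dimensions` plays the role of the dimension tables, and `region_parents` is a single extracted hierarchy level that could be used for roll-up.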

Findings

The applicability of H-SPOOL is evaluated using existing LD data sets on the Web, and H-SPOOL successfully performs ETL (extract, transform and load) on these data sets for OLAP. In addition, experiments show that H-SPOOL reduces the number of downloaded triples compared with an existing approach.

Originality/value

H-SPOOL is the first work to extract OLAP-related information from SPARQL endpoints, and it drastically reduces the number of downloaded triples.

Acknowledgements

This research was partly supported by the program Research and Development on Real World Big Data Integration and Analysis of the Ministry of Education, Culture, Sports, Science and Technology, Japan.

Citation

Komamizu, T., Amagasa, T. and Kitagawa, H. (2016), "H-SPOOL: A SPARQL-based ETL framework for OLAP over linked data with dimension hierarchy extraction", International Journal of Web Information Systems, Vol. 12 No. 3, pp. 359-378. https://doi.org/10.1108/IJWIS-03-2016-0014

Publisher

Emerald Group Publishing Limited

Copyright © 2016, Emerald Group Publishing Limited
