Identifying and determining SPARQL endpoint characteristics

Johannes Lorey (Information Systems Group, Hasso Plattner Institute, Potsdam, Germany)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 12 August 2014

Abstract

Purpose

The purpose of this study is to introduce several metrics that enable universal and fine-grained characterization of arbitrary Linked Data repositories. Publicly accessible SPARQL endpoints contain vast amounts of knowledge from a large variety of domains. However, oftentimes these endpoints are not configured to process specific workloads as efficiently as possible. Assisting users in leveraging SPARQL endpoints requires insight into functional and non-functional properties of these knowledge bases.

Design/methodology/approach

This study presents comprehensive approaches for deriving these metrics. More specifically, the study utilizes concrete SPARQL queries to determine corresponding values. Furthermore, it validates and discusses the introduced metrics through extensive evaluation on real-world SPARQL endpoints.
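As a rough illustration of probing an endpoint with a concrete query, the following Python sketch times repeated executions of a single SPARQL query over HTTP. It is not the author's method: the probe query, the function names (`build_request`, `fetch`, `time_query`) and the endpoint URL in the usage note are assumptions made for this example, and the request simply follows the GET form of the SPARQL 1.1 Protocol.

```python
import time
import urllib.parse
import urllib.request

# Hypothetical probe query (an assumption for this sketch, not taken from the article).
PROBE_QUERY = "ASK { ?s ?p ?o }"


def build_request(endpoint_url, query):
    """Build a SPARQL 1.1 Protocol GET request carrying the query string."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint_url}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def fetch(request, timeout=30):
    """Send the request and read the full response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def time_query(endpoint_url, query, runs=5, send=fetch, clock=time.perf_counter):
    """Return the elapsed seconds for each of `runs` executions of `query`.

    `send` and `clock` are injectable so the timing logic can be
    exercised without network access.
    """
    timings = []
    for _ in range(runs):
        request = build_request(endpoint_url, query)
        start = clock()
        send(request)
        timings.append(clock() - start)
    return timings
```

Calling `time_query("http://example.org/sparql", PROBE_QUERY)` against a real endpoint (the URL here is a placeholder) would yield one wall-clock measurement per run, which includes network latency as well as server-side processing time.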

Findings

The evaluation showed that endpoints exhibit different characteristics. While it comes as no surprise that latency and throughput are influenced by the network infrastructure, the costs of join operations depend on a number of factors that are not obvious to a data consumer. Moreover, the mean, median and upper quartile values discussed by the author reveal both endpoints that behave consistently and repositories that offer varying levels of performance.
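The three summary statistics named above can be computed from a list of per-query measurements with Python's standard `statistics` module. This is a minimal sketch; the function name `summarize` and the sample values in the usage note are hypothetical, and the article may define the upper quartile with a different interpolation method than the module's default.

```python
import statistics


def summarize(timings):
    """Return the mean, median and upper quartile of a list of measurements.

    Requires at least two values, because statistics.quantiles
    cannot compute quartiles from a single data point.
    """
    # quantiles(..., n=4) returns the three quartile cut points.
    _lower, _middle, upper = statistics.quantiles(timings, n=4)
    return {
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "upper_quartile": upper,
    }
```

For example, `summarize([0.21, 0.19, 0.20, 0.95])` describes an endpoint whose typical response is fast but which occasionally stalls: the single slow run pulls the mean and upper quartile well above the median, which is one way to tell consistent endpoints from erratic ones.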

Originality/value

On the one hand, the contribution of the author's work lies in assisting data consumers in evaluating the quality of service of publicly available SPARQL endpoints. On the other hand, the performance metrics introduced in this study can also be considered as additional input features for distributed query processing frameworks. Moreover, the author provides a universal means for discerning characteristics of different SPARQL endpoints without the need for (synthetic or real-world) query workloads.

Citation

Lorey, J. (2014), "Identifying and determining SPARQL endpoint characteristics", International Journal of Web Information Systems, Vol. 10 No. 3, pp. 226-244. https://doi.org/10.1108/IJWIS-03-2014-0007

Publisher: Emerald Group Publishing Limited

Copyright © 2014, Emerald Group Publishing Limited
