The purpose of this study is to introduce several metrics that enable universal and fine-grained characterization of arbitrary Linked Data repositories. Publicly accessible SPARQL endpoints contain vast amounts of knowledge from a large variety of domains. However, these endpoints are often not configured to process specific workloads as efficiently as possible. Assisting users in leveraging SPARQL endpoints requires insight into the functional and non-functional properties of these knowledge bases.
This study presents comprehensive approaches for deriving these metrics. More specifically, the study utilizes concrete SPARQL queries to determine corresponding values. Furthermore, it validates and discusses the introduced metrics through extensive evaluation on real-world SPARQL endpoints.
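As a rough illustration of how a concrete query can yield a metric value, the following minimal sketch times repeated executions of a trivial probe query. It is not the author's method: the probe query text, the `time_query` helper, and the `fake_endpoint` stand-in are all assumptions made for this example, and a real measurement would replace `fake_endpoint` with a function that sends the query to an actual SPARQL endpoint over HTTP.

```python
import time
from typing import Callable, List

# Hypothetical probe query (an assumption of this sketch): a trivial ASK
# that should need very little server-side work, so the elapsed time is
# dominated by the network round trip.
PROBE_QUERY = "ASK { ?s ?p ?o }"


def time_query(send: Callable[[str], object], query: str, runs: int = 5) -> List[float]:
    """Return the elapsed wall-clock seconds for `runs` executions of `query`.

    `send` is any callable that submits the query and blocks until the
    response has arrived.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        send(query)
        timings.append(time.perf_counter() - start)
    return timings


def fake_endpoint(query: str) -> bool:
    """Stand-in for a real HTTP request so the sketch runs offline."""
    time.sleep(0.001)
    return True


if __name__ == "__main__":
    samples = time_query(fake_endpoint, PROBE_QUERY, runs=3)
    print([f"{s * 1000:.2f} ms" for s in samples])
```

Passing the sender in as a callable keeps the timing logic independent of any particular HTTP or SPARQL client library.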
The evaluation showed that endpoints exhibit different characteristics. While it comes as no surprise that latency and throughput are influenced by the network infrastructure, the costs of join operations depend on a number of factors that are not obvious to a data consumer. Moreover, the author's discussion of mean, median and upper quartile values revealed both endpoints that behave consistently and repositories that offer varying levels of performance.
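To see why mean, median and upper quartile together can separate consistent from inconsistent behavior, consider the sketch below. The `summarize` helper and the sample values are invented for illustration and are not taken from the study; the quartile uses the "inclusive" method of Python's `statistics.quantiles`, which is one of several common definitions.

```python
import statistics
from typing import Dict, Sequence


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Summarize repeated measurements by mean, median and upper quartile."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # quantiles(..., n=4) returns the three quartile cut points.
    _, _, upper_quartile = statistics.quantiles(samples, n=4, method="inclusive")
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "upper_quartile": upper_quartile,
    }


if __name__ == "__main__":
    # Made-up response times in seconds: four fast responses, one slow outlier.
    print(summarize([1.0, 2.0, 3.0, 4.0, 50.0]))
    # {'mean': 12.0, 'median': 3.0, 'upper_quartile': 4.0}
```

In this made-up sample the single slow response pulls the mean far above the median and the upper quartile, which is the kind of gap that would point to varying performance; for a consistently behaving endpoint the three values would lie close together.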
On the one hand, the contribution of the author's work lies in assisting data consumers in evaluating the quality of service of publicly available SPARQL endpoints. On the other hand, the performance metrics introduced in this study can also be considered as additional input features for distributed query processing frameworks. Moreover, the author provides a universal means for discerning characteristics of different SPARQL endpoints without the need for (synthetic or real-world) query workloads.
Lorey, J. (2014), "Identifying and determining SPARQL endpoint characteristics", International Journal of Web Information Systems, Vol. 10 No. 3, pp. 226-244. https://doi.org/10.1108/IJWIS-03-2014-0007
Emerald Group Publishing Limited
Copyright © 2014, Emerald Group Publishing Limited