To read the full version of this content please select one of the options below:

Algebra‐based XQuery cardinality estimation

Sherif Sakr (NICTA, University of New South Wales, Sydney, Australia)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 4 April 2008

Abstract

Purpose

Estimating the sizes of query results and intermediate results is crucial to many aspects of query processing. All database systems rely on the use of cardinality estimates to choose the cheapest execution plan. In principle, the problem of cardinality estimation is more complicated in the Extensible Markup Language (XML) domain than the relational domain. The purpose of this paper is to present a novel framework for estimating the cardinality of XQuery expressions as well as their sub‐expressions. Additionally, this paper proposes a novel XQuery cardinality estimation benchmark. The main aim of this benchmark is to establish the basis of comparison between the different estimation approaches in the XQuery domain.

Design/methodology/approach

As a major innovation, the paper exploits the relational algebraic infrastructure to provide accurate estimation in the context of XML and XQuery domains. In the proposed framework, XQuery expressions are translated into an equivalent relational algebraic plans and then using a well defined set of inference rules and a set of special properties of the algebraic plan, this framework is able to provide high‐accurate estimation for XQuery expressions.

Findings

This paper is believed to be the first which provides a uniform framework to estimate the cardinality of more powerful XML querying capabilities using XQuery expressions as well as their sub‐expressions. It exploits the relational algebraic infrastructure to provide accurate estimation in the context of XML and XQuery domains. Moreover, the proposed framework can act as a meta‐model through its ability to incorporate different summarized XML structures and different histogram techniques which allows the model designers to achieve their targets by focusing their effort on designing or selecting the adequate techniques for them. In addition, this paper proposes benchmark for XQuery cardinality estimation systems. The proposed benchmark distinguishes itself from the other existing XML benchmarks in its focus on establishing the basis for comparing the different estimation approaches in the XML domain in terms of their accuracy of the estimations and their completeness in handling different XML querying features.

Research limitations/implications

The current status of this proposed XQuery cardinality estimations framework does not support the estimation of the queries over the order information of the source XML documents and does not support non‐numeric predicates.

Practical implications

The experiments of this XQuery cardinality estimation system demonstrate its effectiveness and show high‐accurate estimation results. Utilizing the cardinality estimation properties during the SQL translation of XQuery expression results in an average improvement of 20 percent on the performance of their execution times.

Originality/value

This paper presents a novel framework for estimating the cardinality of XQuery expressions as well as its sub‐expressions. A novel XQuery cardinality estimation benchmark is introduced to establish the basis of comparison between the different estimation approaches in the XQuery domain.

Keywords

Citation

Sakr, S. (2008), "Algebra‐based XQuery cardinality estimation", International Journal of Web Information Systems, Vol. 4 No. 1, pp. 6-47. https://doi.org/10.1108/17440080810865611

Publisher

:

Emerald Group Publishing Limited

Copyright © 2008, Emerald Group Publishing Limited