A query language for selecting, harmonizing, and aggregating heterogeneous XML data

Turkka Näppilä (School of Information Sciences, University of Tampere, Tampere, Finland)
Katja Moilanen (School of Information Sciences, University of Tampere, Tampere, Finland)
Timo Niemi (School of Information Sciences, University of Tampere, Tampere, Finland)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 5 April 2011

Abstract

Purpose

The purpose of this paper is to introduce an expressive query language, called relational XML query language (RXQL), capable of dealing with heterogeneous Extensible Markup Language (XML) documents in data‐centric applications. In RXQL, data harmonization (i.e. the removal of heterogeneous factors from XML data) is integrated with typical data‐centric features (e.g. grouping, ordering, and aggregation).

Design/methodology/approach

RXQL is based on the XML relation representation, developed in the authors' previous work. This is a novel approach to unambiguously represent semistructured data relationally, which makes it possible in RXQL to manipulate XML data in a tuple‐oriented way, while XML data are typically manipulated in a path‐oriented way.

Findings

The user is able to describe the result of an RXQL query straightforwardly, using a non‐XML syntax. The analysis of this description, through the mechanism developed in this paper, affords the automatic construction of the query result. This feature significantly increases the declarativeness of RXQL compared to the path‐oriented XML languages, where the user needs to control the construction of the result extensively.

Practical implications

The authors' formal specification of the construction of the query result can be considered as an abstract implementation of RXQL.

Originality/value

RXQL is a declarative query language capable of integrating data harmonization seamlessly with other data‐centric features in the manipulation of heterogeneous XML data. So far, these kinds of XML query languages have been missing. Obviously, the expressive power of RXQL can be achieved by computationally complete XML languages, such as XQuery. However, these are not actual query languages, and the query formulation in them usually presupposes programming skills that are beyond those of the ordinary end‐user.

Citation

Näppilä, T., Moilanen, K. and Niemi, T. (2011), "A query language for selecting, harmonizing, and aggregating heterogeneous XML data", International Journal of Web Information Systems, Vol. 7 No. 1, pp. 62-99. https://doi.org/10.1108/17440081111125662

Publisher

Emerald Group Publishing Limited

Copyright © 2011, Emerald Group Publishing Limited
