Search results
1 – 10 of 225
Abstract
Purpose
eXtensible Markup Language (XML) data are not necessarily constrained by a schema. As XML fast emerges as a standard for data representation and exchange on the world wide web, the ability to intelligently query XML data becomes increasingly important. Several graphical query languages for XML data have been proposed, but they are either too complex or too limited in expressive power and usability. The purpose of this paper is to propose a recursive graphical query language for querying and restructuring XML data (RGQLX). The expressive power of RGQLX is comparable to Fixpoint. RGQLX is a multi‐sorted graphical language integrating grouping, aggregate functions, nested queries and recursion.
Design/methodology/approach
The methodology emphasizes RGQLX's development, which is based on the G‐XML data model syntax to express a wide variety of XML queries, ranging from simple selection to expressive data transformations involving grouping, aggregation and sorting. RGQLX allows users to express recursive visual queries in an elegant manner. RGQLX has an operational semantics based on annotated XML, which serves to express queries and data trees in the form of XML. The paper presents an algorithm to achieve the matching between data and query trees after translating a query tree into annotated XML.
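The matching between query and data trees can be illustrated with a minimal sketch: a query tree matches a data tree when the tags agree and every query child embeds into some data child. This is an illustrative simplification, not the paper's actual annotated‐XML algorithm, and the element names below are invented.

```python
from xml.etree import ElementTree as ET

def matches(query, data):
    """True if the query pattern embeds into the data tree: tags must
    agree and every query child must match some data child."""
    if query.tag != data.tag:
        return False
    return all(any(matches(qc, dc) for dc in data) for qc in query)

data = ET.fromstring("<lib><book><title/><year/></book><cd/></lib>")
query = ET.fromstring("<lib><book><title/></book></lib>")
print(matches(query, data))   # True
```

A full implementation would also interpret the annotations carried by the annotated‐XML form (predicates, grouping, recursion) rather than comparing tags alone.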
Findings
Developed and demonstrated were: a G‐XML model; recursive queries; annotated XML for the semantic operations and a matching algorithm.
Research limitations/implications
Future research on the RGQLX language will expand it to include recursive aggregations.
Practical implications
The algorithms/approaches proposed can be easily integrated in any commercial product to enhance the performance of XML query languages.
Originality/value
The proposed work integrates various novel techniques for XML query syntax/semantic into a single language with a suitable matching algorithm. The power of this proposal is in the class of Fixpoint queries.
Mourad Ykhlef and Sarra Alqahtani
Abstract
Purpose
The rapid development of Extensible Markup Language (XML) from a mere data exchange format to a universal syntax for encoding domain‐specific information increases the need for new query languages specifically designed to address the characteristics of XML. Such languages should be able not only to extract information from XML documents, but also to apply powerful restructuring operators based on a well‐defined semantics. Moreover, XML queries should be natural to write and understand, as end‐users, too, are expected to access the large XML information bases supporting their businesses. The purpose of this paper is to propose a new graphical query language for XML (GQLX) for querying and restructuring XML data.
Design/methodology/approach
The methodology emphasizes GQLX's development, which is based on the G‐XML data model syntax to express a wide variety of XML queries, ranging from simple selection to expressive data transformations involving grouping, aggregation and sorting. GQLX has an operational semantics based on annotated XML, which serves to express queries and data trees in the form of XML. The paper also presents an algorithm to achieve the matching between data and query trees after translating them into annotated XML.
Findings
Developed and demonstrated were: a G‐XML syntax; annotated XML for the semantic operations and a matching algorithm.
Research limitations/implications
Future research on this language lies in expanding it to include recursion and nested queries.
Practical implications
The algorithms/approaches proposed can be implemented to enhance the performance of the XML query language.
Originality/value
The proposed work integrates various novel techniques for XML query syntax/semantic into a single language with a suitable matching algorithm.
Chao Wang, Jie Lu and Guangquan Zhang
Abstract
Purpose
Matching relevant ontology data for integration is vitally important as the amount of ontology data increases along with the evolving Semantic web, in which data are published by different individuals or organizations in a decentralized environment. For any domain that has developed a suitable ontology, its ontology‐annotated data (or simply ontology data) from different sources often overlap and need to be integrated. The purpose of this paper is to develop an intelligent web ontology data matching method and framework for data integration.
Design/methodology/approach
This paper develops an intelligent matching method to solve the issue of ontology data matching. Based on the matching method, it also proposes a flexible peer‐to‐peer framework to address the issue of ontology data integration in a distributed Semantic web environment.
Findings
The proposed matching method is different from existing data matching or merging methods applied to data warehouse in that it employs a machine learning approach and more similarity measurements by exploring ontology features.
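As a rough illustration of combining several similarity measurements over instance features (the paper's method additionally uses machine learning over ontology features; the records, the feature set and the thresholds below are invented):

```python
from difflib import SequenceMatcher

def sim(a, b):
    # String similarity in [0, 1] between two property values.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(rec1, rec2, keys=("name", "type")):
    # Average per-property similarity; a learned model would weight
    # and combine these features instead of taking a plain average.
    return sum(sim(rec1[k], rec2[k]) for k in keys) / len(keys)

a = {"name": "Jie Lu", "type": "Person"}
b = {"name": "J. Lu", "type": "person"}
print(round(match_score(a, b), 2))
```

Two instances whose combined score exceeds a chosen threshold would be proposed as a match for integration.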
Research limitations/implications
The proposed method and framework will be further tested for some more complicated real cases in the future.
Originality/value
The experiments show that this proposed intelligent matching method increases ontology data matching accuracy.
José L. Navarro‐Galindo and José Samos
Abstract
Purpose
Nowadays, the use of WCMS (web content management systems) is widespread. The conversion of this infrastructure into its semantic equivalent (a semantic WCMS) is a critical issue, as this enables the benefits of the semantic web to be extended. The purpose of this paper is to present FLERSA (Flexible Range Semantic Annotation), a tool for flexible range semantic annotation.
Design/methodology/approach
FLERSA is presented as a user‐centred annotation tool for web content expressed in natural language. The tool has been built to illustrate how a WCMS called Joomla! can be converted into its semantic equivalent.
Findings
The development of the tool shows that it is possible to build a semantic WCMS through a combination of semantic components and other resources, such as ontologies and emerging technologies including XML, RDF, RDFa and OWL.
Practical implications
The paper provides a starting‐point for further research in which the principles and techniques of the FLERSA tool can be applied to any WCMS.
Originality/value
The tool allows both manual and automatic semantic annotations, as well as providing enhanced search capabilities. For manual annotation, a new flexible range markup technique is used, based on the RDFa standard, to support the evolution of annotated Web documents more effectively than XPointer. For automatic annotation, a hybrid approach based on machine learning techniques (Vector‐Space Model + n‐grams) is used to determine the concepts that the content of a Web document deals with (from an ontology which provides a taxonomy), based on previous annotations that are used as a training corpus.
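The hybrid Vector‐Space Model + n‐grams idea can be sketched as follows: documents become character‐trigram count vectors, and a new document receives the concept of the most similar previously annotated one. This is a minimal illustration with an invented two‐concept corpus, not the tool's actual classifier.

```python
from collections import Counter
from math import sqrt

def trigrams(text):
    # Character trigram counts act as the vector-space representation.
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Previous annotations acting as the training corpus (invented).
training = {
    "Semantic Web": "ontology rdf owl semantic annotation",
    "Databases": "sql relational query index transaction",
}

def classify(text):
    vec = trigrams(text)
    return max(training, key=lambda c: cosine(vec, trigrams(training[c])))

print(classify("annotating rdf content with an ontology"))
```

The real tool draws its concepts from an ontology providing a taxonomy rather than from a flat dictionary as here.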
Abstract
Purpose
This paper aims to propose a system for the semantic annotation of audio‐visual media objects, which are provided in the documentary domain. It presents the system's architecture, a manual annotation tool, an authoring tool and a search engine for the documentary experts. The paper discusses the merits of a proposed approach of evolving semantic network as the basis for the audio‐visual content description.
Design/methodology/approach
The author demonstrates how documentary media can be semantically annotated, and how this information can be used for the retrieval of the documentary media objects. Furthermore, the paper outlines the underlying XML schema‐based content description structures of the proposed system.
Findings
Currently, a flexible organization of documentary media content description and the related media data is required. Such an organization requires the adaptable construction in the form of a semantic network. The proposed approach provides semantic structures with the capability to change and grow, allowing an ongoing task‐specific process of inspection and interpretation of source material. The approach also provides technical memory structures (i.e. information nodes), which represent the size, duration, and technical format of the physical audio‐visual material of any media type, such as audio, video and 3D animation.
Originality/value
The proposed approach (architecture) is generic and facilitates the dynamic use of audio‐visual material using links, enabling the connection from multi‐layered information nodes to data on a temporal, spatial and spatial‐temporal level. It enables the semantic connection between information nodes using typed relations, thus structuring the information space on a semantic as well as syntactic level. Since the description of media content holds constant for the associated time interval, the proposed system can handle multiple content descriptions for the same media unit and also handle gaps. The results of this research will be valuable not only for documentary experts but for anyone with a need to manage dynamically audiovisual content in an intelligent way.
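A minimal sketch of information nodes connected by typed relations, in the spirit of the semantic network described above (class, attribute and relation names are invented):

```python
class InfoNode:
    # An information node: technical memory structure plus typed links.
    def __init__(self, name, media_type, duration=None):
        self.name, self.media_type, self.duration = name, media_type, duration
        self.links = []                      # (relation_type, target) pairs

    def link(self, relation, target):
        self.links.append((relation, target))

clip = InfoNode("interview_clip", "video", duration=95)
place = InfoNode("filming_location", "concept")
clip.link("recordedAt", place)               # typed semantic relation

print([(rel, t.name) for rel, t in clip.links])
```

Because links are typed, the information space is structured on a semantic as well as a syntactic level, and new nodes and relations can be added as the network evolves.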
Abstract
Purpose
Estimating the sizes of query results and intermediate results is crucial to many aspects of query processing. All database systems rely on the use of cardinality estimates to choose the cheapest execution plan. In principle, the problem of cardinality estimation is more complicated in the Extensible Markup Language (XML) domain than the relational domain. The purpose of this paper is to present a novel framework for estimating the cardinality of XQuery expressions as well as their sub‐expressions. Additionally, this paper proposes a novel XQuery cardinality estimation benchmark. The main aim of this benchmark is to establish the basis of comparison between the different estimation approaches in the XQuery domain.
Design/methodology/approach
As a major innovation, the paper exploits the relational algebraic infrastructure to provide accurate estimation in the context of the XML and XQuery domains. In the proposed framework, XQuery expressions are translated into equivalent relational algebraic plans; then, using a well‐defined set of inference rules and a set of special properties of the algebraic plan, the framework is able to provide highly accurate estimates for XQuery expressions.
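The general idea of answering cardinality questions from a summary rather than from the document itself can be sketched with a simple path‐count synopsis. This illustrates path counting only; the paper's framework instead works on relational algebraic plans with inference rules, and the document below is invented.

```python
from collections import Counter
from xml.etree import ElementTree as ET

def path_summary(root):
    # Map each root-to-node label path to its occurrence count.
    counts = Counter()
    def walk(node, path):
        p = path + "/" + node.tag
        counts[p] += 1
        for child in node:
            walk(child, p)
    walk(root, "")
    return counts

doc = ET.fromstring(
    "<site><people><person/><person/><person/></people>"
    "<items><item/><item/></items></site>")
summary = path_summary(doc)
print(summary["/site/people/person"])   # 3
```

An optimizer can read such a synopsis in constant time per path instead of scanning the document, which is the motivation for cardinality estimation in the first place.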
Findings
This paper is believed to be the first to provide a uniform framework for estimating the cardinality of XQuery expressions as well as their sub‐expressions, covering the more powerful querying capabilities of XQuery. It exploits the relational algebraic infrastructure to provide accurate estimation in the context of the XML and XQuery domains. Moreover, the proposed framework can act as a meta‐model through its ability to incorporate different summarized XML structures and different histogram techniques, allowing model designers to achieve their targets by focusing their effort on designing or selecting the techniques adequate for them. In addition, this paper proposes a benchmark for XQuery cardinality estimation systems. The proposed benchmark distinguishes itself from existing XML benchmarks in its focus on establishing the basis for comparing different estimation approaches in the XML domain in terms of the accuracy of their estimates and their completeness in handling different XML querying features.
Research limitations/implications
The current version of the proposed XQuery cardinality estimation framework does not support estimating queries over the order information of the source XML documents, and does not support non‐numeric predicates.
Practical implications
The experiments with this XQuery cardinality estimation system demonstrate its effectiveness and show highly accurate estimation results. Utilizing the cardinality estimation properties during the SQL translation of XQuery expressions yields an average improvement of 20 percent in their execution times.
Originality/value
This paper presents a novel framework for estimating the cardinality of XQuery expressions as well as their sub‐expressions. A novel XQuery cardinality estimation benchmark is introduced to establish the basis of comparison between different estimation approaches in the XQuery domain.
Abstract
The self‐describing nature of data marked up using extensible markup language (XML) allows the XML document itself to act in a manner similar to a database, but without the large file sizes and proprietary software generally associated with database applications. XML data can be made directly available to users using a variety of methods. This paper explores methods for both server‐side and client‐side processing and display of XML‐encoded data, using an annotated bibliography as an example.
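A minimal server‐side‐style transformation of the kind discussed, turning XML bibliography entries into an HTML list (the element names are invented for illustration; a real deployment might instead apply XSLT on the server or in the browser):

```python
from xml.etree import ElementTree as ET

bib = ET.fromstring("""
<bibliography>
  <entry><author>Smith, J.</author><title>XML in Practice</title><year>2004</year></entry>
  <entry><author>Lee, K.</author><title>Data on the Web</title><year>2001</year></entry>
</bibliography>""")

def to_html(root):
    # Render each bibliography entry as an HTML list item.
    items = ("<li>{} ({}). <i>{}</i></li>".format(
                 e.findtext("author"), e.findtext("year"), e.findtext("title"))
             for e in root.findall("entry"))
    return "<ul>" + "".join(items) + "</ul>"

print(to_html(bib))
```

The same document could equally be filtered or sorted before rendering, which is what lets the XML file stand in for a small database.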
Abstract
Purpose
Efficient processing of XML queries is critical for XML data management and related applications. Previously proposed techniques are unsatisfactory. The purpose of this paper is to present Determined – a new prototype system designed for XML query processing and optimization from a system perspective. With Determined, a number of novel techniques for XML query processing are proposed and demonstrated.
Design/methodology/approach
The methodology emphasizes query pattern minimization, logic‐level optimization, and efficient query execution. Accordingly, three lines of investigation have been pursued in the context of Determined: XML tree pattern query (TPQ) minimization; logic‐level XML query optimization utilizing deterministic transformation; and specialized algorithms for fast XML query execution.
Findings
Developed and demonstrated were: a runtime optimal and powerful algorithm for XML TPQ minimization; a unique logic‐level XML query optimization approach that solely pursues deterministic query transformation; and a group of specialized algorithms for XML query evaluation.
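One elementary case of TPQ minimization is removing a child pattern that duplicates a sibling, since both impose the same constraint on any match. The sketch below handles only this duplicate case; the paper's minimization algorithm is more general, and the query is invented.

```python
from xml.etree import ElementTree as ET

def canon(node):
    # Order-insensitive canonical form of a pattern subtree.
    return (node.tag, tuple(sorted(canon(c) for c in node)))

def minimize(node):
    seen = set()
    for child in list(node):
        minimize(child)
        key = canon(child)
        if key in seen:
            node.remove(child)   # duplicate sibling pattern: redundant
        else:
            seen.add(key)

q = ET.fromstring("<book><author/><author/><title/></book>")
minimize(q)
print([c.tag for c in q])   # ['author', 'title']
```

Removing redundant branches before evaluation shrinks the pattern the execution engine must match, which is the point of minimization.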
Research limitations/implications
The experiments conducted so far are still preliminary. Further in‐depth, thorough experiments are thus expected, ideally carried out in the setting of a real‐world XML DBMS.
Practical implications
The techniques/approaches proposed can be adapted to real‐world XML database systems to enhance the performance of XML query processing.
Originality/value
The reported work integrates various novel techniques for XML query processing/optimization into a single system, and the findings are presented from a system perspective.
Dimitrios A. Koutsomitropoulos
Abstract
Purpose
Effective synthesis of learning material is a multidimensional problem, which often relies on handpicking approaches and human expertise. Sources of educational content exist in a variety of forms, each offering proprietary metadata information and search facilities. This paper aims to show that it is possible to harvest scholarly resources from various repositories of open educational resources (OERs) in a federated manner. In addition, their subject can be automatically annotated using ontology inference and standard thematic terminologies.
Design/methodology/approach
Based on a semantic interpretation of their metadata, external collections can be aligned and maintained in a shared knowledge pool known as the Learning Object Ontology Repository (LOOR). The author leverages the LOOR to show that it is possible to search through various educational repositories' metadata and amalgamate their semantics into a common learning object (LO) ontology. The author then proceeds with automatic subject classification of LOs, using keyword expansion and referencing standard taxonomic vocabularies for thematic classification, expressed in SKOS.
Findings
The approach to automatic subject classification simply takes advantage of the implicit information in the searching and selection process and combines it with expert knowledge in the domain of reference (SKOS thesauri). This is shown to improve recall by a considerable factor, while precision remains unaffected.
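The keyword‐expansion idea can be sketched as matching query terms against a SKOS‐like set of preferred and alternative labels; because "ir" is an alternative label, a query using it still reaches the concept. The tiny vocabulary below is invented.

```python
# Concept -> preferred and alternative labels (a SKOS-like toy thesaurus).
skos = {
    "Information Retrieval": {"information retrieval", "ir", "document search"},
    "Machine Learning": {"machine learning", "ml", "statistical learning"},
}

def classify(query_terms):
    # A concept is assigned when any of its labels matches a query term.
    terms = {t.lower() for t in query_terms}
    return [concept for concept, labels in skos.items() if terms & labels]

print(classify(["IR", "ranking"]))   # ['Information Retrieval']
```

Expansion through alternative labels is what improves recall: documents reachable only via a synonym are still classified under the preferred concept.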
Originality/value
To the best of the author’s knowledge, the idea of subject classification of LOs through the reuse of search query terms combined with SKOS-based matching and expansion has not been investigated before in a federated scholarly setting.
Abstract
Purpose
To propose methods for expressing and operating on semantics in a largely distributed environment, such as peer‐to‐peer (P2P) based digital libraries (DLs), where heterogeneous schemas may exist and the relationships among them must be explicated for better performance in information searching.
Design/methodology/approach
In conventional solutions, a mediator is adopted to create and maintain the matching between relevant terms, such that distinct but relevant metadata schemas can be integrated according to the mapping relationships held in the mediator. However, such solutions suffer from problems originating in the static matching of the mediator. This paper proposes to use facts to express the relationships among heterogeneous schemas and to conduct the reasoning dynamically using inference engines.
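The fact‐based, dynamic approach can be illustrated with a toy forward‐chaining rule: schema relationships are stored as facts, and new mappings are derived by symmetry and transitivity of term equivalence. The term names are invented, and a real system would run a proper inference engine over richer relations.

```python
def close_equivalences(facts):
    # facts: set of (term_a, "equiv", term_b) triples. Derive the symmetric
    # and transitive closure, the way an inference engine would apply rules.
    derived = set(facts) | {(b, r, a) for (a, r, b) in facts}
    changed = True
    while changed:
        changed = False
        for (a, _, b) in list(derived):
            for (c, _, d) in list(derived):
                if b == c and (a, "equiv", d) not in derived:
                    derived.add((a, "equiv", d))
                    changed = True
    return derived

facts = {("creator", "equiv", "author"), ("author", "equiv", "writtenBy")}
kb = close_equivalences(facts)
print(("creator", "equiv", "writtenBy") in kb)   # True
```

Because the closure is recomputed whenever facts are asserted or retracted, the mappings stay current without a statically maintained mediator table.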
Findings
The use of facts and inference engines to express and operate on the semantics among heterogeneous but relevant information resources is justified. The user can choose to convert only part of the XML document into facts if she can unpeel deeply nested XML tags. Additionally, the user can manually edit (assert, update or retract) the facts as needed during the reasoning.
Research limitations/implications
The study assumes that peers are clustered according to shared topics or interests. An exhaustive evaluation has not been conducted.
Practical implications
Each node can publish its schema to the involved peer community such that other peers can automatically discover the specific schema. A local matchmaking engine is adopted as well in order to automatically generate the relations between its own schema and the retrieved ones.
Originality/value
This paper provides a framework for semantic data integration in P2P networks.