Search results

1 – 6 of 6
Article
Publication date: 4 April 2008

Dunren Che and Wen‐Chi Hou

Abstract

Purpose

Efficient processing of XML queries is critical for XML data management and related applications. Previously proposed techniques are unsatisfactory. The purpose of this paper is to present Determined – a new prototype system designed for XML query processing and optimization from a system perspective. With Determined, a number of novel techniques for XML query processing are proposed and demonstrated.

Design/methodology/approach

The methodology emphasizes query pattern minimization, logic‐level optimization, and efficient query execution. Accordingly, three lines of investigation have been pursued in the context of Determined: XML tree pattern query (TPQ) minimization; logic‐level XML query optimization utilizing deterministic transformation; and specialized algorithms for fast XML query execution.

Findings

Three results were developed and demonstrated: a runtime‐optimal and powerful algorithm for XML TPQ minimization; a unique logic‐level XML query optimization approach that solely pursues deterministic query transformation; and a group of specialized algorithms for XML query evaluation.

Research limitations/implications

The experiments conducted so far are still preliminary. Further in‐depth experiments are therefore needed, ideally carried out in the setting of a real‐world XML DBMS.

Practical implications

The techniques/approaches proposed can be adapted to real‐world XML database systems to enhance the performance of XML query processing.

Originality/value

The reported work integrates various novel techniques for XML query processing/optimization into a single system, and the findings are presented from a system perspective.

Details

International Journal of Web Information Systems, vol. 4 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 28 September 2007

Dunren Che

Abstract

Purpose

Tree patterns are at the core of XML queries. The tree patterns in XML queries typically contain redundancies, especially when broad integrity constraints (ICs) are present and considered. Tree pattern minimization is therefore of great significance for efficient XML query processing. Although various minimization schemes/algorithms have been proposed, none of them can exploit broad ICs to thoroughly minimize the tree patterns in XML queries. The purpose of this research is to develop an innovative minimization scheme and provide a novel implementation algorithm.
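To make the notion of a redundant branch concrete, the following is a minimal Python sketch of constraint-free tree pattern minimization. It is not AlliedMinimize and it ignores integrity constraints entirely; the `Node` class and the `maps_to`, `minimize` and `show` helpers are hypothetical names introduced only for this illustration. A branch is dropped when it can be mapped homomorphically into what remains under the same parent, which is a sound but simplified check and makes no claim to the O(n^2) bound.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)            # identity comparison, so equal-looking siblings stay distinct
class Node:
    label: str
    edge: str = "/"             # edge from the parent: "/" (child) or "//" (descendant)
    children: list = field(default_factory=list)
    output: bool = False        # True for the node whose matches the query returns

def descendants(n):
    """Yield every proper descendant of n."""
    for c in n.children:
        yield c
        yield from descendants(c)

def has_output(n):
    """True if the output node lies in the subtree rooted at n."""
    return n.output or any(has_output(c) for c in n.children)

def child_maps_below(c, q):
    """True if branch c can be mapped to some node below q that respects c's edge type."""
    if c.edge == "/":
        candidates = [x for x in q.children if x.edge == "/"]
    else:                       # a "//" edge may map to any proper descendant
        candidates = list(descendants(q))
    return any(maps_to(c, x) for x in candidates)

def maps_to(p, q):
    """True if the subtree rooted at p maps homomorphically onto q."""
    return p.label == q.label and all(child_maps_below(c, q) for c in p.children)

def minimize(v):
    """Remove branches implied by the rest of the pattern (no ICs considered)."""
    for c in list(v.children):
        if has_output(c):
            continue            # never drop the branch that holds the output node
        i = v.children.index(c)
        v.children.pop(i)       # tentatively remove c, then test against what is left
        if not child_maps_below(c, v):
            v.children.insert(i, c)
    for c in v.children:
        minimize(c)
    return v

def show(n):
    """Render a pattern in a bracketed, XPath-like form."""
    return n.label + "".join("[" + c.edge + show(c) + "]" for c in n.children)

# a[b][b/c]: the bare b branch is implied by b/c and is removed.
q = Node("a", output=True, children=[Node("b"), Node("b", children=[Node("c")])])
print(show(minimize(q)))
```

With integrity constraints (for example, "every b has a c child"), further branches become removable; handling those is the part the paper addresses and this sketch does not attempt.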

Design/methodology/approach

Most prior approaches took query augmentation/expansion as a necessary first step to achieve XML query pattern minimization in the presence of certain ICs. This augmentation/expansion step is also the cause of the typical O(n^4) time complexity of the previously proposed algorithms. This paper presents an innovative approach called allying that effectively circumvents the otherwise necessary augmentation step and keeps the time complexity of the implementation algorithm within the optimal bound, i.e. O(n^2). In addition, the graph simulation concept is adapted and generalized to a three‐tier definition scheme so that broader ICs are incorporated.

Findings

The innovative allying minimization approach is identified and an effective implementation algorithm named AlliedMinimize is developed. This algorithm is both runtime optimal, taking O(n^2) time, and the most powerful in terms of the breadth of constraints it can exploit for XML query pattern minimization. An experimental study confirms the validity of the proposed approach and algorithm.

Research limitations/implications

Though the algorithm AlliedMinimize is so far the most powerful XML query pattern minimization algorithm, it does not incorporate all potential ICs existing in the context of XML. Effectively integrating this innovative minimization scheme into a fully‐fledged XML query optimizer remains to be investigated in the future.

Practical implications

In practice, Allying and AlliedMinimize can be used to achieve a kind of quick optimization for XML queries via fast minimization of the tree patterns involved in XML queries under broad ICs.

Originality/value

This paper presents a novel scheme and an efficient algorithm for XML query pattern minimization under broad ICs.

Details

International Journal of Web Information Systems, vol. 3 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 28 August 2009

Zhewei Jiang, Cheng Luo, Wen‐Chi Hou, Dunren Che and Qiang Zhu

Abstract

Purpose

The purpose of this paper is to provide an efficient algorithm for Extensible Markup Language (XML) twig query evaluation.

Design/methodology/approach

A single‐phase holistic twig pattern matching method based on the TwigStack algorithm is proposed. The method applies a novel stack structure to preserve the holisticity of the twig matches. Twig matches rooted at elements that are currently in the root stack are output directly.
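As background for the stack-based processing that TwigStack-style methods rely on, here is a minimal Python sketch of a stack-based ancestor-descendant join over region-encoded elements. It is not the single‐phase method proposed in the paper and it handles only one query edge rather than a whole twig; the `Region` type, the `stack_join` function and the (start, end) numbering are assumptions made for this illustration. Both input lists are assumed to be sorted by start position and to come from a well-nested document.

```python
from typing import List, NamedTuple, Tuple

class Region(NamedTuple):
    start: int   # position of the element's opening tag
    end: int     # position of the element's closing tag

def stack_join(ancestors: List[Region],
               descendants: List[Region]) -> List[Tuple[Region, Region]]:
    """Return all (ancestor, descendant) pairs where the ancestor contains the descendant."""
    stack: List[Region] = []     # nested chain of ancestors that are still open
    out: List[Tuple[Region, Region]] = []
    i = j = 0
    while j < len(descendants) and (i < len(ancestors) or stack):
        a = ancestors[i] if i < len(ancestors) else None
        d = descendants[j]
        next_start = min(a.start, d.start) if a is not None else d.start
        # Pop ancestors that closed before the next element begins.
        while stack and stack[-1].end < next_start:
            stack.pop()
        if a is not None and a.start < d.start:
            stack.append(a)      # a opens first, so it may contain later elements
            i += 1
        else:
            # Every element left on the stack opened before d and is still open.
            out.extend((s, d) for s in stack)
            j += 1
    return out

# Document shape: <a><b/><a><b/></a></a><b/>
a_list = [Region(1, 10), Region(4, 7)]
b_list = [Region(2, 3), Region(5, 6), Region(11, 12)]
for anc, desc in stack_join(a_list, b_list):
    print(anc, "contains", desc)
```

Each input list is scanned once, and the stack keeps only the currently open ancestors, which is the property that lets holistic methods avoid materializing large intermediate results.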

Findings

Because it does not generate individual path matches as intermediate results, the method avoids storing, writing out and reading back those matches, and it entirely eliminates the potentially time‐consuming merging operation. Experimental results demonstrate the applicability and advantages of the approach.

Originality/value

The paper proposes an efficient XML twig query evaluation algorithm whose advantages over the current state‐of‐the‐art algorithm, TwigStack, are demonstrated by both theoretical analysis and empirical study.

Details

International Journal of Web Information Systems, vol. 5 no. 3
Type: Research Article
ISSN: 1744-0084

Content available
Article
Publication date: 28 September 2007

Ismail Khalil Ibrahim, David Taniar and Eric Pardede

Details

International Journal of Web Information Systems, vol. 3 no. 3
Type: Research Article
ISSN: 1744-0084

Content available
Article
Publication date: 28 August 2009

Ismail Khalil

Details

International Journal of Web Information Systems, vol. 5 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 21 December 2021

Laouni Djafri

Abstract

Purpose

This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other platform. DDPML can also be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies.

Design/methodology/approach

In the age of Big Data, companies want to benefit from the large amounts of data available to them. These data can help them understand their internal and external environment and anticipate associated phenomena, because the data turn into knowledge that can later be used for prediction. This knowledge becomes a valuable asset, and extracting it is precisely the objective of data mining. Because data and knowledge are now produced in large volumes and at a faster pace, the task has become Big Data mining. The authors’ work therefore aims mainly at addressing volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The question raised is how to make machine learning algorithms work in a distributed and parallel way at the same time without losing the accuracy of classification results. To answer it, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML) algorithms, built in two parts.

In the first part, the authors propose a distributed architecture controlled by a Map-Reduce algorithm, which in turn depends on a random sampling technique. The architecture is designed specifically for big data processing and operates coherently and efficiently with the sampling strategy proposed in this work. It also helps the authors verify the classification results obtained using the representative learning base (RLB).

In the second part, the authors extract the representative learning base by sampling at two levels using the stratified random sampling method. The same sampling method is applied to extract the shared learning base (SLB), the partial learning base for the first level (PLBL1) and the partial learning base for the second level (PLBL2).

The experimental results show that the solution is efficient, without significant loss in the classification results. In practical terms, the DDPML system is dedicated to big data mining and works effectively in distributed systems with a simple structure, such as client-server networks.
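The two-level stratified random sampling mentioned above can be illustrated with a minimal Python sketch. The abstract does not give the actual DDPML procedure, so everything here is an assumption: the function names `stratified_sample` and `two_level_sample`, the sampling fractions, and in particular the choice to merge the first-level samples from all partitions before sampling again.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(records, label_of, fraction, rng):
    """Sample the same fraction from every class so class proportions are preserved."""
    strata = defaultdict(list)
    for r in records:
        strata[label_of(r)].append(r)
    sample = []
    for label in sorted(strata):                    # fixed order for reproducibility
        group = strata[label]
        k = max(1, round(len(group) * fraction))    # keep at least one record per class
        sample.extend(rng.sample(group, k))
    return sample

def two_level_sample(partitions, label_of, f1, f2, seed=0):
    """Level 1: sample inside each partition. Level 2: sample the merged result.

    Merging the level-1 samples before level 2 is an assumption of this sketch,
    not a step confirmed by the source.
    """
    rng = random.Random(seed)
    level1 = [stratified_sample(p, label_of, f1, rng) for p in partitions]
    merged = [r for part in level1 for r in part]
    return stratified_sample(merged, label_of, f2, rng)

# Two partitions (for example, two worker nodes), each with a 9:1 class imbalance.
def make_partition(offset):
    return ([(offset + i, "neg") for i in range(450)] +
            [(offset + 450 + i, "pos") for i in range(50)])

partitions = [make_partition(0), make_partition(1000)]
base = two_level_sample(partitions, label_of=lambda r: r[1], f1=0.2, f2=0.5)
print(Counter(label for _, label in base))
```

Sampling the same fraction within each class keeps the 9:1 ratio of the original partitions in the reduced learning base, which is the usual reason for preferring stratified sampling over simple random sampling on imbalanced data.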

Findings

The authors obtained very satisfactory classification results.

Originality/value

The DDPML system is specifically designed to handle big data mining classification smoothly.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288
