Search results

1 – 10 of over 8000
Article
Publication date: 19 June 2009

Imam Machdi, Toshiyuki Amagasa and Hiroyuki Kitagawa

Abstract

Purpose

The purpose of this paper is to propose Extensible Markup Language (XML) data partitioning schemes that can cope with static and dynamic allocation for parallel holistic twig joins: grid metadata model for XML (GMX) and streams‐based partitioning method for XML (SPX).

Design/methodology/approach

GMX exploits the relationships between XML documents and query patterns to perform workload-aware partitioning of XML data. Specifically, the paper constructs a two-dimensional model with a document dimension and a query dimension, in which each object in a dimension is composed of XML metadata related to that dimension. GMX provides a set of XML data partitioning methods that include document clustering, query clustering, document-based refinement, query-based refinement, and query-path refinement, thereby enabling XML data partitioning based on the static information of XML metadata. In contrast, SPX explores the structural relationships of query elements and a range-containment property of XML streams to generate partitions and allocate them to cluster nodes on-the-fly.
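
The core idea behind GMX can be illustrated with a small, purely hypothetical sketch (invented data and names, not the authors' implementation): estimate a document-by-query cost matrix from XML metadata, cluster documents by their query-cost profiles, and assign the resulting partitions to cluster nodes so that estimated workloads stay balanced.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical estimated cost of evaluating query q on document d (rows: documents, cols: queries).
rng = np.random.default_rng(0)
cost = rng.gamma(shape=2.0, scale=1.0, size=(40, 6))          # 40 documents, 6 query patterns
query_prob = np.array([0.3, 0.2, 0.2, 0.1, 0.1, 0.1])          # occurrence probability of each query

# Document clustering: group documents with similar query-cost profiles (one GMX granularity).
doc_cluster = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(cost)

# Static allocation: greedily assign each document cluster to the node with the least accumulated workload.
n_nodes = 2
node_load = np.zeros(n_nodes)
assignment = {}
for c in range(4):
    workload = (cost[doc_cluster == c] * query_prob).sum()     # expected processing cost of cluster c
    node = int(node_load.argmin())
    assignment[c] = node
    node_load[node] += workload

print("cluster -> node:", assignment, "node loads:", node_load.round(2))
```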

Findings

GMX provides several salient features: a set of partition granularities that statically balance query-processing workloads among cluster nodes; inter-query parallelism as well as intra-query parallelism at multiple extents; and better parallel query performance when all estimated queries are executed simultaneously in proportion to their probabilities of occurrence in the system. SPX also offers the following features: minimal computation time to generate partitions; dynamic balancing of skewed workloads on the system; higher intra-query parallelism; and better parallel query performance.

Research limitations/implications

The current status of the proposed XML data partitioning schemes does not take into account XML data updates, e.g. new XML documents and query pattern changes submitted by users on the system.

Practical implications

Note that the effectiveness of the XML data partitioning schemes relies mainly on the accuracy of the cost model used to estimate query processing costs. The cost model must be adjusted to reflect the characteristics of the system platform used in the implementation.

Originality/value

This paper proposes novel schemes of conducting XML data partitioning to achieve both static and dynamic workload balance.

Details

International Journal of Web Information Systems, vol. 5 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 1 August 1997

A. MacFarlane, S.E. Robertson and J.A. McCann

Abstract

The progress of parallel computing in Information Retrieval (IR) is reviewed. In particular we stress the importance of the motivation in using parallel computing for text retrieval. We analyse parallel IR systems using a classification defined by Rasmussen and describe some parallel IR systems. We give a description of the retrieval models used in parallel information processing. We describe areas of research which we believe are needed.

Details

Journal of Documentation, vol. 53 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 12 June 2017

Shabia Shabir Khan and S.M.K. Quadri

Abstract

Purpose

As far as the treatment of the most complex design issues is concerned, approaches based on classical artificial intelligence are inferior to those based on computational intelligence, particularly when dealing with vagueness, multi-objectivity and a large number of possible solutions. In practical applications, computational intelligence techniques have given the best results, and research in this field is continuously growing. The purpose of this paper is to search for a general and effective intelligent tool for predicting patient survival after surgery. The present study involves the construction of such intelligent computational models using different configurations, including data partitioning techniques, evaluated experimentally on a realistic medical data set for the prediction of survival in pancreatic cancer patients.

Design/methodology/approach

On the basis of experiments and research performed on data from various fields using different intelligent tools, the authors infer that combining the qualitative aspects of a fuzzy inference system with the quantitative aspects of an artificial neural network can yield an efficient and better model for prediction. The authors constructed three soft computing-based adaptive neuro-fuzzy inference system (ANFIS) models with different configurations and data partitioning techniques, with the aim of finding capable predictive tools that can deal with nonlinear and complex data. After evaluating the models over three shuffles of data (training set, test set and full set), the performances were compared in order to find the best design for prediction of patient survival after surgery. The models were constructed and implemented using the MATLAB simulator.
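
As a rough illustration of the fuzzy C-means partitioning step (the models themselves were built in MATLAB; this Python sketch with invented data is not the authors' code), each FCM cluster of the input data would typically seed one fuzzy rule of an ANFIS model:

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=3, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy C-means: returns cluster centres and the membership matrix U (n_samples x n_clusters)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)                 # memberships of each sample sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centres = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        U = 1.0 / (dist ** (2 / (m - 1)))             # standard FCM membership update
        U /= U.sum(axis=1, keepdims=True)
    return centres, U

# Hypothetical two-dimensional patient features; each cluster would seed one ANFIS fuzzy rule.
X = np.random.default_rng(1).normal(size=(60, 2))
centres, U = fuzzy_c_means(X)
print(centres.round(2), U.argmax(axis=1)[:10])
```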

Findings

On applying the hybrid intelligent neuro-fuzzy models with different configurations, the authors were able to show their advantage in predicting the survival of patients with pancreatic cancer. Experimental results and comparison between the constructed models show that the ANFIS model with Fuzzy C-means (FCM) partitioning provides better accuracy in predicting the class, with the lowest mean square error (MSE) value. Apart from the MSE value, the other evaluation measures for FCM partitioning also prove better than those of the rest of the models. The results therefore suggest that the model can be applied to other biomedical and engineering fields dealing with complex issues related to imprecision and uncertainty.

Originality/value

The originality of the paper includes a framework showing a two-way flow for fuzzy system construction, which the authors then use in designing the three simulation models with different configurations, including the partitioning methods, for prediction of patient survival after surgery. Several experiments were carried out using different shuffles of data to validate the parameters of the model. The performances of the models were compared using various evaluation measures such as MSE.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 10 no. 2
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 22 June 2010

Imam Machdi, Toshiyuki Amagasa and Hiroyuki Kitagawa

Abstract

Purpose

The purpose of this paper is to propose general parallelism techniques for holistic twig join algorithms to process queries against Extensible Markup Language (XML) databases on a multi‐core system.

Design/methodology/approach

The parallelism techniques comprised data and task parallelism. For data parallelism, the paper adopted stream-based partitioning for XML to partition XML data as the basis of parallelism on multiple CPU cores. The XML data partitioning was performed at two levels. The first level created buckets to establish data independence and balance loads among CPU cores; each bucket was assigned to a CPU core. Within each bucket, the second level of XML data partitioning created finer partitions to provide finer-grained parallelism. Each CPU core performed the holistic twig join algorithm on its own finer partitions in parallel with the other CPU cores. For task parallelism, the holistic twig join algorithm was decomposed into two main tasks, which were pipelined to create parallelism. The first task adopted the data parallelism technique, and its outputs were transferred to the second task periodically. Since data transfers incurred overheads, the size of each transfer needed to be estimated carefully to achieve optimal performance.
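
A minimal sketch of the two-level data partitioning plus a second task consuming join outputs in chunks (hypothetical functions and data, a trivial stand-in for the twig join itself, and the pipelining shown only schematically — not the paper's implementation):

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import islice

def twig_join(partition):
    # Stand-in for a holistic twig join over one fine-grained partition.
    return [node for node in partition if node % 3 == 0]

def process_bucket(bucket, fine_size=8):
    results = []
    it = iter(bucket)
    while chunk := list(islice(it, fine_size)):      # second-level partitioning inside the bucket
        results.extend(twig_join(chunk))
    return results

def second_task(outputs, transfer_size=16):
    # Second task of the pipeline: consumes first-task outputs in transfers of a tuned size
    # (run sequentially here for brevity).
    total = 0
    for i in range(0, len(outputs), transfer_size):
        total += sum(outputs[i:i + transfer_size])
    return total

if __name__ == "__main__":
    data = list(range(200))
    buckets = [data[i::4] for i in range(4)]         # first-level partitioning: one bucket per core
    with ProcessPoolExecutor(max_workers=4) as pool:
        joined = [r for res in pool.map(process_bucket, buckets) for r in res]
    print(second_task(joined))
```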

Findings

The data and task parallelism techniques contribute to good performance, especially for queries having complex structures and/or higher query selectivity. The performance of data parallelism can be further improved by task parallelism. Significant performance improvement is attained for queries with higher selectivity because more of the output computation in the second task is performed in parallel with the first task.

Research limitations/implications

The proposed parallelism techniques primarily deal with executing a single long-running query for intra-query parallelism, partitioning XML data on-the-fly, and allocating partitions to CPU cores statically. During parallel execution, it is assumed that no dynamic XML data updates occur.

Practical implications

The effectiveness of the proposed parallel holistic twig joins relies fundamentally on some system parameter values that can be obtained from a benchmark of the system platform.

Originality/value

The paper proposes novel techniques that increase parallelism by combining data and task parallelism to achieve high performance. To the best of the authors' knowledge, this is the first paper to parallelize holistic twig join algorithms on a multi-core system.

Details

International Journal of Web Information Systems, vol. 6 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 5 October 2012

Burcu Tunga and Metin Demiralp

Abstract

Purpose

The plain High Dimensional Model Representation (HDMR) method needs Dirac delta type weights to partition the given multivariate data set for modelling an interpolation problem. A Dirac delta type weight assigns a different importance level to each node of this set during the partitioning procedure, which directly affects the performance of HDMR. The purpose of this paper is to develop a new method, using the fluctuation free integration and HDMR methods, to obtain the optimized weight factors needed to identify these importance levels for the multivariate data partitioning and modelling procedure.

Design/methodology/approach

A common problem in multivariate interpolation, where the sought function values are given at the nodes of a rectangular prismatic grid, is to determine an analytical structure for the function under consideration. As the multivariance of an interpolation problem increases, standard numerical methods become incomplete and computer-based applications run into memory limitations. To overcome these multivariance problems, it is better to deal with less-variate structures. HDMR methods, which are based on a divide-and-conquer philosophy, can be used for this purpose. This corresponds to multivariate data partitioning in which at most the univariate components of the plain HDMR are taken into consideration. Obtaining these components requires a number of integrals to be evaluated, and the Fluctuation Free Integration method is used to evaluate them. This new form of HDMR, integrated with Fluctuation Free Integration, also allows the Dirac delta type weights in multivariate data partitioning to be discarded and the weight factors corresponding to the importance level of each node of the given set to be optimized.
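
For reference, the truncated (at most univariate) HDMR expansion referred to above has the standard form below, with a product-type weight; the notation is the conventional one rather than the paper's own.

```latex
f(x_1,\dots,x_N) \;\approx\; f_0 + \sum_{i=1}^{N} f_i(x_i), \qquad
W(\mathbf{x}) = \prod_{j=1}^{N} w_j(x_j), \quad \int w_j(x_j)\,dx_j = 1,

f_0 = \int W(\mathbf{x})\, f(\mathbf{x})\, d\mathbf{x}, \qquad
f_i(x_i) = \int \Big(\prod_{j\neq i} w_j(x_j)\Big)\, f(\mathbf{x}) \prod_{j\neq i} dx_j \;-\; f_0 .
```

Optimizing the node weights w_j, rather than fixing them as Dirac deltas at the grid nodes, is what the fluctuation-free integration step makes possible.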

Findings

The method developed in this study is applied to six numerical examples with different structures, and very encouraging results are obtained. In addition, the new method is compared with other methods that use a Dirac delta type weight function, and the results are given in the numerical implementations section.

Originality/value

The authors' new method allows an optimized weight structure to be determined for the given modelling problem, instead of imposing a particular weight function such as the Dirac delta type weight. This gives the HDMR philosophy the flexibility of weight utilization in multivariate data modelling problems.

Details

Engineering Computations, vol. 29 no. 7
Type: Research Article
ISSN: 0264-4401

Article
Publication date: 27 September 2019

Giuseppe Orlando, Rosa Maria Mininni and Michele Bufalo

Abstract

Purpose

The purpose of this study is to suggest a new framework, which we call the CIR#, that allows forecasting interest rates from observed financial market data even when rates are negative. In doing so, the objective is to maintain the market volatility structure as well as the analytical tractability of the original CIR model.

Design/methodology/approach

The novelty of the proposed methodology consists in using the CIR model to forecast the evolution of interest rates by an appropriate partitioning of the data sample and calibration. The latter is performed by replacing the standard Brownian motion process in the random term of the model with normally distributed standardized residuals of the “optimal” autoregressive integrated moving average (ARIMA) model.
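
A toy sketch of the calibration idea (invented numbers, and a simple AR(1) fit standing in for the "optimal" ARIMA model — not the authors' procedure): fit the model to one data sub-group, standardize its residuals, and use them in place of Gaussian increments in an Euler discretisation of the CIR dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)
rates = 0.02 + 0.002 * np.cumsum(rng.normal(size=120))          # toy monthly short-rate series

# AR(1) fit by least squares: r_t = a + b * r_{t-1} + e_t (stand-in for the "optimal" ARIMA).
X = np.column_stack([np.ones(len(rates) - 1), rates[:-1]])
a, b = np.linalg.lstsq(X, rates[1:], rcond=None)[0]
resid = rates[1:] - (a + b * rates[:-1])
eps = (resid - resid.mean()) / resid.std()                      # standardized residuals

# CIR step: dr = k*(theta - r)*dt + sigma*sqrt(r)*dW, with dW replaced by eps*sqrt(dt).
k, theta, sigma, dt = 0.5, rates.mean(), 0.05, 1.0 / 12.0
r = np.empty(len(eps) + 1)
r[0] = rates[0]
for t, e in enumerate(eps):
    r[t + 1] = r[t] + k * (theta - r[t]) * dt + sigma * np.sqrt(max(r[t], 0.0)) * np.sqrt(dt) * e

print(np.round(r[:5], 4))
```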

Findings

The suggested model is quite powerful for the following reasons. First, the historical market data sample is partitioned into sub-groups to capture all the statistically significant changes of variance in the interest rates. An appropriate translation of market rates to positive values was included in the procedure to overcome the issue of negative/near-to-zero values. Second, this study has introduced a new way of calibrating the CIR model parameters to each sub-group partitioning the actual historical data. The standard Brownian motion process in the random part of the model is replaced with normally distributed standardized residuals of the "optimal" ARIMA model suitably chosen for each sub-group. As a result, exact CIR fitted values to the observed market data are calculated and the computational cost of the numerical procedure is considerably reduced. Third, this work shows that the CIR model is efficient and able to follow very closely the structure of market interest rates (especially for short maturities, which are notoriously difficult to handle) and to predict future interest rates better than the original CIR model. As a measure of goodness of fit, this study obtained high values of the R2 statistic and small values of the root mean square error for each sub-group and for the entire data sample.

Research limitations/implications

A limitation is related to the specific dataset, as we examine the period around the 2008 financial crisis, covering about five years of monthly data. Future research will show the predictive power of the model by extending the dataset in terms of frequency and size.

Practical implications

Improved ability to model/forecast interest rates.

Originality/value

The original value consists in turning the CIR from modeling instantaneous spot rates to forecasting any rate of the yield curve.

Details

Studies in Economics and Finance, vol. 37 no. 2
Type: Research Article
ISSN: 1086-7376

Article
Publication date: 1 June 2004

K.L. Lo and Haji Izham Haji Zainal Abidin

Abstract

This paper describes voltage collapse in power system networks and how it could lead to a collapse of the whole system. It discusses the effect of machine learning and artificial intelligence, which has led to new methods, and spotlights the fuzzy decision tree (FDT) method and its application to voltage collapse assessment. It concludes that FDT can identify and group data sets, giving a new understanding of its application in voltage collapse analysis.

Details

COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, vol. 23 no. 2
Type: Research Article
ISSN: 0332-1649

Article
Publication date: 30 November 2022

Luh Putu Eka Yani and Ammar Aamer

Abstract

Purpose

Demand forecasting significantly impacts supply chain (SC) design and recovery planning. The more accurate the demand forecast, the better the recovery plan and the more resilient the SC. Given the paucity of research about machine learning (ML) applications and the pharmaceutical industry's need for disruptive techniques, this study aims to investigate the applicability and effect of ML algorithms on demand forecasting. More specifically, the study identifies machine learning algorithms applicable to demand forecasting and assesses the forecasting accuracy of using ML in the pharmaceutical SC.

Design/methodology/approach

This research used a single-case explanatory methodology. The exploratory approach examined the study's objective and the impact of information technology acquisition. In this research, three experimental designs were carried out to test training data partitioning, apply ML algorithms and test different ranges of exclusion factors. The Konstanz Information Miner platform was used in this research.

Findings

Based on the analysis, this study could show that the most accurate training data partition was 80%, with random forest and simple tree outperforming other algorithms regarding demand forecasting accuracy. The improvement in demand forecasting accuracy ranged from 10% to 41%.
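
A minimal sketch of that best-performing configuration (scikit-learn instead of the Konstanz Information Miner platform, with invented demand data, so only an illustration of the idea): an 80/20 partition of lagged demand features feeding a random forest forecaster.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_percentage_error
from sklearn.model_selection import train_test_split

# Hypothetical monthly demand series; lagged values serve as forecasting features.
rng = np.random.default_rng(42)
demand = 100 + 10 * np.sin(np.arange(120) * 2 * np.pi / 12) + rng.normal(0, 3, 120)
X = np.column_stack([demand[i:i + 108] for i in range(3)])      # lags t-3, t-2, t-1
y = demand[3:111]

# 80/20 partition of the data, mirroring the best-performing split reported above.
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, shuffle=False)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
print("MAPE:", round(mean_absolute_percentage_error(y_test, model.predict(X_test)), 3))
```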

Research limitations/implications

This study provides practical and theoretical insights into the importance of applying disruptive techniques such as ML to improve the resilience of pharmaceutical SC design in such a disruptive time.

Originality/value

The findings of this research contribute to the limited knowledge about ML applications in demand forecasting, advancing knowledge of the different ML algorithms applicable to demand forecasting and their effectiveness. In addition, the study offers guidance for future research on expanding and analysing the applicability and effectiveness of ML algorithms in different sectors of the SC.

Details

International Journal of Pharmaceutical and Healthcare Marketing, vol. 17 no. 1
Type: Research Article
ISSN: 1750-6123

Article
Publication date: 1 April 1994

Michael Buckland and Christian Plaunt

Abstract

This article examines the structure and components of information storage and retrieval systems and information filtering systems. Analysis of the tasks performed in such selection systems leads to the identification of 13 components. Eight are necessarily present in all such systems, mechanized or not; the others may, but need not, be present. The authors argue that all selection systems can be represented in terms of combinations of these components. The components are of only two types: representations of data objects and functions that operate on them. Further, the functional components, or rules, reduce to two basic types: 1) transformation, making or modifying the members of a set of representations, and 2) sorting or partitioning. The representational transformations may be in the form of copies, excerpts, descriptions, abstractions, or mere identifying references. By partitioning, we mean dividing a set of objects by using matching, sorting, ranking, selecting, and other logically equivalent operations. The typical multiplicity of knowledge sources and of system vocabularies is noted. Some of the implications for the study, use, and design of information storage and retrieval systems are discussed.
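
An illustrative toy rendering (not the authors' formalism) of the two functional component types: a transformation that derives representations from data objects, and a partitioning function that divides the set of representations by matching.

```python
documents = ["parallel text retrieval", "xml data partitioning", "fuzzy decision trees"]

def transform(doc: str) -> set[str]:
    """Transformation component: derive a representation (here, a bag of terms) from a data object."""
    return set(doc.lower().split())

def partition(reps: dict[str, set[str]], query: set[str]) -> tuple[list[str], list[str]]:
    """Partitioning component: split the collection into matching and non-matching objects."""
    hits = [doc for doc, rep in reps.items() if query & rep]
    return hits, [doc for doc in reps if doc not in hits]

representations = {doc: transform(doc) for doc in documents}
selected, rejected = partition(representations, transform("data partitioning methods"))
print(selected)
```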

Details

Library Hi Tech, vol. 12 no. 4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 24 June 2019

Xiao Li, Hongtai Cheng and Xiaoxiao Liang

Abstract

Purpose

Learning from demonstration (LfD) provides an intuitive way for non-expert persons to teach robots new skills. However, the learned motion is typically fixed for a given scenario, which brings a serious adaptiveness problem for robots operating in unstructured environments, such as avoiding an obstacle that was not present during the original demonstrations. Therefore, the robot should be able to learn and execute new behaviors to accommodate the changing environment. To achieve this goal, this paper aims to propose an improved LfD method enhanced by an adaptive motion planning technique.

Design/methodology/approach

The LfD is based on the GMM/GMR method, which transforms the original off-line demonstrations into a compressed probabilistic model and recovers robot motion from the distributions. The central idea of this paper is to reshape the probabilistic model according to on-line observations, realized through re-sampling, data partitioning, data reorganization and motion re-planning. The re-planned motions are not unique, so a criterion is proposed to evaluate the fitness of each motion and to optimize among the candidates.
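
A small numerical sketch of the GMR step (hypothetical mixture parameters, not the paper's learned model): given a GMM fitted over (time, position) pairs from demonstrations, the motion is recovered by conditioning on time.

```python
import numpy as np

# Invented two-component GMM over (t, x); in practice this would come from off-line demonstrations.
weights = np.array([0.5, 0.5])
mu = np.array([[0.25, 0.1],                       # each row: (mu_t, mu_x) of one Gaussian component
               [0.75, 0.9]])
cov = np.array([[[0.02, 0.01], [0.01, 0.02]],     # 2x2 covariance per component over (t, x)
                [[0.02, 0.01], [0.01, 0.02]]])

def gmr(t):
    """Conditional mean E[x | t] under the mixture (Gaussian mixture regression)."""
    h = np.array([w * np.exp(-0.5 * (t - m[0]) ** 2 / c[0, 0]) / np.sqrt(2 * np.pi * c[0, 0])
                  for w, m, c in zip(weights, mu, cov)])
    h /= h.sum()                                   # responsibility of each component at time t
    cond = [m[1] + c[1, 0] / c[0, 0] * (t - m[0]) for m, c in zip(mu, cov)]
    return float(h @ np.array(cond))

trajectory = [gmr(t) for t in np.linspace(0, 1, 5)]
print(np.round(trajectory, 3))
```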

Findings

The proposed method is implemented in a robotic rope-disentangling task. The results show that the robot is able to complete its task while avoiding randomly distributed obstacles, thereby verifying the effectiveness of the proposed method. The main contributions of the proposed method are avoiding unforeseen obstacles in the unstructured environment and maintaining the crucial aspects of the motion that guarantee the skill/task is accomplished successfully.

Originality/value

Traditional methods are intrinsically based on motion planning technique and treat the off-line training data as a priori probability. The paper proposes a novel data-driven solution to achieve motion planning for LfD. When the environment changes, the off-line training data are revised according to external constraints and reorganized to generate new motion. Compared to traditional methods, the novel data-driven solution is concise and efficient.

Details

Industrial Robot: the international journal of robotics research and application, vol. 46 no. 4
Type: Research Article
ISSN: 0143-991X
