Search results
Abstract
Purpose
This paper proposes a resilient distributed processing technique (RDPT), in which the mapper and reducer are simplified with Spark contexts to support distributed parallel query processing.
Design/methodology/approach
The proposed work is implemented with Pig Latin with Spark contexts to develop query processing in a distributed environment.
Findings
Query processing in Hadoop relies on distributed processing with the MapReduce model. MapReduce distributes work across nodes through the implementation of complex mappers and reducers, but its results remain valid only up to a certain data size.
Originality/value
Pig provides the required parallel processing framework through constructs such as FOREACH, FLATTEN and COGROUP during query processing.
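The semantics of these operators can be illustrated with a small Python sketch; the relation names and data below are hypothetical, and a real Pig script would execute these constructs on a cluster rather than in local memory:

```python
from collections import defaultdict

def cogroup(left, right):
    """COGROUP: for each key, collect the bag of values from each relation."""
    groups = defaultdict(lambda: ([], []))
    for k, v in left:
        groups[k][0].append(v)
    for k, v in right:
        groups[k][1].append(v)
    return dict(groups)

def flatten(grouped):
    """FLATTEN: expand each (key, (bag1, bag2)) into flat tuples."""
    for k, (bag1, bag2) in grouped.items():
        for a in bag1:
            for b in bag2:
                yield (k, a, b)

orders = [("u1", "book"), ("u2", "pen"), ("u1", "lamp")]
users = [("u1", "Ann"), ("u2", "Bob")]

g = cogroup(orders, users)       # {'u1': (['book', 'lamp'], ['Ann']), ...}
flat = sorted(flatten(g))        # joined, un-nested tuples
```

A FOREACH would then map a function over each tuple of `flat`, which is exactly a per-record projection step.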
Alasdair J.G. Gray, Werner Nutt and M. Howard Williams
Abstract
Purpose
Distributed data streams are an important topic of current research. In such a setting, data values will be missed, e.g. due to network errors. This paper aims to allow this incompleteness to be detected and overcome with either the user not being affected or the effects of the incompleteness being reported to the user.
Design/methodology/approach
A model for representing the incomplete information has been developed that captures the information that is known about the missing data. Techniques for query answering involving certain and possible answer sets have been extended so that queries over incomplete data stream histories can be answered.
Findings
It is possible to detect when a distributed data stream is missing one or more values. When such data values are missing there will be some information that is known about the data and this is stored in an appropriate format. Even when the available data are incomplete, it is possible in some circumstances to answer a query completely. When this is not possible, additional meta‐data can be returned to inform the user of the effects of the incompleteness.
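Under the assumption that the stored information about a missing value takes the form of known bounds, the certain/possible distinction can be sketched in Python; the representation and the threshold query below are illustrative, not the paper's actual model:

```python
# A missing stream value is stored with known bounds (lo, hi).
# A predicate holds *certainly* if true in every completion of the data,
# and *possibly* if true in at least one completion.

def classify(value, lo, hi, threshold):
    """Classify the answer status of 'reading > threshold'."""
    if value is not None:              # value was observed
        return "certain" if value > threshold else "no"
    if lo > threshold:                 # every possible completion qualifies
        return "certain"
    if hi > threshold:                 # some completion qualifies
        return "possible"
    return "no"

history = [
    (12.0, None, None),   # observed value
    (None, 11.0, 15.0),   # missing, but bounds guarantee the answer
    (None, 5.0, 20.0),    # missing, answer only possible
    (None, 2.0, 9.0),     # missing, cannot qualify
]
answers = [classify(v, lo, hi, 10.0) for v, lo, hi in history]
```

Rows classified "possible" are where the meta-data about incompleteness would be reported back to the user.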
Research limitations/implications
The techniques and models proposed in this paper have only been partially implemented.
Practical implications
The proposed system is general and can be applied wherever there is a need to query the history of distributed data streams. The work in this paper enables the system to answer queries when there are missing values in the data.
Originality/value
This paper presents a general model of how to detect, represent, and answer historical queries over incomplete distributed data streams.
Jinbao Li, Yingshu Li, My T. Thai and Jianzhong Li
Abstract
This paper investigates query processing in MANETs, studying cache techniques and multi‐join database operations. For data caching, a group‐caching strategy is proposed: using the cache and the index of the cached data, queries can be processed at a single node or within the group containing that node. For multi‐join, a cost evaluation model and a query plan generation algorithm are presented. Query cost is evaluated from parameters including the size of the transmitted data, the transmission distance and the query cost at each single node. Based on these evaluations, the nodes on which the query should be executed and the join order are determined. Theoretical analysis and experimental results show that the proposed group‐caching based query processing and the cost‐based join strategy are efficient in MANETs, suiting the mobility, disconnection and multi‐hop features of such networks. The communication cost between nodes is reduced and query efficiency is greatly improved.
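A cost model of this general shape can be sketched in Python: enumerate candidate execution orders and pick the one minimizing transmission plus local cost. The relations, sizes, distances and the cost formula below are invented for illustration and are not the paper's actual model:

```python
from itertools import permutations

# Hypothetical per-node statistics: relation size (KB) and local query cost.
size = {"A": 40, "B": 10, "C": 25}
local = {"A": 3, "B": 1, "C": 2}
dist = {("A", "B"): 2, ("B", "A"): 2, ("B", "C"): 1,
        ("C", "B"): 1, ("A", "C"): 3, ("C", "A"): 3}

def plan_cost(order):
    """Cost = transmitted size x hop distance, plus local cost at each node."""
    cost, carried = 0, 0
    for src, dst in zip(order, order[1:]):
        carried += size[src]                      # intermediate result grows
        cost += carried * dist[(src, dst)] + local[dst]
    return cost

best = min(permutations(size), key=plan_cost)     # cheapest join order
```

Here the cheapest plan starts at the node whose relation is cheap to ship over short hops, which is the intuition behind cost-based join ordering in a bandwidth-constrained MANET.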
Usha Manasi Mohapatra, Babita Majhi and Alok Kumar Jagadev
Abstract
Purpose
The purpose of this paper is to propose three different distributed learning-based metaheuristic algorithms for the identification of nonlinear systems. The proposed algorithms are evaluated on problems for which input data are available at different geographic locations. In addition, the models are tested for nonlinear systems under different noise conditions. In a nutshell, the suggested model aims to handle voluminous data with low communication overhead compared to traditional centralized processing methodologies.
Design/methodology/approach
Population-based evolutionary algorithms such as genetic algorithm (GA), particle swarm optimization (PSO) and cat swarm optimization (CSO) are implemented in a distributed form to address the system identification problem having distributed input data. Out of different distributed approaches mentioned in the literature, the study has considered incremental and diffusion strategies.
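The incremental strategy can be sketched for a toy linear system: a parameter estimate circulates around a ring of nodes, and each node refines it with its local samples before passing it on. The system, the node data and the LMS-style update below are illustrative assumptions, not the paper's GA/PSO/CSO algorithms:

```python
# Incremental distributed learning sketch: the estimate w travels node to
# node in a ring; each node applies a local correction from its own data.
# True system (w = 2.0), node samples and step size mu are illustrative.

true_w = 2.0
node_data = [
    [(1.0, 2.0), (2.0, 4.0)],     # node 1: (input, output) samples
    [(0.5, 1.0), (1.5, 3.0)],     # node 2
    [(3.0, 6.0), (1.0, 2.0)],     # node 3
]

w, mu = 0.0, 0.1
for _cycle in range(50):          # estimate circulates around the ring
    for samples in node_data:     # one node at a time: low communication
        for x, y in samples:
            w += mu * x * (y - w * x)   # local gradient-style correction
```

Only the scalar estimate crosses node boundaries, never the raw samples, which is the communication-overhead advantage the abstract claims over centralized processing. A diffusion strategy would instead have each node combine the estimates of all its neighbours before its local update.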
Findings
Performances of the proposed distributed learning-based algorithms are compared for different noise conditions. The experimental results indicate that CSO performs better than GA and PSO at all noise strengths with respect to accuracy and error convergence rate; moreover, incremental CSO is slightly superior to diffusion CSO.
Originality/value
This paper employs evolutionary algorithms using distributed learning strategies and applies them to the identification of unknown systems. Very few existing studies have applied these distributed learning strategies to the parameter estimation task.
Abstract
This review reports on the current state and the potential of tools and systems designed to aid online searching, referred to here as online searching aids. Intermediary mechanisms are examined in terms of the two stage model, i.e. end‐user, intermediary, ‘raw database’, and different forms of user-system interaction are discussed. The evolution of the terminology of online searching aids is presented with special emphasis on the expert/non‐expert division. Terms defined include gateways, front‐end systems, intermediary systems and post‐processing. The alternative configurations that such systems can have and the approaches to the design of the user interface are discussed. The review then analyses the functions of online searching aids, i.e. logon procedures, access to hosts, help features, search formulation, query reformulation, database selection, uploading, downloading and post‐processing. Costs are then briefly examined. The review concludes by looking at future trends following recent developments in computer science and elsewhere. Distributed expert based information systems (debis), the standard generalised mark‐up language (SGML), the client‐server model, object‐orientation and parallel processing are expected to influence, if they have not done so already, the design and implementation of future online searching aids.
Gerti Kappel and Stefan Vieweg
Abstract
Changes in market and production profiles require a more flexible concept in manufacturing. Computer integrated manufacturing (CIM) describes an integrative concept for joining business and manufacturing islands. In this context, database technology is the key technology for implementing the CIM philosophy. However, CIM applications are more complex and thus more demanding than traditional database applications such as business and administrative applications. Systematically analyses the database requirements for CIM applications including business and manufacturing tasks. Special emphasis is given to integration requirements due to the distributed, partly isolated nature of CIM applications developed over the years. An illustrative sampling of current efforts in the database community to meet the challenge of non‐standard applications such as CIM is presented.
Michael J. Frasciello and John Richardson
Abstract
Library consortia require automation systems that adequately address the following questions: Can the system support centralized and decentralized server configurations? Does the software’s architecture accommodate changing requirements? Does the system provide seamless behavior? Contends that the evolution of distributed enterprise computing technology has brought the library automation industry to a new realization that automation systems engineered with an n‐tiered client/server architecture will best meet the needs of library consortia. Standards‐based distributed processing is the key to the n‐tier client/server paradigm. While some technologies (i.e. UNIX) provide for a single standard on which to define distributed processing, only Microsoft’s Windows NT supports multiple standards. From Microsoft’s perspective, the Windows NT operating system is the middle tier of the n‐tier client/server environment. To truly exploit the middle tier, an application must utilize Microsoft Transaction Server (MTS). Native Windows NT automation systems utilizing MTS are best positioned for the future because MTS assumes an n‐tier architecture with the middle tier (or tiers) deployed on Windows NT Server. “Native” NT applications are built in and for Microsoft Windows NT. Library consortia considering a native Windows NT automation system should evaluate the system’s distributed processing capabilities to determine its applicability to their needs. Library consortia can test a vendor’s claim to scalable distributed processing by asking three questions: Is the software dependent on the type of data being used? Does the software support logical and physical separation (distribution)? Does the software require a system shutdown to perform database or application updates?
Abstract
The computer systems developed during the 1960s and 1970s made very little impact on management decision. Management Information System design was constrained by three factors — the technology was large‐scale and inevitably centralised and controlled by data processing staff; the systems were designed by specialist staff who rarely understood the business requirements; and managers themselves had little knowledge or “hands‐on” experience of computers. In the 1980s a greater awareness of the need for planning and better use of personnel information, coupled with the development of distributed processing systems, has presented personnel management with opportunities to use computing technology as a means of increasing the professionalism of practising personnel managers. Effective use will only occur if the implementation of technology is matched by appraisal of skills and organisation within personnel departments. Staff will need a minimum level of computing expertise and some managers will need skills in modelling, particularly financial modelling. The relationship between personnel and data processing needs careful redefining to build a link between the two and data processing staff need to design and communicate an end‐user strategy.
Abstract
Purpose
Big data has posed problems for businesses, the information technology (IT) sector and the science community. These problems can be effectively addressed using cloud computing and associated distributed computing technology. Cloud computing and big data are two significant developments of recent years that allow high-efficiency, competitive computing tools to be delivered as IT services. The paper aims to examine the role of the cloud as a tool for managing big data in various aspects to help businesses.
Design/methodology/approach
This paper delivers solutions in the cloud for storing, compressing, analyzing and processing big data. Articles were therefore divided into four categories: big data storage, big data processing, big data analysis and data compression in cloud computing. The article is based on a systematic literature review of 19 published papers on big data.
Findings
From the results, it can be inferred that cloud computing technology has features that can be useful for big data management. Challenging issues are raised in each section. For example, in storing big data, privacy and security issues are challenging.
Research limitations/implications
There were limitations to this systematic review. The first limitation is that only English articles were reviewed. Also, articles that matched the keywords were used. Finally, in this review, authoritative articles were reviewed, and slides and tutorials were avoided.
Practical implications
The research presents new insight into the business value of cloud computing in interfirm collaborations.
Originality/value
Previous research has often examined other aspects of big data in the cloud. This article takes a new approach to the subject. It allows big data researchers to comprehend the various aspects of big data management in the cloud. In addition, setting an agenda for future research saves time and effort for readers searching for topics within big data.
Bao-Rong Chang, Hsiu-Fen Tsai, Yun-Che Tsai, Chin-Fu Kuo and Chi-Chung Chen
Abstract
Purpose
The purpose of this paper is to integrate and optimize a multiple big data processing platform with the features of high performance, high availability and high scalability in big data environment.
Design/methodology/approach
First, the integration of Apache Hive, Cloudera Impala and BDAS Shark makes the platform support SQL-like queries. Next, users access a single interface, and the proposed optimizer automatically selects the best-performing big data warehouse platform. Finally, the distributed memory storage system Memcached, incorporated with the distributed file system Apache HDFS, is employed for fast caching of query results. Therefore, if users issue the same SQL command, the result is returned rapidly from the cache system instead of incurring a repeated search of the big data warehouse and a longer retrieval time.
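The caching step can be sketched as follows; the class, the key scheme and the simulated engine are hypothetical stand-ins for the actual Memcached/HDFS integration:

```python
import hashlib

class QueryCache:
    """Sketch of the caching layer: an identical SQL command returns a
    cached result instead of re-running against the warehouse."""

    def __init__(self, engine):
        self.engine = engine          # function: sql -> result (slow path)
        self.store = {}               # stands in for Memcached
        self.hits = 0

    def query(self, sql):
        # Normalize the command so trivially different spellings share a key.
        key = hashlib.md5(sql.strip().lower().encode()).hexdigest()
        if key in self.store:
            self.hits += 1            # fast path: cached result
            return self.store[key]
        result = self.engine(sql)     # slow path: real warehouse scan
        self.store[key] = result
        return result

calls = []
def fake_engine(sql):                 # simulated warehouse for the sketch
    calls.append(sql)
    return [("row", 1)]

cache = QueryCache(fake_engine)
cache.query("SELECT * FROM t")
cache.query("select * from t")        # same command: served from cache
```

The second call never reaches the engine, which is the effect the abstract describes for high-repeatable SQL commands under multi-user mode.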
Findings
As a result, the proposed approach significantly improves overall performance and dramatically reduces search time when querying a database, especially for highly repeatable SQL commands under multi-user mode.
Research limitations/implications
Currently, Shark’s latest stable version 0.9.1 does not support the latest versions of Spark and Hive. In addition, this series of software only supports Oracle JDK7. Using Oracle JDK8 or Open JDK will cause serious errors, and some software will be unable to run.
Practical implications
The problem with this system is that some blocks are missing when too many blocks are stored in one result (about 100,000 records). Another problem is that sequential writing into the in-memory cache wastes time.
Originality/value
When the remaining memory capacity is 2 GB or less on each server, Impala and Shark will have a lot of page swapping, causing extremely low performance. When the data scale is larger, it may cause a JVM I/O exception and make the program crash. However, when the remaining memory capacity is sufficient, Shark is faster than Hive and Impala. Impala’s consumption of memory resources is between those of Shark and Hive, and this amount of remaining memory is sufficient for Impala’s maximum performance. In this study, each server allocates 20 GB of memory for cluster computing and sets the amount of remaining memory as Level 1: 3 percent (0.6 GB), Level 2: 15 percent (3 GB) and Level 3: 75 percent (15 GB) as the critical points. The program automatically selects Hive when memory is less than 15 percent, Impala at 15 to 75 percent and Shark at more than 75 percent.
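The threshold-based selection described above can be expressed directly; the engine names and cut-offs come from the abstract, while the function name and boundary handling are assumptions:

```python
def select_engine(remaining_fraction):
    """Pick the query engine from the fraction of memory still free,
    following the thresholds reported in the study."""
    if remaining_fraction < 0.15:
        return "Hive"        # low memory: disk-based engine avoids swapping
    if remaining_fraction <= 0.75:
        return "Impala"      # moderate memory consumption
    return "Shark"           # ample memory: fastest in-memory engine

# e.g. with 0.6 GB of 20 GB free (3 percent), fall back to Hive
engine = select_engine(0.6 / 20)
```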