Search results

1 – 4 of 4

Open Access

Article

Publication date: 12 March 2018

A framework for big data analytics approach to failure prediction of construction firms

Hafiz A. Alaka, Lukumon O. Oyedele, Hakeem A. Owolabi, Muhammad Bilal, Saheed O. Ajayi and Olugbenga O. Akinade

Downloads: 988

Abstract

This study explored the use of big data analytics (BDA) to analyse data from a large number of construction firms in order to develop a construction business failure prediction model (CB-FPM). A careful analysis of the literature revealed financial ratios to be the best form of variable for this problem. Because MapReduce is unsuitable for the iterative problems involved in developing CB-FPMs, various BDA initiatives for iterative problems were identified. A BDA framework for developing CB-FPMs was proposed. It was validated using 150,000 data cells from 30,000 construction firms, an artificial neural network, Amazon Elastic Compute Cloud, Apache Spark and the R software. The BDA CB-FPM was developed in eight seconds, while the same process without BDA was aborted after nine hours without success. This shows that the reluctance to use large datasets to develop CB-FPMs, which stems from the tedious duration, can be resolved by applying BDA techniques. The BDA CB-FPM largely outperformed an ordinary CB-FPM developed with a dataset of 200 construction firms, showing that using a larger sample size with the aid of BDA leads to better-performing CB-FPMs. The high financial and social cost associated with misclassifications (i.e. model error) thus makes adoption of BDA CB-FPMs very important for, among others, financiers, clients and policy makers.

Details

Applied Computing and Informatics, vol. 16 no. 1/2

Type: Research Article

ISSN: 2634-1964

Open Access

Article

Publication date: 3 August 2020

DNA short read alignment on apache spark

Maryam AlJame and Imtiaz Ahmad

Downloads: 1156

Abstract

The evolution of technologies has unleashed a wealth of challenges by generating massive amounts of data. Recently, biological data has increased exponentially, which has introduced several computational challenges. DNA short read alignment is an important problem in bioinformatics. The exponential growth in the number of short reads has increased the need for an ideal platform to accelerate the alignment process. Apache Spark is a cluster-computing framework that provides data parallelism and fault tolerance. In this article, we propose a Spark-based algorithm, called Spark-DNAligning, to accelerate DNA short read alignment. Spark-DNAligning exploits Apache Spark's performance optimizations such as broadcast variables, join after partitioning, caching and in-memory computation. Spark-DNAligning is evaluated in terms of performance by comparing it with the SparkBWA tool and a MapReduce-based algorithm called CloudBurst. All the experiments are conducted on Amazon Web Services (AWS). Results demonstrate that Spark-DNAligning outperforms both tools by providing a speedup in the range of 101–702 in aligning gigabytes of short reads to the human genome. Empirical evaluation reveals that Apache Spark offers promising solutions to the DNA short read alignment problem.
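As a rough illustration of the kind of lookup a short read aligner performs, the sketch below builds a k-mer index over a reference sequence and uses it to place reads with a bounded number of mismatches. It is a generic seed-and-verify toy in plain Python, not the Spark-DNAligning algorithm; the function names, the value of `k` and the toy sequences are all assumptions made for illustration. In a Spark job, a read-only index of this kind is the sort of structure that would typically be shared with workers as a broadcast variable.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def align_read(read, reference, index, k, max_mismatches=1):
    """Return sorted reference start positions where the read fits.

    Each k-mer of the read is used as a seed; every seed hit proposes a
    candidate start, which is then verified by counting mismatches.
    """
    hits = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for pos in index.get(seed, []):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # candidate window falls outside the reference
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)


# Hypothetical toy data, far smaller than a real genome.
REFERENCE = "ACGTACGTTAGC"
K = 3
INDEX = build_kmer_index(REFERENCE, K)

exact_hits = align_read("ACGTT", REFERENCE, INDEX, K, max_mismatches=0)
fuzzy_hits = align_read("ACGTT", REFERENCE, INDEX, K, max_mismatches=1)
```

Because each read is aligned independently against the same index, the per-read work parallelises naturally across partitions of the read set.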

Details

Applied Computing and Informatics, vol. 19 no. 1/2

Type: Research Article

ISSN: 2634-1964

View access options

Article

Publication date: 6 August 2021

Performance evaluation of GPU- and cluster-computing for parallelization of compute-intensive tasks

Alexander Döschl, Max-Emanuel Keller and Peter Mandl

Abstract

Purpose

This paper aims to evaluate different approaches for the parallelization of compute-intensive tasks. The study compares a Java multi-threaded algorithm, distributed computing solutions with the MapReduce (Apache Hadoop) and resilient distributed dataset (RDD) (Apache Spark) paradigms, and a graphics processing unit (GPU) approach with Numba for compute unified device architecture (CUDA).

Design/methodology/approach

The paper uses a simple but computationally intensive puzzle as a case study for experiments. To find all solutions using brute force search, 15! permutations had to be computed and tested against the solution rules. The experimental application comprises a Java multi-threaded algorithm, distributed computing solutions with MapReduce (Apache Hadoop) and RDD (Apache Spark) paradigms and a GPU approach with Numba for CUDA. The implementations were benchmarked on Amazon-EC2 instances for performance and scalability measurements.
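The brute-force search described above can be sketched on a much smaller scale. The abstract does not state the puzzle's rules, so the rule below (no two neighbouring values may differ by exactly 1) and all function names are hypothetical stand-ins. The point of the sketch is the partitioning: fixing the first element splits the search space into independent chunks, which is one plausible way such work could be handed to threads, cluster tasks or GPU blocks.

```python
from itertools import permutations
from math import factorial


def is_solution(perm):
    """Hypothetical rule: neighbouring values must not differ by exactly 1."""
    return all(abs(a - b) != 1 for a, b in zip(perm, perm[1:]))


def count_partition(n, first):
    """Count solutions among permutations of 1..n that start with `first`.

    Each value of `first` defines an independent chunk of the search space.
    """
    rest = [x for x in range(1, n + 1) if x != first]
    return sum(is_solution((first,) + tail) for tail in permutations(rest))


def count_all(n):
    """Combine the per-partition counts, as a reduce step would."""
    return sum(count_partition(n, first) for first in range(1, n + 1))


# Size of the full search space mentioned in the abstract.
SEARCH_SPACE_15 = factorial(15)
```

Running the sketch for n = 15 in pure Python would be impractically slow, since 15! is roughly 1.3 trillion candidates, which is why the study turns to parallel and GPU implementations.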

Findings

The comparison of the solutions with Apache Hadoop and Apache Spark under Amazon EMR showed that the processing time measured in CPU minutes with Spark was up to 30% lower, while the performance of Spark especially benefits from an increasing number of tasks. With the CUDA implementation, more than 16 times faster execution is achievable for the same price compared to the Spark solution. Apart from the multi-threaded implementation, the processing times of all solutions scale approximately linearly. Finally, several application suggestions for the different parallelization approaches are derived from the insights of this study.

Originality/value

There are numerous studies that have examined the performance of parallelization approaches. Most of these studies deal with processing large amounts of data or mathematical problems. This work, in contrast, compares these technologies on their ability to implement computationally intensive distributed algorithms.

Details

International Journal of Web Information Systems, vol. 17 no. 4

Type: Research Article

ISSN: 1744-0084

View access options

Article

Publication date: 30 September 2021

Analysis of barriers intensity for investment in big data analytics for sustainable manufacturing operations in post-COVID-19 pandemic era

Narender Kumar, Girish Kumar and Rajesh Kr Singh

Downloads: 877

Abstract

Purpose

The study presents various barriers to adopting big data analytics (BDA) for sustainable manufacturing operations (SMOs) in the post-coronavirus disease (COVID-19) pandemic era. In this study, 17 barriers to investing in BDA implementation are identified through an extensive literature review and experts’ opinions. A questionnaire-based survey is conducted to collect responses from experts. The identified barriers are grouped into three categories with the help of factor analysis: organizational barriers, data management barriers and human barriers. For the quantification of barriers, the graph theory matrix approach (GTMA) is applied.

Design/methodology/approach

The study presents various barriers to adopting BDA for SMOs in the post-COVID-19 pandemic era. In this study, 17 barriers to investing in BDA implementation are identified through an extensive literature review and experts’ opinions. A questionnaire-based survey is conducted to collect responses from experts. The identified barriers are grouped into three categories with the help of factor analysis: organizational barriers, data management barriers and human barriers. For the quantification of barriers, the GTMA is applied.
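GTMA typically summarises a set of interacting factors by the permanent of a matrix whose diagonal holds the individual factor ratings and whose off-diagonal entries hold their interdependencies. The sketch below computes such a permanent directly from its definition. The 3 x 3 matrix is purely hypothetical and is not taken from the study; the function name is likewise an assumption.

```python
from itertools import permutations
from math import prod


def permanent(matrix):
    """Permanent of a square matrix: like the determinant, but all terms are added."""
    n = len(matrix)
    return sum(
        prod(matrix[row][cols[row]] for row in range(n))
        for cols in permutations(range(n))
    )


# Hypothetical ratings: diagonal = barrier intensity,
# off-diagonal = influence of one barrier on another.
example_matrix = [
    [5, 2, 3],
    [1, 4, 2],
    [2, 3, 6],
]
example_index = permanent(example_matrix)
```

Because no terms are subtracted, every rating and interdependency contributes positively to the index. This direct enumeration grows factorially with matrix size, so it is practical only for small matrices.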

Findings

The study identifies barriers to investment in BDA implementation. It categorizes the barriers based on factor analysis and computes the intensity of each barrier category for BDA investment for SMOs. It is observed that the organizational barriers have the highest intensity, whereas the human barriers have the lowest intensity.

Practical implications

This study may help organizations to take strategic decisions for investing in BDA applications for achieving one of the sustainable development goals. Organizations should prioritize their efforts first to counter the barriers under the category of organizational barriers followed by barriers in data management and human barriers.

Originality/value

The novelty of this paper is that barriers to BDA investment for SMOs in the context of Indian manufacturing organizations have been analyzed. The findings of the study will assist the professionals and practitioners in formulating policies based on the actual nature and intensity of the barriers.

Details

Journal of Enterprise Information Management, vol. 35 no. 1

Type: Research Article

ISSN: 1741-0398
