Search results

1 – 10 of over 32000
Article
Publication date: 23 November 2022

Ibrahim Karatas and Abdulkadir Budak

Abstract

Purpose

The study aims to compare the prediction success of basic and ensemble machine learning models and, accordingly, to create novel prediction models by combining machine learning models to increase prediction success for construction labor productivity.

Design/methodology/approach

Categorical and numerical data used in many studies in the literature to predict construction labor productivity were preprocessed for analysis. The Python programming language was used to develop machine learning models. After many variation trials, the models were combined and the proposed novel voting and stacking meta-ensemble machine learning models were constituted. Finally, the models were compared using Target and Taylor diagrams.

Findings

Meta-ensemble models have been developed for labor productivity prediction by combining machine learning models. A voting ensemble combining the et, gbm, xgboost, lightgbm, catboost and mlp models and a stacking ensemble combining the et, gbm, xgboost, catboost and mlp models were created, with the et model selected as meta-learner. In terms of prediction success, the voting and stacking meta-ensemble algorithms outperformed the other machine learning algorithms. Model evaluation metrics, namely MAE, MSE, RMSE and R2, were selected to measure the prediction success. For the voting meta-ensemble algorithm, the values of MAE, MSE, RMSE and R2 are 0.0499, 0.0045, 0.0671 and 0.7886, respectively. For the stacking meta-ensemble algorithm, the values of MAE, MSE, RMSE and R2 are 0.0469, 0.0043, 0.0658 and 0.7967, respectively.
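
To make the four evaluation metrics and the idea of a voting ensemble concrete, the following is a minimal pure-Python sketch. It is not the authors' code: the function names and all data values are hypothetical, and the voting step is assumed here to be a plain unweighted average of base-model predictions.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

def voting_average(predictions_per_model):
    """Average the predictions of several base models, sample by sample."""
    return [sum(col) / len(col) for col in zip(*predictions_per_model)]

# Hypothetical productivity values and base-model outputs (illustration only).
y_true = [0.50, 0.62, 0.48, 0.71]
model_a = [0.52, 0.60, 0.50, 0.65]
model_b = [0.46, 0.66, 0.44, 0.75]

ensemble = voting_average([model_a, model_b])
metrics = regression_metrics(y_true, ensemble)
```

Because RMSE is the square root of MSE, the reported voting values are mutually consistent up to rounding: the square root of 0.0045 is approximately 0.0671.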

Research limitations/implications

The study compares machine learning algorithms with the novel meta-ensemble machine learning algorithms created to predict the labor productivity of construction formwork activity. Practitioners and project planners can use this model as a reliable and accurate tool for predicting the labor productivity of construction formwork activity prior to construction planning.

Originality/value

The study provides insight into the application of ensemble machine learning algorithms in predicting construction labor productivity. Additionally, novel meta-ensemble algorithms have been used and proposed. Therefore, it is hoped that predicting the labor productivity of construction formwork activity with high accuracy will make a great contribution to construction project management.

Details

Engineering, Construction and Architectural Management, vol. 31 no. 3
Type: Research Article
ISSN: 0969-9988

Open Access
Article
Publication date: 31 July 2023

Daniel Šandor and Marina Bagić Babac

Abstract

Purpose

Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using the approach of machine and deep learning.

Design/methodology/approach

For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.

Findings

The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely a BERT-based model, outperformed the other machine and deep learning models.

Originality/value

This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.

Details

Information Discovery and Delivery, vol. 52 no. 2
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 6 February 2023

Marko Kureljusic and Jonas Metz

Abstract

Purpose

The accurate prediction of incoming cash flows enables more effective cash management and allows firms to shape their planning based on forward-looking information. Although most firms are aware of the benefits of these forecasts, many still have difficulties identifying and implementing an appropriate prediction model. With the rise of machine learning algorithms, numerous new forecasting techniques have emerged. These techniques are theoretically applicable for predicting customer payment behavior but have not yet been adequately investigated. This study aims to close this research gap by examining which machine learning algorithm is the most appropriate for predicting customer payment dates.

Design/methodology/approach

By using various machine learning algorithms, the authors evaluate whether customer payment behavior patterns can be identified and predicted. The study is based on real-world transaction data from a DAX-40 firm with over 1,000,000 invoices in the dataset, with the data covering the period 2017–2019.

Findings

The authors' results show that neural networks in particular are suitable for predicting customers' payment dates. Furthermore, the authors demonstrate that contextual and logical prediction models can provide more accurate forecasts than conventional baseline models, such as linear and multivariate regression.

Research limitations/implications

Future cash flow forecasting studies should incorporate naïve prediction models, as the authors demonstrate that these models can compete with conventional baseline models used in existing machine learning research. However, the authors expect that with more in-depth information about the customer (creditworthiness, accounting structure) the results can be even further improved.

Practical implications

The knowledge of customers' future payment dates enables firms to change their perspective and move from reactive to proactive cash management. This shift leads to a more targeted dunning process.

Originality/value

To the best of the authors' knowledge, no study has yet been conducted that interprets the prediction of incoming payments as a daily rolling forecast by comparing naïve forecasts with forecasts based on machine learning and deep learning models.

Details

Journal of Applied Accounting Research, vol. 24 no. 4
Type: Research Article
ISSN: 0967-5426

Open Access
Article
Publication date: 14 July 2022

Karlo Puh and Marina Bagić Babac

Abstract

Purpose

As the tourism industry becomes more vital for the success of many economies around the world, the importance of technology in tourism grows daily. Alongside the increasing importance and popularity of tourism, the amount of significant data grows, too. On a daily basis, millions of people write their opinions, suggestions and views about accommodation, services, and much more on various websites. Well-processed and filtered data can provide a lot of useful information that can be used to make tourists' experiences much better and to help them decide when selecting a hotel or a restaurant. Thus, the purpose of this study is to explore machine and deep learning models for predicting sentiment and rating from tourist reviews.

Design/methodology/approach

This paper used machine learning models such as Naïve Bayes, support vector machines (SVM), convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (BiLSTM) for extracting sentiment and ratings from tourist reviews. These models were trained to classify reviews into positive, negative, or neutral sentiment, and into one to five grades or stars. Data used for training the models were gathered from TripAdvisor, the world's largest travel platform. The models based on multinomial Naïve Bayes (MNB) and SVM were trained using the term frequency-inverse document frequency (TF-IDF) for word representations while deep learning models were trained using global vectors (GloVe) for word representation. The results from testing these models are presented, compared and discussed.
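
As a rough illustration of the TF-IDF word representation mentioned above, here is a minimal pure-Python sketch. It assumes the plain unsmoothed variant (term frequency divided by document length, multiplied by the natural log of the number of documents over the document frequency); libraries often use smoothed variants, and the abstract does not say which one the study used. The function name and the sample reviews are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Compute unsmoothed TF-IDF weights for a list of tokenized documents."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical, already-tokenized reviews (illustration only).
reviews = [
    "great hotel great staff".split(),
    "dirty room rude staff".split(),
    "great location".split(),
]
weights = tf_idf(reviews)
```

In this sketch a term that is frequent within one review but rare across reviews receives a high weight, while a term that occurs in every review would receive a weight of zero.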

Findings

The machine and deep learning models achieved high accuracy in predicting positive, negative, or neutral sentiments and ratings from tourist reviews. The optimal model architecture for both classification tasks was a deep learning model based on BiLSTM. The study’s results confirmed that deep learning models are more efficient and accurate than machine learning algorithms.

Practical implications

The proposed models allow for forecasting the number of tourist arrivals and expenditure, gaining insights into the tourists' profiles, improving overall customer experience, and upgrading marketing strategies. Different service sectors can use the implemented models to get insights into customer satisfaction with the products and services as well as to predict the opinions given a particular context.

Originality/value

This study developed and compared different machine learning models for classifying customer reviews as positive, negative, or neutral, as well as predicting ratings with one to five stars based on a TripAdvisor hotel reviews dataset that contains 20,491 unique hotel reviews.

Details

Journal of Hospitality and Tourism Insights, vol. 6 no. 3
Type: Research Article
ISSN: 2514-9792

Article
Publication date: 30 December 2022

Aishwarya Narang, Ravi Kumar and Amit Dhiman

Abstract

Purpose

This study seeks to understand the connection of methodology by finding relevant papers and their full review using the “Preferred Reporting Items for Systematic Reviews and Meta-Analyses” (PRISMA).

Design/methodology/approach

Concrete-filled steel tubular (CFST) columns have gained popularity in construction in recent decades as they offer the benefit of constituent materials and cost-effectiveness. Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Gene Expression Programming (GEP) and Decision Trees (DTs) are some of the approaches that have been widely used in recent decades in structural engineering to construct predictive models, resulting in effective and accurate decision making. Despite the fact that there are numerous research studies on the various parameters that influence the axial compression capacity (ACC) of CFST columns, there is no systematic review of these Machine Learning methods.

Findings

The implications of a variety of structural characteristics on machine learning performance parameters are addressed and reviewed. The comparison analysis of current design codes and machine learning tools to predict the performance of CFST columns is summarized. The discussion results indicate that machine learning tools better understand complex datasets and intricate testing designs.

Originality/value

This study examines machine learning techniques for forecasting the axial bearing capacity of concrete-filled steel tubular (CFST) columns. This paper also highlights the drawbacks of utilizing existing techniques to build CFST columns, and the benefits of Machine Learning approaches over them. This article attempts to introduce beginners and experienced professionals to various research trajectories.

Details

Multidiscipline Modeling in Materials and Structures, vol. 19 no. 2
Type: Research Article
ISSN: 1573-6105

Article
Publication date: 16 August 2021

Rajshree Varma, Yugandhara Verma, Priya Vijayvargiya and Prathamesh P. Churi

Abstract

Purpose

The rapid advancement of technology in online communication and fingertip access to the Internet has resulted in the expedited dissemination of fake news to engage a global audience at a low cost by news channels, freelance reporters and websites. Amid the coronavirus disease 2019 (COVID-19) pandemic, individuals are exposed to these false and potentially harmful claims and stories, which may harm the vaccination process. Psychological studies reveal that the human ability to detect deception is only slightly better than chance; therefore, there is a growing need to seriously consider developing automated strategies to combat fake news that traverses these platforms at an alarming rate. This paper systematically reviews the existing fake news detection technologies by exploring various machine learning and deep learning techniques pre- and post-pandemic, which has never been done before to the best of the authors’ knowledge.

Design/methodology/approach

The detailed literature review on fake news detection is divided into three major parts. The authors searched papers no later than 2017 on fake news detection approaches on deep learning and machine learning. The papers were initially searched through the Google Scholar platform and were scrutinized for quality. The authors kept “Scopus” and “Web of Science” as quality indexing parameters. All research gaps and available databases, data pre-processing, feature extraction techniques and evaluation methods for current fake news detection technologies have been explored and illustrated using tables, charts and trees.

Findings

The paper is divided into two approaches, namely machine learning and deep learning, to present a better understanding and a clear objective. Next, the authors present a viewpoint on which approach is better and on future research trends, issues and challenges for researchers, given the relevance and urgency of a detailed and thorough analysis of existing models. This paper also delves into fake news detection during COVID-19, and it can be inferred that research and modeling are shifting toward the use of ensemble approaches.

Originality/value

The study also identifies several novel automated web-based approaches used by researchers to assess the validity of pandemic news that have proven to be successful, although currently reported accuracy has not yet reached consistent levels in the real world.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 14 no. 4
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 31 January 2022

Simone Massulini Acosta and Angelo Marcio Oliveira Sant'Anna

Abstract

Purpose

Process monitoring is a way to manage the quality characteristics of products in manufacturing processes. Several process monitoring approaches based on machine learning algorithms have been proposed in the literature and have gained the attention of many researchers. In this paper, the authors developed machine learning-based control charts for monitoring the fraction of non-conforming products in smart manufacturing. This study proposed a relevance vector machine using a Bayesian sparse kernel optimized by a differential evolution algorithm for efficient monitoring in manufacturing.

Design/methodology/approach

A new approach to data analysis, modelling and monitoring in the manufacturing industry was carried out. This study developed a relevance vector machine using a Bayesian sparse kernel technique to improve on the support vector machine used for both regression and classification problems. The authors compared the performance of the proposed relevance vector machine with other machine learning algorithms, such as the support vector machine, artificial neural network and beta regression model. The proposed approach was evaluated on the average run length under different shift scenarios using Monte Carlo simulation.
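
To illustrate what estimating the average run length (ARL) by Monte Carlo simulation involves, here is a minimal sketch. It deliberately swaps in a classical Shewhart p-chart with 3-sigma limits rather than the relevance vector machine chart proposed in the paper, whose details the abstract does not give. The function name, sample size, in-control fraction and shift values are all hypothetical.

```python
import math
import random

def simulate_arl(p_actual, p_in_control=0.10, sample_size=50,
                 replications=500, max_samples=10_000, seed=0):
    """Estimate the ARL of a 3-sigma p-chart when the true fraction is p_actual."""
    rng = random.Random(seed)
    sigma = math.sqrt(p_in_control * (1 - p_in_control) / sample_size)
    ucl = p_in_control + 3 * sigma            # upper control limit
    lcl = max(0.0, p_in_control - 3 * sigma)  # lower control limit, floored at 0
    run_lengths = []
    for _ in range(replications):
        for t in range(1, max_samples + 1):
            # Draw one sample and count non-conforming items.
            defects = sum(rng.random() < p_actual for _ in range(sample_size))
            fraction = defects / sample_size
            if fraction > ucl or fraction < lcl:
                break  # the chart signals at sample t
        run_lengths.append(t)  # censored at max_samples if no signal occurred
    return sum(run_lengths) / len(run_lengths)

# Two hypothetical shift scenarios away from the in-control fraction of 0.10.
arl_small_shift = simulate_arl(p_actual=0.20)
arl_large_shift = simulate_arl(p_actual=0.30)
```

A larger shift in the non-conforming fraction should be detected sooner, so its estimated ARL is expected to be smaller; comparing such ARL values across charts is one common way to rank monitoring methods.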

Findings

The authors analyse a real case study in a manufacturing company, based on the best machine learning algorithms. The results indicate that the proposed relevance vector machine-based process monitoring is an excellent quality tool for monitoring defective products in the manufacturing process. A comparative analysis with four machine learning models is used to evaluate the performance of the proposed approach. The relevance vector machine has slightly better performance than the support vector machine, artificial neural network and beta models.

Originality/value

This research differs from others by providing approaches for monitoring defective products. Machine learning-based control charts are used to monitor product failures in smart manufacturing processes. In addition, the key contribution of this study is to develop different models for fault detection and to identify any change point in the manufacturing process. Moreover, the authors’ research indicates that machine learning models are adequate tools for modelling and monitoring the fraction of non-conforming products in the industrial process.

Details

International Journal of Quality & Reliability Management, vol. 40 no. 3
Type: Research Article
ISSN: 0265-671X

Article
Publication date: 17 March 2023

Stewart Jones

Abstract

Purpose

This study updates the literature review of Jones (1987) published in this journal. The study pays particular attention to two important themes that have shaped the field over the past 35 years: (1) the development of a range of innovative new statistical learning methods, particularly advanced machine learning methods such as stochastic gradient boosting, adaptive boosting, random forests and deep learning, and (2) the emergence of a wide variety of bankruptcy predictor variables extending beyond traditional financial ratios, including market-based variables, earnings management proxies, auditor going concern opinions (GCOs) and corporate governance attributes. Several directions for future research are discussed.

Design/methodology/approach

This study provides a systematic review of the corporate failure literature over the past 35 years with a particular focus on the emergence of new statistical learning methodologies and predictor variables. This synthesis of the literature evaluates the strengths and limitations of different modelling approaches under different circumstances and provides an overall evaluation of the relative contribution of alternative predictor variables. The study aims to provide a transparent, reproducible and interpretable review of the literature. The literature review also takes a theme-centric rather than author-centric approach and focuses on structured themes that have dominated the literature since 1987.

Findings

There are several major findings of this study. First, advanced machine learning methods appear to have the most promise for future firm failure research. Not only do these methods predict significantly better than conventional models, but they also possess many appealing statistical properties. Second, there is now a much wider range of variables being used to model and predict firm failure. However, the literature needs to be interpreted with some caution given the many mixed findings. Finally, there are still a number of unresolved methodological issues arising from the Jones (1987) study that require research attention.

Originality/value

The study explains the connections and derivations between a wide range of firm failure models, from simpler linear models to advanced machine learning methods such as gradient boosting, random forests, adaptive boosting and deep learning. The paper highlights the most promising models for future research, particularly in terms of their predictive power, underlying statistical properties and issues of practical implementation. The study also draws together an extensive literature on alternative predictor variables and provides insights into the role and behaviour of alternative predictor variables in firm failure research.

Details

Journal of Accounting Literature, vol. 45 no. 2
Type: Research Article
ISSN: 0737-4607

Article
Publication date: 2 February 2022

Deepak Suresh Asudani, Naresh Kumar Nagwani and Pradeep Singh

Abstract

Purpose

Classifying emails as ham or spam based on their content is essential. Determining the semantic and syntactic meaning of words and putting them in a high-dimensional feature vector form for processing is the most difficult challenge in email categorization. The purpose of this paper is to examine the effectiveness of the pre-trained embedding model for the classification of emails using deep learning classifiers such as the long short-term memory (LSTM) model and convolutional neural network (CNN) model.

Design/methodology/approach

In this paper, global vectors (GloVe) and Bidirectional Encoder Representations from Transformers (BERT) pre-trained word embeddings are used to identify relationships between words, which helps to classify emails into their relevant categories using machine learning and deep learning models. Two benchmark datasets, SpamAssassin and Enron, are used in the experimentation.

Findings

In the first set of experiments, among the machine learning classifiers, the support vector machine (SVM) model performs better than the other machine learning methodologies. The second set of experiments compares deep learning model performance without embedding, with GloVe embedding and with BERT embedding. The experiments show that GloVe embedding can be helpful for faster execution with better performance on large-sized datasets.

Originality/value

The experiment reveals that the CNN model with GloVe embedding gives slightly better accuracy than the model with BERT embedding and traditional machine learning algorithms when classifying an email as ham or spam. It is concluded that word embedding models improve the accuracy of email classifiers.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Open Access
Article
Publication date: 25 October 2019

Ning Yan and Oliver Tat-Sheung Au

Abstract

Purpose

The purpose of this paper is to make a correlation analysis between students’ online learning behavior features and course grade, and to attempt to build some effective prediction model based on limited data.

Design/methodology/approach

The prediction label in this paper is the course grade of students, and the available features are student age, student gender, connection time, hits count and days of access. The machine learning model used in this paper is the classical three-layer feedforward neural network, trained with the scaled conjugate gradient algorithm. Pearson correlation analysis is used to find the relationships between course grade and the student features.
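
For readers unfamiliar with Pearson correlation analysis, the following minimal pure-Python sketch computes the coefficient between one feature and the course grade. The function name and all data values are hypothetical and are not taken from the paper's data set.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five students (illustration only).
days_of_access = [3, 10, 15, 22, 30]
course_grade = [52, 60, 71, 78, 90]

r = pearson_r(days_of_access, course_grade)
```

The coefficient lies between -1 and 1; values near 1 indicate a strong positive linear relationship between the feature and the grade, and values near 0 indicate little linear relationship.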

Findings

Days of access has the highest correlation with course grade, followed by hits count, while connection time is less relevant to students’ course grade. Student age and gender have the lowest correlation with course grade. Binary classification models have much higher prediction accuracy than multi-class classification models. Data normalization and data discretization can effectively improve the prediction accuracy of machine learning models, such as the ANN model in this paper.
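
The abstract does not specify which normalization or discretization scheme was used. As one common possibility, the sketch below assumes min-max scaling to the range 0 to 1 and equal-width binning; the function names and data values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def discretize(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (indices 0..n_bins-1)."""
    normalized = min_max_normalize(values)
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]

# Hypothetical feature and grade columns (illustration only).
hits_count = [12, 40, 75, 130, 200]
normalized_hits = min_max_normalize(hits_count)

# Two bins turn a numeric grade into a binary label (lower half vs upper half).
grade_bins = discretize([35, 58, 64, 81, 97], n_bins=2)
```

Discretizing the grade into two bins corresponds to the binary classification setting, which the findings above report as easier to predict than a multi-class setting.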

Originality/value

This paper may help teachers to find clues for identifying students with learning difficulties in advance and to give timely help based on online learning behavior data. It shows that acceptable prediction models based on machine learning can be built using a small and limited data set. However, introducing external data into machine learning models to improve their prediction accuracy remains a valuable and hard issue.

Details

Asian Association of Open Universities Journal, vol. 14 no. 2
Type: Research Article
ISSN: 2414-6994
