Search results

1 – 10 of 139

View access options

Article

Publication date: 4 April 2022

Prediction of polarities of online hotel reviews: an improved stacked decision tree (ISD) approach

Shrawan Kumar Trivedi, Amrinder Singh and Somesh Kumar Malhotra

There is a need to predict whether the consumers liked the stay in the hotel rooms or not, and to remove the aspects the customers did not like. Many customers leave a review…

HTML

PDF (598 KB)

Downloads

179

Abstract

Purpose

There is a need to predict whether the consumers liked the stay in the hotel rooms or not, and to remove the aspects the customers did not like. Many customers leave a review after staying in the hotel. These reviews are mostly given on the website used to book the hotel. These reviews can be considered as a valuable data, which can be analyzed to provide better services in the hotels. The purpose of this study is to use machine learning techniques for analyzing the given data to determine different sentiment polarities of the consumers.

Design/methodology/approach

Reviews given by hotel customers on the Tripadvisor website, which were made available publicly by Kaggle. Out of 10,000 reviews in the data, a sample of 3,000 negative polarity reviews (customers with bad experiences) in the hotel and 3,000 positive polarity reviews (customers with good experiences) in the hotel is taken to prepare data set. The two-stage feature selection was applied, which first involved greedy selection method and then wrapper method to generate 37 most relevant features. An improved stacked decision tree (ISD) classifier) is built, which is further compared with state-of-the-art machine learning algorithms. All the tests are done using R-Studio.

Findings

The results showed that the new model was satisfactory overall with 80.77% accuracy after doing in-depth study with 50–50 split, 80.74% accuracy for 66–34 split and 80.25% accuracy for 80–20 split, when predicting the nature of the customers’ experience in the hotel, i.e. whether they are positive or negative.

Research limitations/implications

The implication of this research is to provide a showcase of how we can predict the polarity of potentially popular reviews. This helps the authors’ perspective to help the hotel industries to take corrective measures for the betterment of business and to promote useful positive reviews. This study also has some limitations like only English reviews are considered. This study was restricted to the data from trip-adviser website; however, a new data may be generated to test the credibility of the model. Only aspect-based sentiment classification is considered in this study.

Originality/value

Stacking machine learning techniques have been proposed. At first, state-of-the-art classifiers are tested on the given data, and then, three best performing classifiers (decision tree C5.0, random forest and support vector machine) are taken to build stack and to create ISD classifier.

Details

Global Knowledge, Memory and Communication, vol. 72 no. 8/9

Type: Research Article

DOI:

ISSN: 2514-9342

Keywords

View access options

Article

Publication date: 21 December 2023

Evaluation of predicted fault tolerance based on C5.0 decision tree algorithm in irrigation system of paddy fields

Majid Rahi, Ali Ebrahimnejad and Homayun Motameni

Taking into consideration the current human need for agricultural produce such as rice that requires water for growth, the optimal consumption of this valuable liquid is…

HTML

PDF (9.4 MB)

Downloads

Abstract

Purpose

Taking into consideration the current human need for agricultural produce such as rice that requires water for growth, the optimal consumption of this valuable liquid is important. Unfortunately, the traditional use of water by humans for agricultural purposes contradicts the concept of optimal consumption. Therefore, designing and implementing a mechanized irrigation system is of the highest importance. This system includes hardware equipment such as liquid altimeter sensors, valves and pumps which have a failure phenomenon as an integral part, causing faults in the system. Naturally, these faults occur at probable time intervals, and the probability function with exponential distribution is used to simulate this interval. Thus, before the implementation of such high-cost systems, its evaluation is essential during the design phase.

Design/methodology/approach

The proposed approach included two main steps: offline and online. The offline phase included the simulation of the studied system (i.e. the irrigation system of paddy fields) and the acquisition of a data set for training machine learning algorithms such as decision trees to detect, locate (classification) and evaluate faults. In the online phase, C5.0 decision trees trained in the offline phase were used on a stream of data generated by the system.

Findings

The proposed approach is a comprehensive online component-oriented method, which is a combination of supervised machine learning methods to investigate system faults. Each of these methods is considered a component determined by the dimensions and complexity of the case study (to discover, classify and evaluate fault tolerance). These components are placed together in the form of a process framework so that the appropriate method for each component is obtained based on comparison with other machine learning methods. As a result, depending on the conditions under study, the most efficient method is selected in the components. Before the system implementation phase, its reliability is checked by evaluating the predicted faults (in the system design phase). Therefore, this approach avoids the construction of a high-risk system. Compared to existing methods, the proposed approach is more comprehensive and has greater flexibility.

Research limitations/implications

By expanding the dimensions of the problem, the model verification space grows exponentially using automata.

Originality/value

Unlike the existing methods that only examine one or two aspects of fault analysis such as fault detection, classification and fault-tolerance evaluation, this paper proposes a comprehensive process-oriented approach that investigates all three aspects of fault analysis concurrently.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print

Type: Research Article

DOI:

ISSN: 1756-378X

Keywords

View access options

Article

Publication date: 23 April 2024

Did disruptive events affect the purchase of private label food products?

Annarita Colamatteo, Marcello Sansone and Giuliano Iorio

This paper aims to examine the impact of the COVID-19 pandemic on the private label food products, specifically assessing the stability and changes in factors influencing…

HTML

PDF (835 KB)

Downloads

Abstract

Purpose

This paper aims to examine the impact of the COVID-19 pandemic on the private label food products, specifically assessing the stability and changes in factors influencing purchasing decisions, and comparing pre-pandemic and post-pandemic datasets.

Design/methodology/approach

The study employs the Extra Tree Classifier method, a robust quantitative approach, to analyse data collected from questionnaires distributed among two distinct consumer samples. This methodological choice is explicitly adopted to provide a clear classification of factors influencing consumer preferences for private label products, surpassing conventional qualitative methods.

Findings

Despite the profound disruptions caused by the COVID-19 pandemic, this research underscores the persistent hierarchy of factors shaping consumer choices in the private label food market, showing an overall stability in consumer behaviour. At the same time, the analysis of individual variables highlights the positive increase in those related to product quality, health, taste, and communication.

Research limitations/implications

The use of online surveys for data collection may introduce a self-selection bias, and the non-probabilistic sampling method could limit the generalizability of the results.

Practical implications

Practical implications suggest that managers in the private label industry should prioritize enhancing quality control, ensuring effective communication, and dynamically adapting strategies to meet evolving consumer preferences, with a particular emphasis on quality and health attributes.

Originality/value

This study contributes to the existing body of literature by providing insights into the profound transformations induced by the COVID-19 pandemic on consumer behaviour, specifically in relation to their preferences for private label food products.

Details

British Food Journal, vol. ahead-of-print no. ahead-of-print

Type: Research Article

DOI:

ISSN: 0007-070X

Keywords

View access options

Article

Publication date: 4 May 2023

Risk assessment in machine learning enhanced failure mode and effects analysis

Zeping Wang, Hengte Du, Liangyan Tao and Saad Ahmed Javed

The traditional failure mode and effect analysis (FMEA) has some limitations, such as the neglect of relevant historical data, subjective use of rating numbering and the less…

HTML

PDF (354 KB)

Downloads

177

Abstract

Purpose

The traditional failure mode and effect analysis (FMEA) has some limitations, such as the neglect of relevant historical data, subjective use of rating numbering and the less rationality and accuracy of the Risk Priority Number. The current study proposes a machine learning–enhanced FMEA (ML-FMEA) method based on a popular machine learning tool, Waikato environment for knowledge analysis (WEKA).

Design/methodology/approach

This work uses the collected FMEA historical data to predict the probability of component/product failure risk by machine learning based on different commonly used classifiers. To compare the correct classification rate of ML-FMEA based on different classifiers, the 10-fold cross-validation is employed. Moreover, the prediction error is estimated by repeated experiments with different random seeds under varying initialization settings. Finally, the case of the submersible pump in Bhattacharjee et al. (2020) is utilized to test the performance of the proposed method.

Findings

The results show that ML-FMEA, based on most of the commonly used classifiers, outperforms the Bhattacharjee model. For example, the ML-FMEA based on Random Committee improves the correct classification rate from 77.47 to 90.09 per cent and area under the curve of receiver operating characteristic curve (ROC) from 80.9 to 91.8 per cent, respectively.

Originality/value

The proposed method not only enables the decision-maker to use the historical failure data and predict the probability of the risk of failure but also may pave a new way for the application of machine learning techniques in FMEA.

Details

Data Technologies and Applications, vol. 58 no. 1

Type: Research Article

DOI:

ISSN: 2514-9288

Keywords

Open Access

Article

Publication date: 20 November 2023

Foreign direct investment and local interpretable model-agnostic explanations: a rational framework for FDI decision making

Devesh Singh

This study aims to examine foreign direct investment (FDI) factors and develops a rational framework for FDI inflow in Western European countries such as France, Germany, the…

HTML

PDF (1.8 MB)

Downloads

380

Abstract

Purpose

This study aims to examine foreign direct investment (FDI) factors and develops a rational framework for FDI inflow in Western European countries such as France, Germany, the Netherlands, Switzerland, Belgium and Austria.

Design/methodology/approach

Data for this study were collected from the World development indicators (WDI) database from 1995 to 2018. Factors such as economic growth, pollution, trade, domestic capital investment, gross value-added and the financial stability of the country that influence FDI decisions were selected through empirical literature. A framework was developed using interpretable machine learning (IML), decision trees and three-stage least squares simultaneous equation methods for FDI inflow in Western Europe.

Findings

The findings of this study show that there is a difference between the most important and trusted factors for FDI inflow. Additionally, this study shows that machine learning (ML) models can perform better than conventional linear regression models.

Research limitations/implications

This research has several limitations. Ideally, classification accuracies should be higher, and the current scope of this research is limited to examining the performance of FDI determinants within Western Europe.

Practical implications

Through this framework, the national government can understand how investors make their capital allocation decisions in their country. The framework developed in this study can help policymakers better understand the rationality of FDI inflows.

Originality/value

An IML framework has not been developed in prior studies to analyze FDI inflows. Additionally, the author demonstrates the applicability of the IML framework for estimating FDI inflows in Western Europe.

Details

Journal of Economics, Finance and Administrative Science, vol. 29 no. 57

Type: Research Article

DOI:

ISSN: 2077-1886

Keywords

View access options

Article

Publication date: 6 October 2023

A comprehensive analysis for classification and regression of surface points based on geodesics and machine learning algorithms

Vahide Bulut

Feature extraction from 3D datasets is a current problem. Machine learning is an important tool for classification of complex 3D datasets. Machine learning classification…

HTML

PDF (1.9 MB)

Downloads

Abstract

Purpose

Feature extraction from 3D datasets is a current problem. Machine learning is an important tool for classification of complex 3D datasets. Machine learning classification techniques are widely used in various fields, such as text classification, pattern recognition, medical disease analysis, etc. The aim of this study is to apply the most popular classification and regression methods to determine the best classification and regression method based on the geodesics.

Design/methodology/approach

The feature vector is determined by the unit normal vector and the unit principal vector at each point of the 3D surface along with the point coordinates themselves. Moreover, different examples are compared according to the classification methods in terms of accuracy and the regression algorithms in terms of R-squared value.

Findings

Several surface examples are analyzed for the feature vector using classification (31 methods) and regression (23 methods) machine learning algorithms. In addition, two ensemble methods XGBoost and LightGBM are used for classification and regression. Also, the scores for each surface example are compared.

Originality/value

To the best of the author’s knowledge, this is the first study to analyze datasets based on geodesics using machine learning algorithms for classification and regression.

Details

Engineering Computations, vol. 40 no. 9/10

Type: Research Article

DOI:

ISSN: 0264-4401

Keywords

View access options

Article

Publication date: 3 April 2024

Creditworthiness pattern prediction and detection for GCC Islamic banks using machine learning techniques

Samar Shilbayeh and Rihab Grassa

Bank creditworthiness refers to the evaluation of a bank’s ability to meet its financial obligations. It is an assessment of the bank’s financial health, stability and capacity to…

HTML

PDF (1.6 MB)

Downloads

Abstract

Purpose

Bank creditworthiness refers to the evaluation of a bank’s ability to meet its financial obligations. It is an assessment of the bank’s financial health, stability and capacity to manage risks. This paper aims to investigate the credit rating patterns that are crucial for assessing creditworthiness of the Islamic banks, thereby evaluating the stability of their industry.

Design/methodology/approach

Three distinct machine learning algorithms are exploited and evaluated for the desired objective. This research initially uses the decision tree machine learning algorithm as a base learner conducting an in-depth comparison with the ensemble decision tree and Random Forest. Subsequently, the Apriori algorithm is deployed to uncover the most significant attributes impacting a bank’s credit rating. To appraise the previously elucidated models, a ten-fold cross-validation method is applied. This method involves segmenting the data sets into ten folds, with nine used for training and one for testing alternatively ten times changeable. This approach aims to mitigate any potential biases that could arise during the learning and training phases. Following this process, the accuracy is assessed and depicted in a confusion matrix as outlined in the methodology section.

Findings

The findings of this investigation reveal that the Random Forest machine learning algorithm superperforms others, achieving an impressive 90.5% accuracy in predicting credit ratings. Notably, our research sheds light on the significance of the loan-to-deposit ratio as a primary attribute affecting credit rating predictions. Moreover, this study uncovers additional pivotal banking features that intensely impact the measurements under study. This paper’s findings provide evidence that the loan-to-deposit ratio looks to be the purest bank attribute that affects credit rating prediction. In addition, deposit-to-assets ratio and profit sharing investment account ratio criteria are found to be effective in credit rating prediction and the ownership structure criterion came to be viewed as one of the essential bank attributes in credit rating prediction.

Originality/value

These findings contribute significant evidence to the understanding of attributes that strongly influence credit rating predictions within the banking sector. This study uniquely contributes by uncovering patterns that have not been previously documented in the literature, broadening our understanding in this field.

Details

International Journal of Islamic and Middle Eastern Finance and Management, vol. 17 no. 2

Type: Research Article

DOI:

ISSN: 1753-8394

Keywords

Content available

Article

Publication date: 6 November 2023

Machine learning for sustainable development: leveraging technology for a greener future

Muneza Kagzi, Sayantan Khanra and Sanjoy Kumar Paul

From a technological determinist perspective, machine learning (ML) may significantly contribute towards sustainable development. The purpose of this study is to synthesize prior…

HTML

PDF (1.2 MB)

Downloads

262

Abstract

Purpose

From a technological determinist perspective, machine learning (ML) may significantly contribute towards sustainable development. The purpose of this study is to synthesize prior literature on the role of ML in promoting sustainability and to encourage future inquiries.

Design/methodology/approach

This study conducts a systematic review of 110 papers that demonstrate the utilization of ML in the context of sustainable development.

Findings

ML techniques may play a vital role in enabling sustainable development by leveraging data to uncover patterns and facilitate the prediction of various variables, thereby aiding in decision-making processes. Through the synthesis of findings from prior research, it is evident that ML may help in achieving many of the United Nations’ sustainable development goals.

Originality/value

This study represents one of the initial investigations that conducted a comprehensive examination of the literature concerning ML’s contribution to sustainability. The analysis revealed that the research domain is still in its early stages, indicating a need for further exploration.

Details

Journal of Systems and Information Technology, vol. 25 no. 4

Type: Research Article

DOI:

ISSN: 1328-7265

Keywords

View access options

Article

Publication date: 17 April 2024

Default prediction modeling (DPM) with machine learning algorithms: case of non-financial listed companies in Pakistan

Jahanzaib Alvi and Imtiaz Arif

The crux of this paper is to unveil efficient features and practical tools that can predict credit default.

HTML

PDF (700 KB)

Downloads

Abstract

Purpose

The crux of this paper is to unveil efficient features and practical tools that can predict credit default.

Design/methodology/approach

Annual data of non-financial listed companies were taken from 2000 to 2020, along with 71 financial ratios. The dataset was bifurcated into three panels with three default assumptions. Logistic regression (LR) and k-nearest neighbor (KNN) binary classification algorithms were used to estimate credit default in this research.

Findings

The study’s findings revealed that features used in Model 3 (Case 3) were the efficient and best features comparatively. Results also showcased that KNN exposed higher accuracy than LR, which proves the supremacy of KNN on LR.

Research limitations/implications

Using only two classifiers limits this research for a comprehensive comparison of results; this research was based on only financial data, which exhibits a sizeable room for including non-financial parameters in default estimation. Both limitations may be a direction for future research in this domain.

Originality/value

This study introduces efficient features and tools for credit default prediction using financial data, demonstrating KNN’s superior accuracy over LR and suggesting future research directions.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print

Type: Research Article

DOI:

ISSN: 0368-492X

Keywords

Open Access

Article

Publication date: 23 November 2023

Arabic stance detection of COVID-19 vaccination using transformer-based approaches: a comparison study

Reema Khaled AlRowais and Duaa Alsaeed

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of…

HTML

PDF (2 MB)

Downloads

229

Abstract

Purpose

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of data on the internet via platforms like social media sites. Stance detection system helps determine whether the author agree, against or has a neutral opinion with the given target. Most of the research in stance detection focuses on the English language, while few research was conducted on the Arabic language.

Design/methodology/approach

This paper aimed to address stance detection on Arabic tweets by building and comparing different stance detection models using four transformers, namely: Araelectra, MARBERT, AraBERT and Qarib. Using different weights for these transformers, the authors performed extensive experiments fine-tuning the task of stance detection Arabic tweets with the four different transformers.

Findings

The results showed that the AraBERT model learned better than the other three models with a 70% F1 score followed by the Qarib model with a 68% F1 score.

Research limitations/implications

A limitation of this study is the imbalanced dataset and the limited availability of annotated datasets of SD in Arabic.

Originality/value

Provide comprehensive overview of the current resources for stance detection in the literature, including datasets and machine learning methods used. Therefore, the authors examined the models to analyze and comprehend the obtained findings in order to make recommendations for the best performance models for the stance detection task.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print

Type: Research Article

DOI:

ISSN: 1985-9899

Keywords

Access

Year

Content type

1 – 10 of 139