Search results

Article
Publication date: 20 July 2023

Mu Shengdong, Liu Yunjie and Gu Jijian

Abstract

Purpose

By introducing the Stacking algorithm to address the underfitting caused by insufficient data in traditional machine learning, this paper provides a new solution to the cold start problem of entrepreneurial borrowing risk control.

Design/methodology/approach

The authors introduce semi-supervised learning and integrated (ensemble) learning into the field of migration learning (more commonly called transfer learning) and propose Stacking model migration learning. The approach trains models independently on entrepreneurial borrowing credit data, treats the migration strategy itself as the learning object and uses the Stacking algorithm to combine the prediction results of the source domain model and the target domain model.
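The combination step can be illustrated with a minimal sketch in which a logistic-regression meta-learner is fitted on the outputs of two base models. This is not the authors' implementation: the function names are hypothetical, the base-model probabilities are made-up toy numbers standing in for a source domain model and a target domain model, and in practice the meta-features would usually be out-of-fold predictions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(rows, labels, lr=0.5, epochs=5000):
    """Fit a logistic-regression meta-learner with full-batch gradient descent.

    rows   : list of tuples of base-model probabilities (the meta-features)
    labels : list of 0/1 outcomes (1 = default, in this toy setting)
    """
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def stacked_probability(weights, bias, base_probs):
    """Combine base-model probabilities into one stacked prediction."""
    return sigmoid(sum(w * p for w, p in zip(weights, base_probs)) + bias)


# Toy meta-features (hypothetical): (source-model prob, target-model prob).
META_ROWS = [(0.9, 0.8), (0.7, 0.9), (0.4, 0.7),
             (0.6, 0.2), (0.3, 0.1), (0.8, 0.3)]
META_LABELS = [1, 1, 1, 0, 0, 0]

WEIGHTS, BIAS = fit_meta_learner(META_ROWS, META_LABELS)
print(stacked_probability(WEIGHTS, BIAS, (0.5, 0.9)))
```

In this toy data the target-model output separates the classes better than the source-model output, so the meta-learner ends up giving it the larger weight; that is the sense in which the combination rule is learned rather than fixed by hand.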

Findings

The effectiveness of the two migration learning models is evaluated with real entrepreneurial borrowing data. Compared with the benchmark model without migration learning techniques, the Stacking-based model migration learning further improves algorithmic performance, with the model's area under the curve (AUC) value rising to 0.8. Comparing the two migration learning models reveals that the model-based migration learning approach performs better. The reason is that the sample-based migration learning approach only eliminates the noisy samples that are relatively less similar to the entrepreneurial borrowing data. However, the calculation and weighting of similarity are subjective, and there is no unified judgment standard or operating method, so there is no guarantee that the retained traditional credit samples have the same sample distribution and feature structure as the entrepreneurial borrowing data.
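For readers unfamiliar with the area under the curve metric reported here, the following minimal sketch computes it from its pairwise definition: the share of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels : list of 0/1 ground-truth values
    scores : list of predicted scores, higher meaning more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0      # positive ranked above negative
            elif p == q:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example (made-up numbers): 3 of the 4 positive/negative pairs are ordered correctly.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives the reported value of 0.8 its scale.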

Practical implications

From a practical standpoint, this paper provides a new solution to the cold start problem of entrepreneurial borrowing risk control. The small number of labeled high-quality samples cannot support the learning and deployment of big data risk control models, which is the cold start problem of the entrepreneurial borrowing risk control system. By extending the training sample set with auxiliary domain data through suitable migration learning methods, the prediction performance of the model can be improved to a certain extent and more generalized laws can be learned.

Originality/value

This paper introduces the migration learning approach to the entrepreneurial borrowing scenario, provides a new solution to the cold start problem of the entrepreneurial borrowing risk control system and verifies, with empirical data, the feasibility and effectiveness of applying migration learning in the risk control field.

Details

Management Decision, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0025-1747

Article
Publication date: 21 December 2023

Majid Rahi, Ali Ebrahimnejad and Homayun Motameni

Abstract

Purpose

Given the current human need for agricultural produce such as rice, which requires water for growth, optimal consumption of this valuable liquid is important. Unfortunately, the traditional use of water for agricultural purposes contradicts the concept of optimal consumption. Therefore, designing and implementing a mechanized irrigation system is of the highest importance. Such a system includes hardware such as liquid altimeter sensors, valves and pumps, for which failure is an inherent phenomenon that causes faults in the system. These faults occur at random time intervals, and an exponential probability distribution is used to simulate these intervals. Because such systems are costly, evaluating them during the design phase, before implementation, is essential.
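The idea of drawing fault intervals from an exponential distribution can be sketched as follows. This is an illustration only: the failure rate, time horizon and function name are assumptions chosen for the example, and the paper's actual simulation model is not described in this abstract.

```python
import random


def simulate_failure_times(rate, horizon, rng):
    """Return the times at which a component fails within [0, horizon).

    Gaps between consecutive failures are drawn from an exponential
    distribution with the given rate (mean gap = 1 / rate).
    """
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


# Hypothetical component: on average one fault every 10 time units.
rng = random.Random(42)
failure_times = simulate_failure_times(rate=0.1, horizon=1000.0, rng=rng)
print(len(failure_times), "simulated faults")
```

A fixed seed keeps the simulated fault stream reproducible, which matters when the same stream is later used to generate a training data set.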

Design/methodology/approach

The proposed approach included two main steps: offline and online. The offline phase included the simulation of the studied system (i.e. the irrigation system of paddy fields) and the acquisition of a data set for training machine learning algorithms such as decision trees to detect, locate (classification) and evaluate faults. In the online phase, C5.0 decision trees trained in the offline phase were used on a stream of data generated by the system.

Findings

The proposed approach is a comprehensive online component-oriented method, which is a combination of supervised machine learning methods to investigate system faults. Each of these methods is considered a component determined by the dimensions and complexity of the case study (to discover, classify and evaluate fault tolerance). These components are placed together in the form of a process framework so that the appropriate method for each component is obtained based on comparison with other machine learning methods. As a result, depending on the conditions under study, the most efficient method is selected in the components. Before the system implementation phase, its reliability is checked by evaluating the predicted faults (in the system design phase). Therefore, this approach avoids the construction of a high-risk system. Compared to existing methods, the proposed approach is more comprehensive and has greater flexibility.

Research limitations/implications

As the dimensions of the problem expand, the model verification space grows exponentially when automata are used.

Originality/value

Unlike the existing methods that only examine one or two aspects of fault analysis such as fault detection, classification and fault-tolerance evaluation, this paper proposes a comprehensive process-oriented approach that investigates all three aspects of fault analysis concurrently.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 9 January 2024

Ning Chen, Zhenyu Zhang and An Chen

Abstract

Purpose

Consequence prediction is an emerging topic in safety management concerning the severity outcome of accidents. In practical applications, it is usually implemented through supervised learning methods; however, the evaluation of classification results remains a challenge. The previous studies mostly adopted simplex evaluation based on empirical and quantitative assessment strategies. This paper aims to shed new light on the comprehensive evaluation and comparison of diverse classification methods through visualization, clustering and ranking techniques.

Design/methodology/approach

An empirical study is conducted using 9 state-of-the-art classification methods on a real-world data set of 653 construction accidents in China for predicting the consequence with respect to 39 carefully featured factors and accident type. The proposed comprehensive evaluation enriches the interpretation of classification results from different perspectives. Furthermore, the critical factors leading to severe construction accidents are identified by analyzing the coefficients of a logistic regression model.
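One common way to read logistic regression coefficients is to convert each one to an odds ratio, exp(beta), and rank the factors by it. The sketch below shows that reading under stated assumptions: the factor names and coefficient values are hypothetical placeholders, the predictors are assumed to be on comparable scales (for example binary indicators), and the abstract does not say that the authors ranked factors this way.

```python
import math


def rank_by_odds_ratio(coefficients):
    """Convert logistic-regression coefficients to odds ratios and sort them.

    coefficients : dict mapping factor name -> fitted coefficient (log-odds)
    Returns a list of (factor, odds_ratio) pairs, largest odds ratio first.
    """
    ranked = [(name, math.exp(beta)) for name, beta in coefficients.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical coefficients, not values from the study.
toy_coefficients = {"factor_a": 0.9, "factor_b": 0.0, "factor_c": -0.4}
for factor, odds_ratio in rank_by_odds_ratio(toy_coefficients):
    print(f"{factor}: odds ratio {odds_ratio:.2f}")
```

An odds ratio above 1 means the factor is associated with higher odds of a severe outcome, a value of 1 means no association and a value below 1 means lower odds.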

Findings

This paper identifies the critical factors that significantly influence the consequence of construction accidents, which include accident type (particularly collapse), improper accident reporting and handling (E21), inadequate supervision engineers (O41), no special safety department (O11), delayed or low-quality drawings (T11), unqualified contractor (C21), schedule pressure (C11), multi-level subcontracting (C22), lacking safety examination (S22), improper operation of mechanical equipment (R11) and improper construction procedure arrangement (T21). The prediction models and findings of critical factors help make safety intervention measures in a targeted way and enhance the experience of safety professionals in the construction industry.

Research limitations/implications

The empirical study using some well-known classification methods for forecasting the consequences of construction accidents provides some evidence for the comprehensive evaluation of multiple classifiers. These techniques can be used jointly with other evaluation approaches for a comprehensive understanding of the classification algorithms. Despite the limitation of specific methods used in the study, the presented methodology can be configured with other classification methods and performance metrics and even applied to other decision-making problems such as clustering.

Originality/value

This study sheds new light on the comprehensive comparison and evaluation of classification results through visualization, clustering and ranking techniques using an empirical study of consequence prediction of construction accidents. The relevance of construction accident type is discussed with the severity of accidents. The critical factors influencing the accident consequence are identified for the sake of taking prevention measures for risk reduction. The proposed method can be applied to other decision-making tasks where the evaluation is involved as an important component.

Details

Construction Innovation, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1471-4175

Article
Publication date: 10 April 2024

Aslıhan Dursun-Cengizci and Meltem Caber

Abstract

Purpose

This study aims to predict customer churn in resort hotels by calculating the churn probability of repeat customers for future stays in the same hotel brand.

Design/methodology/approach

Based on the recency, frequency, monetary (RFM) paradigm, random forest and logistic regression supervised machine learning algorithms were used to predict churn behavior. The model with superior performance was used to detect potential churners and generate a priority matrix.
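The RFM paradigm reduces each customer's history to three numbers: how recently they stayed, how often they stayed and how much they spent. A minimal sketch of that feature extraction follows; the record layout, guest identifiers, dates and amounts are invented for illustration and are not taken from the study's data.

```python
from datetime import date


def rfm_features(stays, as_of):
    """Compute recency, frequency and monetary features per guest.

    stays : list of (guest_id, checkout_date, amount_spent) tuples
    as_of : reference date from which recency is measured
    """
    summary = {}
    for guest_id, checkout, amount in stays:
        entry = summary.setdefault(
            guest_id, {"last": checkout, "frequency": 0, "monetary": 0.0}
        )
        entry["last"] = max(entry["last"], checkout)   # most recent stay
        entry["frequency"] += 1                        # number of stays
        entry["monetary"] += amount                    # total spend
    return {
        guest_id: {
            "recency_days": (as_of - entry["last"]).days,
            "frequency": entry["frequency"],
            "monetary": entry["monetary"],
        }
        for guest_id, entry in summary.items()
    }


# Made-up booking records for two hypothetical guests.
TOY_STAYS = [
    ("g1", date(2022, 7, 10), 1200.0),
    ("g1", date(2023, 7, 5), 1500.0),
    ("g2", date(2021, 8, 20), 800.0),
]
RFM = rfm_features(TOY_STAYS, as_of=date(2023, 12, 31))
print(RFM["g1"])
```

Features of this kind would then be passed, together with any sector-specific variables, to a classifier such as random forest or logistic regression.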

Findings

The random forest algorithm showed a higher prediction performance with an 80% accuracy rate. The most important variables were RFM-based, followed by hotel sector-specific variables such as market, season, accompaniers and booker. Some managerial strategies were proposed to retain future churners, clustered as “hesitant,” “economy,” “alternative seeker,” and “opportunity chaser” customer groups.

Research limitations/implications

This study contributes to the theoretical understanding of customer behavior in the hospitality industry and provides valuable insight for hotel practitioners by demonstrating the methods that facilitate the identification of potential churners and their characteristics.

Originality/value

Most customer retention studies in hospitality either concentrate on the antecedents of retention or customers’ revisit intentions using traditional methods. Taking a unique place within the literature, this study conducts churn prediction analysis for repeat hotel customers by opening a new area for inquiry in hospitality studies.

Details

International Journal of Contemporary Hospitality Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0959-6119

Article
Publication date: 12 September 2023

Sumei Yao, Fan Wang, Jing Chen and Quan Lu

Abstract

Purpose

Social media texts as a data source in depression research have emerged as a significant convergence between Information Management and Public Health in recent years. This paper aims to review depression-related studies conducted on social media texts, with particular attention to their research themes and methods.

Design/methodology/approach

The authors selected 57 research articles published in the Web of Science, Wiley, ACM Digital Library, EBSCO, IEEE Xplore and JMIR databases.

Findings

(1) According to the coding results, Depression Prediction and Linguistic Characteristics and Information Behavior are the two most popular themes. The theme of Patient Needs has progressed over the past few years. Still, there is a lesser focus on Stigma and Antidepressants. (2) Researchers prefer quantitative methods such as machine learning and statistical analysis to qualitative ones. (4) According to the analysis of the data collection platforms, more researchers used comprehensive social media sites like Reddit and Facebook than depression-specific communities like Sunforum and Alonelylife.

Practical implications

The authors recommend employing machine learning and statistical analysis to explore factors related to Stigmatization and Antidepressants thoroughly. Additionally, conducting mixed-methods studies incorporating data from diverse sources would be valuable. Such approaches would provide insights beneficial to policymakers and pharmaceutical companies seeking a comprehensive understanding of depression.

Originality/value

This article signifies a pioneering effort in systematically gathering and examining the themes and methodologies within the intersection of health-related texts on social media and depression.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831

Open Access
Article
Publication date: 23 November 2023

Reema Khaled AlRowais and Duaa Alsaeed

Abstract

Purpose

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of data on the internet via platforms like social media sites. A stance detection system helps determine whether the author agrees with, is against or holds a neutral opinion on a given target. Most research in stance detection focuses on the English language, while little research has been conducted on the Arabic language.

Design/methodology/approach

This paper aims to address stance detection on Arabic tweets by building and comparing different stance detection models using four transformers, namely Araelectra, MARBERT, AraBERT and Qarib. Using different weights for these transformers, the authors performed extensive experiments, fine-tuning each of the four transformers for the task of stance detection on Arabic tweets.

Findings

The results showed that the AraBERT model learned better than the other three models with a 70% F1 score followed by the Qarib model with a 68% F1 score.
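Since the models are compared by F1 score, a short sketch of how that score is computed may help. The abstract does not state which averaging scheme was used, so the macro average below is an assumption, and the stance labels and predictions are toy values rather than results from the paper.

```python
def f1_per_class(y_true, y_pred):
    """Return a dict mapping each class label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy stance labels (illustrative only).
y_true = ["favor", "favor", "against", "none"]
y_pred = ["favor", "against", "against", "none"]
print(f1_per_class(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 4))
```

Macro averaging weights every class equally, which is one reason it is often preferred when a dataset is imbalanced.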

Research limitations/implications

Limitations of this study are the imbalanced dataset and the limited availability of annotated datasets for stance detection (SD) in Arabic.

Originality/value

This study provides a comprehensive overview of the current resources for stance detection in the literature, including the datasets and machine learning methods used. The authors also examined the models to analyze and comprehend the obtained findings in order to recommend the best-performing models for the stance detection task.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1985-9899

Article
Publication date: 18 August 2023

Gaurav Sarin, Pradeep Kumar and M. Mukund

Abstract

Purpose

Text classification is a widely accepted and adopted technique in organizations to mine and analyze unstructured and semi-structured data. With the advancement of computing technology, deep learning has become more popular among academics and professionals for mining and analytical operations. In this work, the authors study the research carried out in the field of text classification using deep learning techniques to identify gaps and opportunities for further research.

Design/methodology/approach

The authors adopted a bibliometric approach in conjunction with visualization techniques to uncover new insights and findings. The authors collected two decades of data from the Scopus database to perform this study. The authors also discuss business applications of deep learning techniques for text classification.

Findings

The study provides an overview of the publication sources covering text classification and deep learning together. It also presents a list of prominent authors working in this field and their countries, as well as a list of the most cited articles by citation count and country of research. Visualization techniques such as word clouds, network diagrams and thematic maps were used to identify the collaboration network.

Originality/value

The study helps to identify research gaps, which is an original contribution to the body of literature. To the best of the authors' knowledge, no in-depth study of text classification and deep learning together has previously been performed. The study provides high value to scholars and professionals by pointing them to research opportunities in this area.

Details

Benchmarking: An International Journal, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1463-5771

Article
Publication date: 8 August 2023

Smita Abhijit Ganjare, Sunil M. Satao and Vaibhav Narwane

Abstract

Purpose

In today's fast-developing era, the volume of data is increasing day by day. Traditional methods struggle to manage this huge amount of data efficiently. The adoption of machine learning techniques helps in managing data efficiently and in drawing relevant patterns from it. The main aim of this research paper is to provide brief information about the proposed adoption of machine learning techniques in different sectors of the manufacturing supply chain.

Design/methodology/approach

This research paper presents a rigorous systematic literature review of the adoption of machine learning techniques in the manufacturing supply chain from 2015 to 2023. Out of 511 papers, 74 were shortlisted for detailed analysis.

Findings

The papers are subcategorised into 8 sections, which helps in scrutinizing the work done in the manufacturing supply chain. This paper helps in identifying the contribution of machine learning techniques in the manufacturing field, mostly in the automotive sector.

Practical implications

The research is limited to papers published from 2015 to 2023. A limitation of the current research is that book chapters, unpublished work, white papers and conference papers are not considered. Only English-language articles and review papers are studied in brief. This study helps in the adoption of machine learning techniques in the manufacturing supply chain.

Originality/value

This study is one of the few that investigate machine learning techniques in the manufacturing sector and supply chain through a systematic literature survey.

Highlights

  1. A comprehensive understanding of Machine Learning techniques is presented.

  2. The state of the art of adoption of Machine Learning techniques is investigated.

  3. The methodology of systematic literature review (SLR) is proposed.

  4. An innovative study of Machine Learning techniques in manufacturing supply chain.

Details

The TQM Journal, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1754-2731

Article
Publication date: 21 November 2023

Armin Mahmoodi, Leila Hashemi and Milad Jasemi

Abstract

Purpose

In this study, the central objective is to foresee stock market signals with the use of a proper structure to achieve the highest accuracy possible. For this purpose, three hybrid models have been developed for the stock markets, each combining a support vector machine (SVM) with a meta-heuristic algorithm: particle swarm optimization (PSO), the imperialist competition algorithm (ICA) or a genetic algorithm (GA). All the analyses are technical and are based on the Japanese candlestick model.

Design/methodology/approach

Based on the results achieved, the most suitable algorithm is chosen to anticipate sell and buy signals. The authors also compare the validation results of the models designed in this study with the basic models from three articles published in past years. In the first model, SVM is examined with PSO. It is used as a classification agent to search the problem-solving space precisely and at a faster pace. In the second model, SVM and ICA are tested for stock market timing, with ICA used as an optimization agent for the SVM parameters. Finally, in the third model, SVM and GA are studied, where GA acts as an optimizer and feature selection agent.

Findings

As per the results, it is observed that all new models can predict accurately for only 6 days; however, in comparison with the confusion matrix results, it is observed that the SVM-GA and SVM-ICA models have correctly predicted more sell signals, and the SVM-PSO model has correctly predicted more buy signals. However, SVM-ICA has shown better performance than other models considering executing the implemented models.

Research limitations/implications

In this study, the data for stock market of the years 2013–2021 were analyzed; the long length of timeframe makes the input data analysis challenging as they must be moderated with respect to the conditions where they have been changed.

Originality/value

In this study, two methods have been developed in a candlestick model; they are raw-based and signal-based approaches in which the hit rate is determined by the percentage of correct evaluations of the stock market for a 16-day period.
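The hit rate described here, the percentage of correct evaluations over an evaluation window, can be written down in a few lines. The sketch below assumes one buy or sell signal per trading day and uses made-up signals; the function name and the data are illustrative, not the authors' code or results.

```python
def hit_rate(predicted, actual):
    """Percentage of days on which the predicted signal matched the actual one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equally long")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * hits / len(actual)


# Toy four-day window (hypothetical signals): three of four days are correct.
predicted = ["buy", "sell", "sell", "buy"]
actual = ["buy", "sell", "buy", "buy"]
print(hit_rate(predicted, actual))
```

The same comparison, split by signal type, yields the buy and sell counts that a confusion matrix reports.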

Details

EuroMed Journal of Business, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1450-2194

Article
Publication date: 11 July 2023

Abhinandan Chatterjee, Pradip Bala, Shruti Gedam, Sanchita Paul and Nishant Goyal

Abstract

Purpose

Depression is a mental health problem characterized by a persistent sense of sadness and loss of interest. EEG signals are regarded as the most appropriate instruments for diagnosing depression because they reflect the operating status of the human brain. The purpose of this study is the early detection of depression among people using EEG signals.

Design/methodology/approach

(i) Artifacts are removed by filtering, and linear and non-linear features are extracted; (ii) feature scaling is done using a standard scaler, while principal component analysis (PCA) is used for feature reduction; (iii) the linear features, the non-linear features and a combination of both (only for those whose accuracy is highest) are taken for further analysis, where ML and DL classifiers are applied for the classification of depression; and (iv) a total of 15 distinct ML and DL methods that have been effectively utilized as classifiers to handle a variety of real-world issues are applied, including KNN, SVM, bagging SVM, RF, GB, Extreme Gradient Boosting, MNB, Adaboost, Bagging RF, BootAgg, Gaussian NB, RNN, 1DCNN, RBFNN and LSTM.
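The standard scaling step in (ii) rescales each feature to zero mean and unit variance before feature reduction. A minimal standard-library sketch for a single feature column follows; it assumes the population standard deviation is used, and the function name and toy values are illustrative rather than taken from the study.

```python
import statistics


def standard_scale(column):
    """Rescale one feature column to zero mean and unit variance (z-scores)."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)   # population standard deviation
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Toy feature column (made-up values).
scaled = standard_scale([1.0, 2.0, 3.0])
print(scaled)
```

Scaling before PCA matters because PCA is driven by variance, so an unscaled feature with a large numeric range would otherwise dominate the leading components.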

Findings

1. Among all features, alpha, alpha asymmetry, gamma and gamma asymmetry give the best results among linear features, while RWE, DFA, CD and AE give the best results among non-linear features. 2. For linear features, gamma and alpha asymmetry gave 99.98% accuracy with Bagging RF, while gamma asymmetry gave 99.98% accuracy with BootAgg. 3. For non-linear features, accuracy reached 99.84% for RWE and DFA with RF, 99.97% for DFA with XGBoost and 99.94% for RWE with BootAgg. 4. With DL, for linear features, gamma asymmetry gave more than 96% accuracy with RNN and 91% accuracy with LSTM; for non-linear features, 89% accuracy was achieved for CD and AE with LSTM. 5. By combining linear and non-linear features, the highest accuracy was achieved with Bagging RF (98.50%) for gamma asymmetry + RWE. With DL, alpha + RWE, gamma asymmetry + CD and gamma asymmetry + RWE achieved 98% accuracy with LSTM.

Originality/value

A novel dataset was collected from the Central Institute of Psychiatry (CIP), Ranchi, recorded using 128 channels, whereas most previous studies used fewer channels; the details of the study participants are summarized and a model is developed for statistical analysis using N-way ANOVA; artifacts are removed by high and low pass filtering of epoch data followed by re-referencing and independent component analysis for noise removal; linear features, namely band power and interhemispheric asymmetry, and non-linear features, namely relative wavelet energy, wavelet entropy, approximate entropy, sample entropy, detrended fluctuation analysis and correlation dimension, are extracted; this model utilizes Epoch (213,072) for 5 s EEG data, which allows the model to train for longer, thereby increasing the efficiency of classifiers. Feature scaling is done using a standard scaler rather than normalization because it helps increase the accuracy of the models (especially for deep learning algorithms), while PCA is used for feature reduction; the linear features, the non-linear features and a combination of both are taken for extensive analysis in conjunction with ML and DL classifiers for the classification of depression. The combination of linear and non-linear features (only for those whose accuracy is highest) is used for the best detection results.
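Two of the non-linear features named above have compact textbook definitions: relative wavelet energy is each decomposition level's share of the total signal energy, and wavelet entropy is the Shannon entropy of those shares. The sketch below assumes the wavelet coefficients have already been computed and grouped by level; the function names and toy coefficients are illustrative, and the study's exact feature pipeline is not specified in this abstract.

```python
import math


def relative_wavelet_energy(coeffs_by_level):
    """Share of total energy held by each wavelet decomposition level.

    coeffs_by_level : list of lists, one list of coefficients per level
    """
    energies = [sum(c * c for c in level) for level in coeffs_by_level]
    total = sum(energies)
    if total == 0:
        raise ValueError("signal has zero energy")
    return [e / total for e in energies]


def wavelet_entropy(rwe):
    """Shannon entropy (natural log) of the relative wavelet energies."""
    return -sum(p * math.log(p) for p in rwe if p > 0)


# Toy coefficients for two levels (made-up values): energies 2 and 4.
rwe = relative_wavelet_energy([[1.0, 1.0], [2.0]])
print(rwe, wavelet_entropy(rwe))
```

Entropy is lowest when the energy is concentrated in a single level and highest when it is spread evenly across levels.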

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806
