Search results

1 – 10 of 10
Open Access
Article
Publication date: 10 June 2024

Lua Thi Trinh

Abstract

Purpose

The purpose of this paper is to compare nine different models to evaluate consumer credit risk, which are the following: Logistic Regression (LR), Naive Bayes (NB), Linear Discriminant Analysis (LDA), k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), Classification and Regression Tree (CART), Artificial Neural Network (ANN), Random Forest (RF) and Gradient Boosting Decision Tree (GBDT) in Peer-to-Peer (P2P) Lending.

Design/methodology/approach

The author uses data from P2P Lending Club (LC) to assess the efficiency of a variety of classification models across different economic scenarios and to compare the ranking results of credit risk models in P2P lending through three families of evaluation metrics.

Findings

The results from this research indicate that the risk classification models in the 2013–2019 economic period show greater measurement efficiency than in the difficult 2007–2012 period. In addition, the ranking of models for predicting default risk shows that GBDT is the best model on most of the metrics or metric families included in the study. The findings also support the results of Tsai et al. (2014) and Teplý and Polena (2019) that LR, ANN and LDA models classify loan applications quite stably and accurately, while CART, k-NN and NB show the worst performance when predicting borrower default risk on P2P loan data.

Originality/value

The main contribution of the research to the empirical literature is a comparison of nine prediction models of consumer loan application risk, built with statistical and machine learning algorithms and evaluated on three separate families of performance metrics (threshold, ranking and probabilistic metrics). These metrics are consistent with the data characteristics of the LC lending platform, and the comparison spans two periods that reflect the prevailing economic situation and the platform's development.
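As a rough illustration of the three metric families named above, the sketch below computes one example metric from each. The choice of accuracy, AUC and the Brier score as representatives is an assumption, as are the labels and scores; the abstract does not list the specific metrics used.

```python
# Illustrative sketch only: one example metric per family.
# The metric choices and the toy data are assumptions, not taken from the paper.

def accuracy(y_true, y_prob, threshold=0.5):
    """Threshold metric: share of correct labels after cutting scores at `threshold`."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, y_prob):
    """Ranking metric: chance that a random positive is scored above a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Probabilistic metric: mean squared gap between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

y_true = [1, 0, 1, 0, 0]             # hypothetical: 1 = default, 0 = repaid
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1]   # hypothetical predicted default probabilities

print(accuracy(y_true, y_prob), auc(y_true, y_prob), brier_score(y_true, y_prob))
```

A model can rank differently across the three families, which is one reason a comparison may report all of them rather than a single score.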

Details

Journal of Economics, Finance and Administrative Science, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2077-1886

Open Access
Article
Publication date: 27 November 2023

Reshmy Krishnan, Shantha Kumari, Ali Al Badi, Shermina Jeba and Menila James

Abstract

Purpose

Students pursuing different professional courses at the higher education level during 2021–2022 saw the first-time occurrence of a pandemic in the form of coronavirus disease 2019 (COVID-19), and their mental health was affected. Many works are available in the literature to assess mental health severity. However, it is necessary to identify the affected students early for effective treatment.

Design/methodology/approach

Predictive analytics, a part of machine learning (ML), helps with early identification based on mental health severity levels to aid clinical psychologists. As a case study, engineering and medical course students were comparatively analysed in this work as they have rich course content and a stricter evaluation process than other streams. The methodology includes an online survey that obtains demographic details, academic qualifications, family details, etc. and anxiety and depression questions using the Hospital Anxiety and Depression Scale (HADS). The responses acquired through social media networks are analysed using ML algorithms – support vector machines (SVMs) (robust handling of health information) and J48 decision tree (DT) (interpretability/comprehensibility). Also, random forest is used to identify the predictors for anxiety and depression.

Findings

The results show that the support vector classifier performs best, with a classification accuracy of 100%, precision of 1.0 and recall of 1.0, followed by the J48 DT classifier at 96%. Medical students were found to be affected by anxiety and depression marginally more than engineering students.

Research limitations/implications

The entire work depends on an online questionnaire distributed through social media, and the participants were not met in person. As a result, the response rate could not be evaluated appropriately. Because of the medical restrictions imposed by COVID-19, which remain in effect in 2022, this was the only method found to collect primary data from college students. Additionally, students self-selected to participate in the survey, which raises the possibility of selection bias.

Practical implications

The responses acquired through social media networks are analysed using ML algorithms. This will be a big support for understanding the mental health issues of students due to COVID-19 and for taking appropriate action to address them. This will improve the quality of the learning process in higher education in Oman.

Social implications

Furthermore, this study aims to provide recommendations for mental health screening as a regular practice in educational institutions to identify undetected students.

Originality/value

The novelty of this work lies in comparing the mental health issues of students in two professional courses. This is needed because both courses require practical learning, long hours of work, etc.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1985-9899

Open Access
Article
Publication date: 9 May 2022

Khalid Iqbal and Muhammad Shehrayar Khan

Abstract

Purpose

In this digital era, email is the most pervasive form of communication between people. Many users become a victim of spam emails and their data have been exposed.

Design/methodology/approach

Researchers contribute to solving this problem by focusing on advanced machine learning algorithms and improved models for detecting spam emails, but there is still a gap in features. Features also play an important role in achieving good results. To evaluate the performance of the applied classifiers, 10-fold cross-validation is used.
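For readers unfamiliar with k-fold cross-validation, the following is a minimal sketch of how the index split works. The helper name `k_fold_indices` is hypothetical, and the split is unshuffled for simplicity; the abstract does not say how the folds were drawn.

```python
# Minimal sketch of a k-fold split, assuming contiguous, unshuffled folds.
# `k_fold_indices` is a hypothetical helper, not the authors' code.

def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of the k folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(k_fold_indices(25, k=10))
for train, test in folds[:2]:
    print(len(train), test)
```

Each sample is held out exactly once, so the accuracy averaged over the ten test folds uses every sample for evaluation without ever testing on data seen in training.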

Findings

The results confirm that spam emails are correctly classified with an accuracy of 98.00% for the Support Vector Machine and 98.06% for the Artificial Neural Network, as compared with the other applied machine learning classifiers.

Originality/value

In this paper, point-biserial correlation between each feature and the class label of the University of California Irvine (UCI) spambase email dataset is used to select the best features. Extensive experiments are conducted on the selected features by training different classifiers.
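The feature-scoring step described above can be sketched as follows. This assumes features are kept when the absolute correlation passes a cut-off; the cut-off value, the helper names and the toy columns are all hypothetical, since the abstract does not state the selection rule.

```python
import math

# Sketch of point-biserial feature scoring against a binary class label.
# The threshold rule, helper names and toy data are assumptions for illustration.

def point_biserial(feature, labels):
    """Point-biserial correlation between a numeric feature and a 0/1 label."""
    n = len(feature)
    ones = [x for x, y in zip(feature, labels) if y == 1]
    zeros = [x for x, y in zip(feature, labels) if y == 0]
    mean_all = sum(feature) / n
    # Population standard deviation (divide by n), matching the n1*n0/n^2 factor.
    std_all = math.sqrt(sum((x - mean_all) ** 2 for x in feature) / n)
    if std_all == 0 or not ones or not zeros:
        return 0.0  # correlation is undefined; treat the feature as uninformative
    mean1 = sum(ones) / len(ones)
    mean0 = sum(zeros) / len(zeros)
    return (mean1 - mean0) / std_all * math.sqrt(len(ones) * len(zeros) / n ** 2)

def select_features(columns, labels, min_abs_r=0.3):
    """Keep feature names whose |r| reaches the (assumed) cut-off."""
    return [name for name, values in columns.items()
            if abs(point_biserial(values, labels)) >= min_abs_r]

labels = [1, 1, 1, 0, 0, 0]  # hypothetical: 1 = spam, 0 = not spam
columns = {
    "freq_free": [0.9, 0.8, 0.7, 0.1, 0.0, 0.2],  # hypothetical feature
    "freq_the": [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],   # hypothetical feature
}
print(select_features(columns, labels))
```

The point-biserial coefficient is the Pearson correlation for the special case where one variable is binary, so it can be computed directly from the two class means.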

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 8 May 2024

Behzad Maleki Vishkaei and Pietro De Giovanni

Abstract

Purpose

This paper aims to use Bayesian network (BN) methodology complemented by machine learning (ML) and what-if analysis to investigate the impact of digital technologies (DT) on logistics service quality (LSQ), employing the service quality (SERVQUAL) framework.

Design/methodology/approach

Using a sample of 244 Italian firms, this study estimates the probability distributions associated with both DT and SERVQUAL logistics, as well as their interrelationships. Additionally, the BN technique enables the application of ML techniques to uncover hidden relationships, as well as a series of what-if analyses to extract more knowledge.

Findings

The results show that the average probability of firms investing in DT for analytics (DTA) is higher than that of investing in DT for immersive experiences (DTIE). Furthermore, adopting both offers only a moderate likelihood of successfully implementing SERVQUAL logistics. Additionally, certain technologies may not directly influence some SERVQUAL dimensions. The application of ML reveals hidden relationships among technologies, enhancing the predictions of SERVQUAL logistics. Finally, what-if analyses provide further insights to guide decision-making processes aimed at enhancing SERVQUAL logistics dimensions through DTA and DTIE.

Originality/value

This research delves into the influence of DTIE and DTA on SERVQUAL logistics, thereby filling a gap in the existing literature in which no study has explored the intricate relationships between these technologies and SERVQUAL dimensions. Methodologically, we pioneer the integration of BN with ML techniques and what-if analysis, thus exploring innovative techniques to be used in logistics and supply-chain studies.

Details

International Journal of Physical Distribution & Logistics Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0960-0035

Open Access
Article
Publication date: 15 August 2024

Jing Zou, Martin Odening and Ostap Okhrin

Abstract

Purpose

This paper aims to improve the delimitation of plant growth stages in the context of weather index insurance design. We propose a data-driven phase division that minimizes estimation errors in the weather-yield relationship and investigate whether it can substitute an expert-based determination of plant growth phases. We combine this procedure with various statistical and machine learning estimation methods and compare their performance.

Design/methodology/approach

Using the example of winter barley, we divide the complete growth cycle into four sub-phases based on phenology reports and expert instructions and evaluate all combinations of start and end points of the various growth stages by the estimation errors of the respective yield models. Some of the most commonly used statistical and machine learning methods are employed to model the weather-yield relationship.

Findings

Our results confirm that the fit of crop-yield models can be improved by disaggregation of the vegetation period. Moreover, we find that the data-driven approach leads to similar division points as the expert-based approach. Among the estimation methods, Support Vector Machine ranks first and Polynomial Regression last in terms of yield prediction accuracy; however, the differences in performance across methods are minor.

Originality/value

This research addresses the challenge of separating plant growth stages when phenology information is unavailable. Moreover, it evaluates the performance of statistical and machine learning methods in the context of crop yield prediction. The suggested phase-division in conjunction with advanced statistical methods offers promising avenues for improving weather index insurance design.

Details

Agricultural Finance Review, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0002-1466

Open Access
Article
Publication date: 14 December 2021

Mariam Elhussein and Samiha Brahimi

Abstract

Purpose

This paper aims to propose a novel way of using textual clustering as a feature selection method. It is applied to identify the most important keywords in the profile classification. The method is demonstrated through the problem of sick-leave promoters on Twitter.

Design/methodology/approach

Four machine learning classifiers were used on a total of 35,578 tweets posted on Twitter. The data were manually labeled into two categories: promoter and nonpromoter. Classification performance was compared when the proposed clustering feature selection approach and the standard feature selection were applied.

Findings

Random forest achieved the highest accuracy, 95.91%, which is higher than that reported in comparable work. Furthermore, using clustering as a feature selection method improved the sensitivity of the model from 73.83% to 98.79%. Sensitivity (recall) is the most important measure of classifier performance when detecting promoters’ accounts that have spam-like behavior.
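For reference, sensitivity (recall) is the share of actual positives that the classifier catches. A minimal sketch with made-up labels, where 1 is assumed to mark a promoter account:

```python
# Minimal sketch of sensitivity (recall); the labels below are made up.

def sensitivity(y_true, y_pred):
    """True positives divided by all actual positives (TP / (TP + FN))."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0

y_true = [1, 1, 1, 1, 0, 0]  # hypothetical: 1 = promoter, 0 = nonpromoter
y_pred = [1, 1, 1, 0, 0, 1]  # hypothetical classifier output
print(sensitivity(y_true, y_pred))
```

A missed promoter counts as a false negative and lowers sensitivity, whereas a nonpromoter wrongly flagged does not affect it, which is why the measure suits detection tasks where misses are the costlier error.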

Research limitations/implications

The method applied is novel; more testing on other datasets is needed before its results can be generalized.

Practical implications

The model applied can be used by Saudi authorities to report on the accounts that sell sick-leaves online.

Originality/value

The research proposes a new way of using textual clustering in feature selection.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 13 August 2021

Habeeb Balogun, Hafiz Alaka and Christian Nnaemeka Egwim

Abstract

Purpose

This paper seeks to assess the performance of BA-GS-LSSVM compared with popular standalone algorithms used to build NO2 prediction models. The purpose of this paper is to pre-process a relatively large NO2 data set from Internet of Things (IoT) sensors, together with time-corresponding weather and traffic data, and to use the data to develop NO2 prediction models using BA-GS-LSSVM and popular standalone algorithms to allow for a fair comparison.

Design/methodology/approach

This research installed and used data from 14 IoT emission sensors to develop machine learning predictive models for NO2 pollution concentration. The authors used big data analytics infrastructure to retrieve the large volume of data collected in tens of seconds for over 5 months. Weather data from the UK meteorology department and traffic data from the department for transport were collected and merged for the corresponding time and location where the pollution sensors exist.

Findings

The results show that the hybrid BA-GS-LSSVM outperforms all the standalone machine learning predictive models for NO2 pollution.

Practical implications

This paper's hybrid model provides a basis for informed decision-making on NO2 pollutant avoidance systems.

Originality/value

This research installed and used data from 14 IoT emission sensors to develop machine learning predictive models for NO2 pollution concentration.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 17 October 2023

Abdelhadi Ifleh and Mounime El Kabbouri

Abstract

Purpose

The prediction of stock market (SM) indices is a fascinating task. An in-depth analysis in this field can provide valuable information to investors, traders and policy makers in attractive SMs. This article aims to apply a correlation feature selection model to identify important technical indicators (TIs), which are combined with multiple deep learning (DL) algorithms for forecasting SM indices.

Design/methodology/approach

The methodology involves using a correlation feature selection model to select the most relevant features. These features are then used to predict the fluctuations of six markets using various DL algorithms, and the results are compared with predictions made using all features by using a range of performance measures.

Findings

The experimental results show that the combination of TIs selected through correlation and Artificial Neural Network (ANN) provides good results in the MADEX market. The combination of selected indicators and Convolutional Neural Network (CNN) in the NASDAQ 100 market outperforms all other combinations of variables and models. In other markets, the combination of all variables with ANN provides the best results.

Originality/value

This article makes several significant contributions, including the use of a correlation feature selection model to select pertinent variables, comparison between multiple DL algorithms (ANN, CNN and Long Short-Term Memory (LSTM)), combining selected variables with algorithms to improve predictions, evaluation of the suggested model on six datasets (MASI, MADEX, FTSE 100, SP500, NASDAQ 100 and EGX 30) and application of various performance measures (Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Squared Logarithmic Error (MSLE) and Root Mean Squared Logarithmic Error (RMSLE)).
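The five performance measures listed above can be written out as in the sketch below. It assumes the common convention of taking log(1 + value) in the logarithmic errors; the abstract does not state which variant the authors used, and the index values are made up.

```python
import math

# Sketch of the five error measures under the common log(1 + x) convention
# for MSLE/RMSLE (an assumption); the toy values are made up.

def mae(y, y_hat):
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def mse(y, y_hat):
    return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    return math.sqrt(mse(y, y_hat))

def msle(y, y_hat):
    # log1p(x) = log(1 + x); inputs must be greater than -1.
    return sum((math.log1p(a) - math.log1p(b)) ** 2 for a, b in zip(y, y_hat)) / len(y)

def rmsle(y, y_hat):
    return math.sqrt(msle(y, y_hat))

actual = [100.0, 110.0, 120.0]     # hypothetical index levels
predicted = [102.0, 108.0, 123.0]  # hypothetical forecasts

for metric in (mae, mse, rmse, msle, rmsle):
    print(metric.__name__, metric(actual, predicted))
```

MAE, MSE and RMSE are expressed in the units of the index (or their square), while the logarithmic variants measure relative error, which makes them easier to compare across indices quoted at very different levels.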

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1985-9899

Open Access
Article
Publication date: 4 March 2022

Modeste Meliho, Abdellatif Khattabi, Zejli Driss and Collins Ashianga Orlando

Abstract

Purpose

The purpose of the paper is to map areas vulnerable to flooding in the Ourika watershed in the High Atlas of Morocco using predictive modeling, with the aim of providing a useful tool to help mitigate and manage floods in the region, as well as in Morocco as a whole.

Design/methodology/approach

Four machine learning (ML) algorithms, namely k-nearest neighbors (KNN), artificial neural network, random forest (RF) and extreme gradient boosting (XGB), are adopted for modeling. Additionally, 16 predictors divided into categorical and numerical variables are used as inputs for modeling.

Findings

The results showed that RF and XGB were the best performing algorithms, with AUC scores of 99.1 and 99.2%, respectively. Conversely, KNN had the lowest predictive power, scoring 94.4%. Overall, the algorithms predicted that over 60% of the watershed was in the very low flood risk class, while the high flood risk class accounted for less than 15% of the area.

Originality/value

Few, if any, studies in the region have used AI tools, including ML, for the predictive modeling of flooding, which makes this study intriguing.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 1 August 2024

Tianyu Pan and Rachel J.C. Fu

Abstract

Purpose

This study aims to evaluate Artificial Intelligence (AI) research in the hospitality industry based on the service AI framework (mechanical-thinking-feeling) and highlight prospective avenues for future inquiry in this growing domain.

Design/methodology/approach

This paper conceptualizes timely concepts supported by research spanning multiple domains.

Findings

This research introduces a novel classification for the domain of AI hospitality research. This classification encompasses prediction and pattern recognition, computer vision, NLP, behavioral research, and synthetic data generation. Based on this classification, this study identifies and elaborates upon five emerging research topics, each linked to a corresponding set of research questions. These focal points encompass the realms of interpretable AI, controllable AI, AI ethics, collaborative AI, and synthetic data generation.

Originality/value

This viewpoint provides a foundational framework and a directional compass for future research in AI within the hospitality industry. It pushes the industry forward with a balanced approach to leveraging AI to augment human potential and enrich customer experiences. Both the classification and the research agenda would contribute to the body of knowledge that will guide the industry toward a future where technology and human service coalesce to create unparalleled value for all stakeholders.

Details

International Hospitality Review, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2516-8142
