Search results

1 – 10 of 53
Article
Publication date: 22 March 2024

Mohd Mustaqeem, Suhel Mustajab and Mahfooz Alam

Abstract

Purpose

Software defect prediction (SDP) is a critical aspect of software quality assurance, aiming to identify and manage potential defects in software systems. This paper proposes a novel hybrid approach that combines Gray Wolf Optimization with Feature Selection (GWOFS) and a multilayer perceptron (MLP) for SDP. The GWOFS-MLP hybrid model is designed to optimize feature selection, ultimately enhancing the accuracy and efficiency of SDP. Gray Wolf Optimization, inspired by the social hierarchy and hunting behavior of gray wolves, is employed to select a subset of relevant features from an extensive pool of potential predictors. The study investigates the key challenges that traditional SDP approaches encounter and proposes solutions to overcome time complexity and the curse of dimensionality.

Design/methodology/approach

The integration of GWOFS and MLP results in a robust hybrid model that can adapt to diverse software datasets. This feature selection process harnesses the cooperative hunting behavior of wolves, allowing for the exploration of critical feature combinations. The selected features are then fed into an MLP, a powerful artificial neural network (ANN) known for its capability to learn intricate patterns within software metrics. MLP serves as the predictive engine, utilizing the curated feature set to model and classify software defects accurately.
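As an illustration of the wolf-inspired search described above, the following is a minimal sketch of Gray Wolf Optimization used for feature selection. It is not the authors' GWOFS implementation: the toy `fitness` function, the `relevant` feature list, the pack size and the zero threshold used to turn positions into a 0/1 mask are all assumptions made for the example. In a real pipeline the fitness would typically be the validation score of a classifier such as an MLP trained on the selected columns.

```python
import random

def fitness(mask, relevant, penalty=0.01):
    """Toy objective (hypothetical): reward covering `relevant`, penalise subset size."""
    hits = sum(1 for i in relevant if mask[i])
    return hits / len(relevant) - penalty * sum(mask)

def to_mask(position):
    """Threshold a continuous position at zero (same as sigmoid(x) > 0.5)."""
    return [1 if x > 0.0 else 0 for x in position]

def gwo_select(n_features, relevant, n_wolves=8, n_iter=30, seed=0):
    rng = random.Random(seed)
    wolves = [[rng.uniform(-1, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_fit = None, float("-inf")
    for t in range(n_iter):
        # Rank the pack; copies of the three fittest wolves lead this iteration.
        ranked = sorted(wolves, key=lambda w: fitness(to_mask(w), relevant),
                        reverse=True)
        alpha, beta, delta = (list(w) for w in ranked[:3])
        alpha_fit = fitness(to_mask(alpha), relevant)
        if alpha_fit > best_fit:
            best_mask, best_fit = to_mask(alpha), alpha_fit
        a = 2.0 * (1.0 - t / n_iter)  # shrinks from 2 towards 0
        for w in wolves:
            for d in range(n_features):
                pulls = []
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    dist = abs(C * leader[d] - w[d])
                    pulls.append(leader[d] - A * dist)
                w[d] = sum(pulls) / 3.0  # average of the three leader pulls
    return best_mask, best_fit

mask, score = gwo_select(n_features=10, relevant=[0, 3, 7])
```

The returned `mask` marks the features that would be passed on to the downstream classifier; the best mask seen at the start of any iteration is kept, so later exploration cannot lose it.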

Findings

The performance evaluation of the GWOFS-MLP hybrid model on a real-world software defect dataset demonstrates its effectiveness. The model achieves a remarkable training accuracy of 97.69% and a testing accuracy of 97.99%. Additionally, the receiver operating characteristic area under the curve (ROC-AUC) score of 0.89 highlights the model’s ability to discriminate between defective and defect-free software components.

Originality/value

Experimental implementations using machine learning-based techniques with feature reduction are conducted to validate the proposed solutions. The goal is to enhance SDP’s accuracy, relevance and efficiency, ultimately improving software quality assurance processes. The confusion matrix further illustrates the model’s performance, with only a small number of false positives and false negatives.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 4 May 2023

Zeping Wang, Hengte Du, Liangyan Tao and Saad Ahmed Javed

Abstract

Purpose

The traditional failure mode and effect analysis (FMEA) has some limitations, such as the neglect of relevant historical data, the subjective use of rating numbers and the limited rationality and accuracy of the Risk Priority Number. The current study proposes a machine learning–enhanced FMEA (ML-FMEA) method based on a popular machine learning tool, the Waikato Environment for Knowledge Analysis (WEKA).

Design/methodology/approach

This work uses the collected FMEA historical data to predict the probability of component/product failure risk by machine learning based on different commonly used classifiers. To compare the correct classification rate of ML-FMEA based on different classifiers, 10-fold cross-validation is employed. Moreover, the prediction error is estimated by repeated experiments with different random seeds under varying initialization settings. Finally, the case of the submersible pump in Bhattacharjee et al. (2020) is used to test the performance of the proposed method.
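The evaluation protocol described here, 10-fold cross-validation repeated over several random seeds, can be sketched in a few lines. This is plain Python rather than WEKA, and the `majority_class` placeholder classifier and the made-up risk labels are assumptions chosen only to keep the example self-contained.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once, then deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def majority_class(labels):
    """Placeholder classifier: always predicts the most frequent training label."""
    return max(set(labels), key=labels.count)

def cv_accuracy(labels, k=10, seed=0):
    """Correct classification rate pooled over all k held-out folds."""
    correct = 0
    for fold in k_fold_indices(len(labels), k, seed):
        held_out = set(fold)
        train = [labels[i] for i in range(len(labels)) if i not in held_out]
        prediction = majority_class(train)
        correct += sum(1 for i in fold if labels[i] == prediction)
    return correct / len(labels)

labels = ["high"] * 30 + ["low"] * 70          # made-up failure-risk labels
scores = [cv_accuracy(labels, k=10, seed=s) for s in range(5)]
spread = max(scores) - min(scores)             # seed-to-seed variation
```

Swapping `majority_class` for a real learner is what makes the seed-to-seed `spread` informative; with the placeholder it simply reflects the class balance.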

Findings

The results show that ML-FMEA, based on most of the commonly used classifiers, outperforms the Bhattacharjee model. For example, the ML-FMEA based on Random Committee improves the correct classification rate from 77.47 to 90.09 per cent and the area under the receiver operating characteristic (ROC) curve from 80.9 to 91.8 per cent.

Originality/value

The proposed method not only enables the decision-maker to use the historical failure data and predict the probability of the risk of failure but also may pave a new way for the application of machine learning techniques in FMEA.

Details

Data Technologies and Applications, vol. 58 no. 1
Type: Research Article
ISSN: 2514-9288

Open Access
Article
Publication date: 5 October 2023

Babitha Philip and Hamad AlJassmi

Abstract

Purpose

To proactively draw efficient maintenance plans, road agencies should be able to forecast main road distress parameters, such as cracking, rutting, deflection and International Roughness Index (IRI). Nonetheless, the behavior of those parameters throughout pavement life cycles is associated with high uncertainty, resulting from various interrelated factors that fluctuate over time. This study aims to propose the use of dynamic Bayesian belief networks for the development of time-series prediction models to probabilistically forecast road distress parameters.

Design/methodology/approach

While the Bayesian belief network (BBN) has the merit of capturing uncertainty associated with variables in a domain, dynamic BBNs, in particular, are deemed ideal for forecasting road distress over time due to their Markovian and invariant transition probability properties. Four dynamic BBN models are developed to represent rutting, deflection, cracking and IRI, using pavement data collected from 32 major road sections in the United Arab Emirates between 2013 and 2019. Those models are based on several factors affecting pavement deterioration, which are classified into three categories: traffic factors, environmental factors and road-specific factors.
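The Markovian, time-invariant transition property mentioned above can be illustrated with a minimal sketch in which one fixed transition matrix is applied at every time slice. This shows only the Markov-chain core of a dynamic BBN, with no traffic, environmental or road-specific parent nodes; the three condition states and all probabilities are invented for the example.

```python
def step(dist, transition):
    """One time slice: new_dist[j] = sum_i dist[i] * transition[i][j]."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]

def forecast(dist, transition, years):
    """Apply the same (time-invariant) transition matrix once per year."""
    history = [dist]
    for _ in range(years):
        dist = step(dist, transition)
        history.append(dist)
    return history

STATES = ["good", "fair", "poor"]   # hypothetical condition levels
T = [                               # hypothetical yearly transition probabilities
    [0.85, 0.13, 0.02],
    [0.00, 0.80, 0.20],
    [0.00, 0.00, 1.00],
]
history = forecast([1.0, 0.0, 0.0], T, years=3)
```

Each entry of `history` is a probability distribution over `STATES`, so a forecast is a set of probabilities per year rather than a single point estimate.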

Findings

The four developed performance prediction models achieved an overall precision and reliability rate of over 80%.

Originality/value

The proposed approach provides flexibility to illustrate road conditions under various scenarios, which is beneficial for pavement maintainers in obtaining a realistic representation of expected future road conditions, where maintenance efforts could be prioritized and optimized.

Details

Construction Innovation, vol. 24 no. 1
Type: Research Article
ISSN: 1471-4175

Article
Publication date: 19 July 2023

Gaurav Kumar, Molla Ramizur Rahman, Abhinav Rajverma and Arun Kumar Misra

Abstract

Purpose

This study aims to analyse the systemic risk emitted by all publicly listed commercial banks in a key emerging economy, India.

Design/methodology/approach

The study makes use of the Adrian and Brunnermeier (2016) estimator to quantify the systemic risk (ΔCoVaR) that banks contribute to the system. The methodology addresses a classification problem based on the probability that a particular bank will emit high systemic risk or moderate systemic risk. The study applies machine learning models such as logistic regression, random forest (RF), neural networks and gradient boosting machine (GBM), and addresses the issue of imbalanced data sets, to investigate which of the banks' balance sheet features and stock features may determine systemic risk emission.

Findings

Across various performance metrics, two specifications are preferred: RF and GBM. The study identifies the lag of the systemic risk estimator, stock beta, stock volatility and return on equity as important features for explaining the emission of systemic risk.

Practical implications

The findings will help banks and regulators identify the key features that can be used to formulate policy decisions.

Originality/value

This study contributes to the existing literature by suggesting classification algorithms that can be used to model the probability of systemic risk emission in a classification problem setting. Further, the study identifies the features responsible for the likelihood of systemic risk.

Details

Journal of Modelling in Management, vol. 19 no. 2
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 5 April 2022

Saeed Pahlevan Sharif, Navaz Naghavi, Hassam Waheed and Kizito Uyi Ehigiamusoe

Abstract

Purpose

This study aims to investigate whether gender predicts financial inclusion and whether education can fill the gender gap in financial inclusion when controlling for the effects of supply side factors of financial inclusion in low-income economies.

Design/methodology/approach

This study aims to investigate whether gender predicts financial inclusion and whether education can fill the gender gap in financial inclusion when controlling for the effects of supply side factors of financial inclusion in low-income economies.

Findings

The findings provided support for the gender gap in financial inclusion using the most basic measure of financial inclusion. However, using formal savings and access to credit, the gender gap hypothesis is not supported. Moreover, the results revealed that education reduces the gender gap in the basic form of financial inclusion. However, this study could not find any significant difference between men and women's financial inclusion in terms of saving at a bank or borrowing from a bank though men tend to save more than women informally.

Originality/value

The current study contributes to the literature by examining the role of education in the relationship between gender gap and financial inclusion when controlling for the effects of heterogeneous infrastructure and the supply side factors of financial inclusion among the selected countries.

Details

International Journal of Emerging Markets, vol. 18 no. 12
Type: Research Article
ISSN: 1746-8809

Article
Publication date: 19 April 2024

Jitendra Gaur, Kumkum Bharti and Rahul Bajaj

Abstract

Purpose

Allocation of the marketing budget has become increasingly challenging due to the diverse channel exposure to customers. This study aims to enhance global marketing knowledge by introducing an ensemble attribution model to optimize marketing budget allocation for online marketing channels. As empirical research, this study demonstrates the supremacy of the ensemble model over standalone models.

Design/methodology/approach

The transactional data set for car insurance from an Indian insurance aggregator is used in this empirical study. The data set contains information from more than three million platform visitors. A robust ensemble model is created by combining results from two probabilistic models, namely, the Markov chain model and the Shapley value. These results are compared and validated with heuristic models. Also, the performances of online marketing channels and attribution models are evaluated based on the devices used (i.e. desktop vs mobile).
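One of the two probabilistic components named above, Shapley-value attribution, can be sketched as follows. The channel names and the conversion counts per channel combination are hypothetical, and the sketch covers neither the Markov chain model nor the way the study combines the two into an ensemble.

```python
from itertools import permutations

def shapley(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        seen = frozenset()
        for p in order:
            totals[p] += value(seen | {p}) - value(seen)
            seen = seen | {p}
    return {p: totals[p] / len(orders) for p in players}

# Hypothetical conversions observed for each combination of channels.
CONVERSIONS = {
    frozenset(): 0,
    frozenset({"email"}): 40,
    frozenset({"search"}): 30,
    frozenset({"email", "search"}): 100,
}

credit = shapley(["email", "search"], lambda s: CONVERSIONS[frozenset(s)])
```

With these made-up numbers `email` is credited with 55 conversions and `search` with 45, which together account for all 100 conversions of the combined channels. Exact enumeration grows factorially with the number of channels, so larger channel sets usually need sampling or another approximation.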

Findings

Channel importance charts for desktop and mobile devices are analyzed to understand the top contributing online marketing channels. Customer relationship management emailers and Google cost-per-click paid advertising are identified as the top two marketing channels for both desktop and mobile devices. The research reveals that the accuracy of the ensemble model is better than that of the standalone models, namely the Markov chain model and the Shapley value.

Originality/value

To the best of the authors’ knowledge, the current research is the first of its kind to introduce ensemble modeling for solving attribution problems in online marketing. A comparison with heuristic models using different devices (desktop and mobile) offers insights into the results with heuristic models.

Details

Global Knowledge, Memory and Communication, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9342

Article
Publication date: 24 January 2023

Hossein Motahari-Nezhad

Abstract

Purpose

No study has investigated the effects of different parameters on publication bias in meta-analyses using a machine learning approach. Therefore, this study aims to evaluate the impact of various factors on publication bias in meta-analyses.

Design/methodology/approach

An electronic questionnaire was created according to some factors extracted from the Cochrane Handbook and AMSTAR-2 tool to identify factors affecting publication bias. Twelve experts were consulted to determine their opinion on the importance of each factor. Each component was evaluated based on its content validity ratio (CVR). In total, 616 meta-analyses comprising 1893 outcomes from PubMed that assessed the presence of publication bias in their reported outcomes were randomly selected to extract their data. The multilayer perceptron (MLP) technique was used in IBM SPSS Modeler 18.0 to construct a prediction model. The data were split 70%, 15% and 15% into training, testing and validation partitions, respectively.
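For readers unfamiliar with the content validity ratio, a minimal sketch follows. It assumes Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating an item as essential and N is the panel size; the abstract does not state which formula or acceptance threshold was applied, and the vote counts below are invented.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: ranges from -1 (no expert agrees) to +1 (all agree)."""
    half = n_experts / 2
    return (n_essential - half) / half

PANEL_SIZE = 12  # twelve experts, as in the abstract
# Hypothetical numbers of "essential" votes for three questionnaire items.
votes = {"databases searched": 12, "grey literature": 9, "language limits": 6}
cvr = {item: content_validity_ratio(n, PANEL_SIZE) for item, n in votes.items()}
```

An item endorsed by exactly half of the panel scores 0, so only items with clear majority support end up with a positive ratio.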

Findings

There was a publication bias in 968 (51.14%) outcomes. The established model had an accuracy rate of 86.1%, and all pre-selected nine variables were included in the model. The results showed that the number of databases searched was the most important predictive variable (0.26), followed by the number of searches in the grey literature (0.24), search in Medline (0.17) and advanced search with numerous operators (0.13).

Practical implications

The results of this study can help clinical researchers minimize publication bias in their studies, leading to improved evidence-based medicine.

Originality/value

To the best of the author’s knowledge, this is the first study to model publication bias using machine learning.

Details

Aslib Journal of Information Management, vol. 76 no. 2
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 16 December 2022

Fatemeh Mozaffari, Marzieh Rahimi, Hamidreza Yazdani and Babak Sohrabi

Abstract

Purpose

This research intends to develop a model for predicting employees at high risk of attrition and identify the most important factors affecting them.

Design/methodology/approach

In this study, using the triangulation technique of a mixed research method, the employee attrition problem is investigated by identifying the factors affecting it. To this end, data from the human resources department of a pharmaceutical company in Iran are used, and advanced data mining algorithms and interviews with human resource managers are applied.

Findings

A model for predicting employees at high risk of attrition is presented based on the gradient boosting machine algorithm with 89% accuracy. The use of the mixed research approach shows that combining qualitative and quantitative methods can be more effective in identifying the factors affecting employee churn or loss of staff. The results also capture a new situation arising out of the COVID-19 pandemic, in which remote working scenarios have an impact on employee attrition. Finally, human resource policies are presented based on variables related to each of the identified factors.

Originality/value

The novel contributions of this study include real data related to a leading pharmaceutical company as well as a combination of quantitative and qualitative methods. The hybrid approach can identify the reasons for attrition and, consequently, retention policies, benefiting from the advantages of both approaches. Data mining can be useful to identify the factors that are usually not mentioned in termination interviews, such as direct managers. On the other hand, the results obtained from termination interviews can also include features that the authors cannot identify through data mining, which are specifically related to the characteristics of the pharmaceutical industry, such as building a more professional career path. From a practical perspective, since this company specializes in pharmaceutical marketing in a new way and is primarily composed of graduates, it is important to note that the churn of specialized people disperses organizational and technological know-how. In addition, the pharmacist community in Iran is small, and their attrition might adversely affect not only the reputation of an organization but the employer's brand as well. So, this research would help other similar firms in retaining their valuable human capital.

Details

Benchmarking: An International Journal, vol. 30 no. 10
Type: Research Article
ISSN: 1463-5771

Article
Publication date: 19 March 2024

Thao-Trang Huynh-Cam, Long-Sheng Chen and Tzu-Chuen Lu

Abstract

Purpose

This study aimed to use enrollment information including demographic, family background and financial status, which can be gathered before the first semester starts, to construct early prediction models (EPMs) and extract crucial factors associated with first-year student dropout probability.

Design/methodology/approach

The real-world samples comprised the enrollment records of 2,412 first-year students of a private university (UNI) in Taiwan. This work used decision tree (DT), multilayer perceptron (MLP) and logistic regression (LR) algorithms to construct EPMs; under-sampling, random oversampling and the synthetic minority oversampling technique (SMOTE) to address data imbalance; and accuracy, precision, recall, F1-score, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) to evaluate the constructed EPMs.
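Of the resampling methods listed above, SMOTE is the least self-explanatory, so a minimal sketch of its core idea follows: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The data points, the neighbour count `k` and the function name are assumptions for illustration; this is not the implementation used in the study.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Made-up two-feature records for the minority (dropout) class.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority, n_new=6)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region already occupied by the minority class, unlike random oversampling, which only duplicates existing rows.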

Findings

DT outperformed MLP and LR with accuracy (97.59%), precision (98%), recall (97%), F1_score (97%), and ROC-AUC (98%). The top-ranking factors comprised “student loan,” “dad occupations,” “mom educational level,” “department,” “mom occupations,” “admission type,” “school fee waiver” and “main sources of living.”

Practical implications

This work only used enrollment information to identify dropout students and crucial factors associated with dropout probability as soon as students enter universities. The extracted rules could be utilized to enhance student retention.

Originality/value

Although first-year student dropouts have gained non-stop attention from researchers in educational practices and theories worldwide, many previous studies used while- and/or post-semester factors and/or questionnaires for prediction. These methods failed to offer universities early warning systems (EWS) and/or assist them in providing in-time assistance to dropouts who face economic difficulties. This work provided universities with an EWS and extracted rules for early dropout prevention and intervention.

Details

Journal of Applied Research in Higher Education, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-7003

Open Access
Article
Publication date: 15 December 2020

Felix Bongomin, Andrew P. Kyazze, Sandra Ninsiima, Ronald Olum, Gloria Nattabi, Winnie Nabakka, Rebecca Kukunda, Charles Batte, Phillip Ssekamatte, Joseph Baruch Baluku, Davis Kibirige, Stephen Cose and Irene Andia-Biraro

Abstract

Background: Hyperglycemia in pregnancy (HIP) is a common medical complication during pregnancy and is associated with several short and long-term maternal-fetal consequences. We aimed to determine the prevalence and factors associated with HIP among Ugandan women.

Methods: We consecutively enrolled eligible pregnant women attending antenatal care at Kawempe National Referral Hospital, Kampala, Uganda in September 2020. Mothers known to be living with diabetes mellitus or haemoglobinopathies and those with anemia (hemoglobin <11g/dl) were excluded. Random blood sugar (RBS) and glycated hemoglobin A1c (HbA1c) were measured on peripheral venous blood samples. HIP was defined as an HbA1c ≥5.7%, with its subsets of diabetes in pregnancy (DIP) and prediabetes defined as HbA1c of ≥6.5% and 5.7–6.4%, respectively. ROC curve analysis was performed to determine the optimum cutoff of RBS to screen for HIP.
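To make the ROC-based cutoff selection concrete, here is a minimal sketch that scans candidate cutoffs and keeps the one maximising Youden's J (sensitivity + specificity - 1). The choice of Youden's J is an assumption, since the abstract does not say which optimality criterion was used, and the RBS values and HIP labels below are made up.

```python
def sens_spec(values, labels, cutoff):
    """Screen-positive means value >= cutoff; labels are True for reference-positive."""
    tp = sum(1 for v, y in zip(values, labels) if y and v >= cutoff)
    fn = sum(1 for v, y in zip(values, labels) if y and v < cutoff)
    tn = sum(1 for v, y in zip(values, labels) if not y and v < cutoff)
    fp = sum(1 for v, y in zip(values, labels) if not y and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(values, labels):
    """Return the observed value that maximises Youden's J."""
    def youden_j(cutoff):
        sensitivity, specificity = sens_spec(values, labels, cutoff)
        return sensitivity + specificity - 1
    return max(sorted(set(values)), key=youden_j)

# Made-up RBS readings (mmol/L) and HbA1c-based HIP status.
rbs = [3.9, 4.1, 4.4, 4.6, 4.8, 5.0, 5.3, 5.9, 6.4]
hip = [False, False, False, False, True, False, True, True, True]
cutoff = best_cutoff(rbs, hip)
sensitivity, specificity = sens_spec(rbs, hip, cutoff)
```

A screening test may deliberately favour sensitivity over specificity, in which case a weighted criterion would replace the plain Youden index used here.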

Results: A total of 224 mothers with a mean (±SD) age of 26±5 years were enrolled, most of whom were in the 2nd or 3rd trimester (94.6%, n=212), with a mean gestational age of 26.6±7.3 weeks. Prevalence of HIP was 11.2% (n=25) (95% CI: 7.7–16.0). Overall, 2.2% (n=5) of mothers had DIP and 8.9% (n=20) had prediabetes. Patients with HIP were older (28 years vs. 26 years, p=0.027), had previous tuberculosis (TB) contact (24% vs. 6.5%, p=0.003) and had a bigger hip circumference (107.8 (±10.4) vs. 103.3 (±9.7) cm, p=0.032). However, only previous TB contact was predictive of HIP (odds ratio: 4.4, 95% CI: 1.2–14.0; p=0.022). Using HbA1c as a reference variable, we derived an optimum RBS cutoff of 4.75 mmol/L as predictive of HIP, with a sensitivity of 90.7% and a specificity of 56.4% (area under the curve=0.75 (95% CI: 0.70–0.80, p<0.001)).
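The reported prevalence and its confidence interval can be checked from the counts given above (25 of 224). The sketch below uses the Wilson score interval with z = 1.96; this is an assumption, since the abstract does not name the interval method, although the result agrees with the reported figures to one decimal place.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

prevalence = 25 / 224                 # counts taken from the abstract
low, high = wilson_interval(25, 224)
```

With these counts the prevalence is about 11.2% and the interval runs from roughly 7.7% to 16.0%.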

Conclusions: HIP is common among young Ugandan women, the majority of whom are without identifiable risk factors.

Details

Emerald Open Research, vol. 1 no. 2
Type: Research Article
ISSN: 2631-3952
