Search results
Abdel Latef M. Anouze and Imad Bou-Hamad
Abstract
Purpose
This paper aims to assess the application of seven statistical and data mining techniques to second-stage data envelopment analysis (DEA) for bank performance.
Design/methodology/approach
Different statistical and data mining techniques are applied to second-stage DEA for bank performance as part of an attempt to produce a powerful model for bank performance with effective predictive ability. The data mining tools considered are classification and regression trees (CART), conditional inference trees (CIT), random forest based on CART and CIT, bagging, artificial neural networks and their statistical counterpart, logistic regression.
Findings
The results showed that random forests and bagging outperform other methods in terms of predictive power.
Originality/value
This is the first study to assess the impact of environmental factors on banking performance in Middle East and North Africa countries.
Afreen Khan, Swaleha Zubair and Samreen Khan
Abstract
Purpose
This study aimed to assess the potential of the Clinical Dementia Rating (CDR) Scale in the prognosis of dementia in elderly subjects.
Design/methodology/approach
Staging dementia severity is a clinically essential task, so the authors used machine learning (ML) on magnetic resonance imaging (MRI) features to identify and study the impact of various MR readings on the classification of demented and nondemented patients. The authors used cross-sectional MRI data in this study. The designed ML approach established the role of CDR in the prognosis of afflicted and normal patients. Moreover, the pattern analysis indicated CDR as a strong cohort amongst the various attributes, with CDR having a significant p-value (p < 0.01). The authors employed 20 ML classifiers.
Findings
The mean prediction accuracy varied with the ML classifier used, with the bagging classifier (random forest as the base estimator) achieving the highest accuracy (93.67%). A series of ML analyses demonstrated that the model including the CDR score had better prediction accuracy and other related performance metrics.
Originality/value
The results suggest that the CDR score, a simple clinical measure, can be used in real community settings. It can be used to predict dementia progression with ML modeling.
Bo Liu, Libin Shen, Huanling You, Yan Dong, Jianqiang Li and Yong Li
Abstract
Purpose
The influence of road surface temperature (RST) on vehicles is becoming more and more obvious, so accurate prediction of RST is valuable. At present, however, neither physical methods nor statistical learning methods predict RST with satisfactory accuracy. To find an effective prediction method, this paper selects five representative algorithms and uses each to predict the road surface temperature separately.
Design/methodology/approach
Multiple linear regression, least absolute shrinkage and selection operator (LASSO), random forest, gradient boosting regression tree (GBRT) and neural network are chosen as the representative predictors.
Findings
The experimental results show that, for the temperature data set used in this experiment, GBRT, an ensemble algorithm, gives the best prediction performance of the five algorithms.
Originality/value
This paper compares different kinds of machine learning algorithms, observes the road surface temperature data from different angles, and finds the most suitable prediction method.
Oscar F. Bustinza, Luis M. Molina Fernandez and Marlene Mendoza Macías
Abstract
Purpose
Machine learning (ML) analytical tools are increasingly being considered as an alternative quantitative methodology in management research. This paper proposes a new approach for uncovering the antecedents behind product and product–service innovation (PSI).
Design/methodology/approach
The ML approach is novel in the field of innovation antecedents at the country level. A sample from the Ecuadorian National Survey on Technology and Innovation, consisting of more than 6,000 firms, is used to rank the antecedents of innovation.
Findings
The analysis reveals that the antecedents of product and PSI are distinct, yet rooted in the principles of open innovation and competitive priorities.
Research limitations/implications
The analysis is based on a sample of Ecuadorian firms, with the objective of showing how ML techniques are suitable for testing the antecedents of innovation in any other context.
Originality/value
The novel ML approach, in contrast to traditional quantitative analysis of the topic, can consider the full set of antecedent interactions to each of the innovations analyzed.
Abstract
Purpose
Metropolitan areas suffer from frequent road traffic congestion not only during peak hours but also during off-peak periods. Different machine learning methods have been used in travel time prediction; in practice, however, such methods face the problem of overfitting. Tree-based ensembles have been applied in various prediction fields, and such approaches usually produce high prediction accuracy by aggregating and averaging individual decision trees. These approaches not only give better prediction results but also offer a good bias-variance trade-off, which can help to avoid overfitting. However, the application of tree-based ensemble algorithms in traffic prediction is still limited. This study aims to improve the accuracy and interpretability of the models by using random forest (RF) to analyze and model the travel time on freeways.
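The idea of aggregating and averaging individual trees can be illustrated with a minimal sketch. It is not the model from the study: it uses plain bagging of one-split regression trees (stumps) on a single feature, whereas a random forest also samples features at each split. The function names, the toy data and the default of 50 trees are all assumptions made for illustration.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D inputs by minimising squared error."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so both sides of the split are always non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        l_mean, r_mean = mean(left), mean(right)
        sse = (sum((y - l_mean) ** 2 for y in left)
               + sum((y - r_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:
        # All x values identical: fall back to a constant prediction.
        m = mean(ys)
        return (float("inf"), m, m)
    return best[1:]

def predict_stump(stump, x):
    threshold, l_mean, r_mean = stump
    return l_mean if x <= threshold else r_mean

def fit_bagged(xs, ys, n_trees=50, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict_bagged(stumps, x):
    """Average the individual predictions; averaging is what reduces variance."""
    return mean(predict_stump(s, x) for s in stumps)

# Toy data (hypothetical): the response jumps from 0 to 10 at x = 5.
xs = list(range(10))
ys = [0.0 if x < 5 else 10.0 for x in xs]
stumps = fit_bagged(xs, ys)
low, high = predict_bagged(stumps, 0), predict_bagged(stumps, 9)
```

Each stump sees a slightly different resample of the data, so individual stumps may disagree near the jump, while their average is more stable than any single one.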
Design/methodology/approach
Because traffic conditions often change greatly, prediction results are often unsatisfactory. To improve the accuracy of short-term travel time prediction in the freeway network, a practically feasible and computationally efficient RF prediction method for real-world freeways was developed using probe traffic data. In addition, the variables’ relative importance was ranked, which provides an investigation platform to gain a better understanding of how different contributing factors might affect travel time on freeways.
Findings
The parameters of the RF model were estimated by using the training sample set. After the parameter tuning process was completed, the proposed RF model was developed. The features’ relative importance showed that the variables travel time 15 min before and time of day (TOD) contribute the most to the predicted travel time result. The model performance was also evaluated and compared against the extreme gradient boosting method, and the results indicated that the RF always produces more accurate travel time predictions.
Originality/value
This research developed an RF method to predict the freeway travel time by using the probe vehicle-based traffic data and weather data. Detailed information about the input variables and data pre-processing was presented. To measure the effectiveness of the proposed travel time prediction algorithms, the mean absolute percentage errors were computed for different observation segments combined with different prediction horizons ranging from 15 to 60 min.
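For readers unfamiliar with the metric, the mean absolute percentage error (MAPE) mentioned above can be computed as in the short sketch below. The function name and the sample travel times are hypothetical and are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero (true for travel times).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Hypothetical travel times in seconds for one segment and one horizon.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
error_pct = mape(observed, forecast)  # each forecast is off by 10%
```

In a setup like the one described, the function would be called once per observation segment and prediction horizon.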
Daniel Abreu Vasconcellos de Paula, Rinaldo Artes, Fabio Ayres and Andrea Maria Accioly Fonseca Minardi
Abstract
Purpose
Although credit unions are nonprofit organizations, their objectives depend on the efficient management of their resources and credit risk aligned with the principles of the cooperative doctrine. This paper aims to propose the combined use of credit scoring and profit scoring to increase the effectiveness of the loan-granting process in credit unions.
Design/methodology/approach
The sample is composed of personal loan transaction data from a Brazilian credit union.
Findings
The analysis reveals that the use of statistical methods significantly improves the predictability of default compared with subjective techniques, and that the random forests model is superior to logit and ordinary least squares (OLS) regression in estimating credit scoring and profit scoring. The study also illustrates how both analyses can be used jointly for more effective decision-making.
Originality/value
Replacing subjective analysis with objective credit analysis using deterministic models will benefit Brazilian credit unions. The credit decision will be based on the input variables and on clear criteria, making the decision-making process impartial. The joint use of credit scoring and profit scoring allows credit to be granted to the clients with the highest potential to pay their debt obligations while, at the same time, ensuring that the transaction profitability meets the goals of the organization: to be sustainable and to provide loans and investment opportunities at attractive rates to members.
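One way to picture the joint use of the two scores is a simple rule that requires both to pass. The sketch below is only an illustration under assumed names and thresholds; the abstract does not state how the authors combine the two analyses.

```python
from dataclasses import dataclass

@dataclass
class LoanApplication:
    default_probability: float  # output of a credit-scoring model, in [0, 1]
    expected_profit: float      # output of a profit-scoring model, in currency units

def approve(app, max_default_probability=0.10, min_expected_profit=0.0):
    """Approve only if the loan is both safe enough and profitable enough.

    Both thresholds are hypothetical defaults, not values from the study.
    """
    return (app.default_probability <= max_default_probability
            and app.expected_profit >= min_expected_profit)

safe_and_profitable = approve(LoanApplication(0.05, 120.0))   # passes both checks
safe_but_unprofitable = approve(LoanApplication(0.05, -30.0))  # fails the profit check
risky_but_profitable = approve(LoanApplication(0.30, 500.0))   # fails the credit check
```

Because the rule depends only on model outputs and fixed thresholds, the same inputs always yield the same decision, which is the impartiality the abstract points to.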
Showmitra Kumar Sarkar, Swapan Talukdar, Atiqur Rahman, Shahfahad and Sujit Kumar Roy
Abstract
Purpose
The present study aims to construct ensemble machine learning (EML) algorithms for groundwater potentiality mapping (GPM) in the Teesta River basin of Bangladesh, including random forest (RF) and random subspace (RSS).
Design/methodology/approach
The RF and RSS models have been implemented to integrate 14 selected groundwater condition parameters with groundwater inventories to generate GPMs. The GPMs were then validated using the empirical and binormal receiver operating characteristic (ROC) curves.
Findings
The very high (831–1200 km²) and high (521–680 km²) groundwater potential areas were predicted using EML algorithms. The RSS model (AUC = 0.892) outperformed the RF model based on the area under the ROC curve (AUC).
Originality/value
Two new EML models have been constructed for GPM. These findings will aid in proposing sustainable water resource management plans.
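The area under the empirical ROC curve used for validation can be computed directly from labels and model scores. The sketch below uses the pairwise (Mann-Whitney) formulation, which equals the area under the empirical ROC curve; it does not cover the binormal variant, and the function name and toy data are assumptions.

```python
def empirical_auc(labels, scores):
    """Area under the empirical ROC curve.

    Equals the fraction of (positive, negative) pairs in which the positive
    sample gets the higher score, counting ties as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Toy example (hypothetical): 1 = groundwater present at the inventory point.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
auc = empirical_auc(labels, scores)  # 3 of the 4 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which two models such as RSS and RF are compared.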
Abstract
Purpose
The purpose of this paper is to contribute original evidence about the conditions for formal and informal contracts for commodities and labour in the waste economy of a South Indian town.
Design/methodology/approach
Field research was exploratory, based on snowball sampling and urban traversing. The analysis follows capital and labour in the sub-circuits of capital generating waste in production, distribution, consumption, the production of labour and the reproduction of society.
Findings
Regardless of legal regulation, which is selectively enforced, formal contracts are limited to active inspection regimes; direct transactions with or within the state; and long-distance transactions. Formal labour contracts are least incomplete for state employment, and for relatively scarce skilled labour in the private sector.
Research limitations/implications
The research design does not permit quantified generalisations.
Practical implications
Waste management technology evaluations neglect the social costs of displacing a large informal labour force.
Social implications
While slowly dissolving occupational barriers of untouchability, the waste economy is a low-status labour absorber of last resort, exit from which is extremely difficult.
Originality/value
The first systematic exploration of formal and informal contracts in an Indian small-town waste economy.
Kiran Fahd, Shah Jahan Miah and Khandakar Ahmed
Abstract
Purpose
Student attrition in tertiary educational institutes may play a significant role in achieving core values that lead towards strategic mission and financial well-being. Analysis of data generated from student interaction with learning management systems (LMSs) in blended learning (BL) environments may assist with the identification of students at risk of failing, but to what extent this may be possible is unknown. Existing studies, however, are limited in addressing these issues at a significant scale.
Design/methodology/approach
This study develops a new approach that applies machine learning (ML) models to a publicly available dataset relevant to student attrition in order to identify potential students at risk. The dataset consists of the data generated by the interaction of students with the LMS for their BL environment.
Findings
Identifying students at risk through an innovative approach will promote timely intervention in the learning process, such as for improving student academic progress. To evaluate the performance of the proposed approach, the accuracy is compared with other representational ML methods.
Originality/value
The best-performing ML algorithm, random forest with 85% accuracy, is selected to support educators in implementing various pedagogical practices to improve students’ learning.
Mariam AlKandari and Imtiaz Ahmad
Abstract
Solar power forecasting will have a significant impact on the future of large-scale renewable energy plants. Predicting photovoltaic power generation depends heavily on climate conditions, which fluctuate over time. In this research, we propose a hybrid model that combines machine-learning methods with the Theta statistical method for more accurate prediction of future solar power generation from renewable energy plants. The machine learning models include long short-term memory (LSTM), gated recurrent unit (GRU), AutoEncoder LSTM (Auto-LSTM) and a newly proposed Auto-GRU. To enhance the accuracy of the proposed Machine learning and Statistical Hybrid Model (MLSHM), we employ two diversity techniques, i.e. structural diversity and data diversity. To combine the predictions of the ensemble members in the proposed MLSHM, we exploit four combining methods: simple averaging, weighted averaging using a linear approach, weighted averaging using a non-linear approach, and combination through variance using an inverse approach. The proposed MLSHM scheme was validated on two real time-series datasets, namely Shagaya in Kuwait and Cocoa in the USA. The experiments show that the proposed MLSHM, using all the combination methods, achieved higher accuracy than the traditional individual models. The results also demonstrate that a hybrid model combining machine-learning methods with a statistical method outperformed a hybrid model that combines only machine-learning models.
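Two of the combining methods listed above can be sketched in a few lines. The simple average is unambiguous. For the variance-based combination, the sketch assumes one common reading: each ensemble member is weighted in inverse proportion to the variance of its validation errors. That reading, along with all function names and numbers, is an assumption and may differ from the authors' formulation.

```python
from statistics import mean, pvariance

def simple_average(forecasts):
    """forecasts: one list of predictions per model, all of equal length."""
    return [mean(step) for step in zip(*forecasts)]

def inverse_variance_weights(validation_errors):
    """Weight each model by 1 / variance of its validation errors, normalised to sum to 1.

    Assumes every model has non-zero error variance (otherwise division by zero).
    """
    inverse = [1.0 / pvariance(errors) for errors in validation_errors]
    total = sum(inverse)
    return [value / total for value in inverse]

def weighted_average(forecasts, weights):
    """Combine per-model forecasts step by step using fixed model weights."""
    return [sum(w * f for w, f in zip(weights, step)) for step in zip(*forecasts)]

# Hypothetical two-model ensemble forecasting two time steps.
forecasts = [[10.0, 20.0], [20.0, 30.0]]
validation_errors = [[1.0, -1.0, 1.0, -1.0], [2.0, -2.0, 2.0, -2.0]]

averaged = simple_average(forecasts)
weights = inverse_variance_weights(validation_errors)  # error variances are 1 and 4
combined = weighted_average(forecasts, weights)
```

The model with the smaller error variance dominates the weighted combination, whereas the simple average treats all members equally.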