Search results
1 – 10 of 797This study updates the literature review of Jones (1987) published in this journal. The study pays particular attention to two important themes that have shaped the field over the…
Abstract
Purpose
This study updates the literature review of Jones (1987) published in this journal. The study pays particular attention to two important themes that have shaped the field over the past 35 years: (1) the development of a range of innovative new statistical learning methods, particularly advanced machine learning methods such as stochastic gradient boosting, adaptive boosting, random forests and deep learning, and (2) the emergence of a wide variety of bankruptcy predictor variables extending beyond traditional financial ratios, including market-based variables, earnings management proxies, auditor going concern opinions (GCOs) and corporate governance attributes. Several directions for future research are discussed.
Design/methodology/approach
This study provides a systematic review of the corporate failure literature over the past 35 years with a particular focus on the emergence of new statistical learning methodologies and predictor variables. This synthesis of the literature evaluates the strength and limitations of different modelling approaches under different circumstances and provides an overall evaluation the relative contribution of alternative predictor variables. The study aims to provide a transparent, reproducible and interpretable review of the literature. The literature review also takes a theme-centric rather than author-centric approach and focuses on structured themes that have dominated the literature since 1987.
Findings
There are several major findings of this study. First, advanced machine learning methods appear to have the most promise for future firm failure research. Not only do these methods predict significantly better than conventional models, but they also possess many appealing statistical properties. Second, there are now a much wider range of variables being used to model and predict firm failure. However, the literature needs to be interpreted with some caution given the many mixed findings. Finally, there are still a number of unresolved methodological issues arising from the Jones (1987) study that still requiring research attention.
Originality/value
The study explains the connections and derivations between a wide range of firm failure models, from simpler linear models to advanced machine learning methods such as gradient boosting, random forests, adaptive boosting and deep learning. The paper highlights the most promising models for future research, particularly in terms of their predictive power, underlying statistical properties and issues of practical implementation. The study also draws together an extensive literature on alternative predictor variables and provides insights into the role and behaviour of alternative predictor variables in firm failure research.
Details
Keywords
Wen Li, Wei Wang and Wenjun Huo
Inspired by the basic idea of gradient boosting, this study aims to design a novel multivariate regression ensemble algorithm RegBoost by using multivariate linear regression as a…
Abstract
Purpose
Inspired by the basic idea of gradient boosting, this study aims to design a novel multivariate regression ensemble algorithm RegBoost by using multivariate linear regression as a weak predictor.
Design/methodology/approach
To achieve nonlinearity after combining all linear regression predictors, the training data is divided into two branches according to the prediction results using the current weak predictor. The linear regression modeling is recursively executed in two branches. In the test phase, test data is distributed to a specific branch to continue with the next weak predictor. The final result is the sum of all weak predictors across the entire path.
Findings
Through comparison experiments, it is found that the algorithm RegBoost can achieve similar performance to the gradient boosted decision tree (GBDT). The algorithm is very effective compared to linear regression.
Originality/value
This paper attempts to design a novel regression algorithm RegBoost with reference to GBDT. To the best of the knowledge, for the first time, RegBoost uses linear regression as a weak predictor, and combine with gradient boosting to build an ensemble algorithm.
Details
Keywords
Suraj Kulkarni, Suhas Suresh Ambekar and Manoj Hudnurkar
Increasing health-care costs are a major concern, especially in the USA. The purpose of this paper is to predict the hospital charges of a patient before being admitted. This will…
Abstract
Purpose
Increasing health-care costs are a major concern, especially in the USA. The purpose of this paper is to predict the hospital charges of a patient before being admitted. This will help a patient who is getting admitted: “electively” can plan his/her finance. Also, this can be used as a tool by payers (insurance companies) to better forecast the amount that a patient might claim.
Design/methodology/approach
This research method involves secondary data collected from New York state’s patient discharges of 2017. A stratified sampling technique is used to sample the data from the population, feature engineering is done on categorical variables. Different regression techniques are being used to predict the target value “total charges.”
Findings
Total cost varies linearly with the length of stay. Among all the machine learning algorithms considered, namely, random forest, stochastic gradient descent (SGD) regressor, K nearest neighbors regressor, extreme gradient boosting regressor and gradient boosting regressor, random forest regressor had the best accuracy with R2 value 0.7753. “Age group” was the most important predictor among all the features.
Practical implications
This model can be helpful for patients who want to compare the cost at different hospitals and can plan their finances accordingly in case of “elective” admission. Insurance companies can predict how much a patient with a particular medical condition might claim by getting admitted to the hospital.
Originality/value
Health care can be a costly affair if not planned properly. This research gives patients and insurance companies a better prediction of the total cost that they might incur.
Details
Keywords
Valeriia Baklanova, Aleksei Kurkin and Tamara Teplova
The primary objective of this research is to provide a precise interpretation of the constructed machine learning model and produce definitive summaries that can evaluate the…
Abstract
Purpose
The primary objective of this research is to provide a precise interpretation of the constructed machine learning model and produce definitive summaries that can evaluate the influence of investor sentiment on the overall sales of non-fungible token (NFT) assets. To achieve this objective, the NFT hype index was constructed as well as several approaches of XAI were employed to interpret Black Box models and assess the magnitude and direction of the impact of the features used.
Design/methodology/approach
The research paper involved the construction of a sentiment index termed the NFT hype index, which aims to measure the influence of market actors within the NFT industry. This index was created by analyzing written content posted by 62 high-profile individuals and opinion leaders on the social media platform Twitter. The authors collected posts from the Twitter accounts that were afterward classified by tonality with a help of natural language processing model VADER. Then the machine learning methods and XAI approaches (feature importance, permutation importance and SHAP) were applied to explain the obtained results.
Findings
The built index was subjected to rigorous analysis using the gradient boosting regressor model and explainable AI techniques, which confirmed its significant explanatory power. Remarkably, the NFT hype index exhibited a higher degree of predictive accuracy compared to the well-known sentiment indices.
Practical implications
The NFT hype index, constructed from Twitter textual data, functions as an innovative, sentiment-based indicator for investment decision-making in the NFT market. It offers investors unique insights into the market sentiment that can be used alongside conventional financial analysis techniques to enhance risk management, portfolio optimization and overall investment outcomes within the rapidly evolving NFT ecosystem. Thus, the index plays a crucial role in facilitating well-informed, data-driven investment decisions and ensuring a competitive edge in the digital assets market.
Originality/value
The authors developed a novel index of investor interest for NFT assets (NFT hype index) based on text messages posted by market influencers and compared it to conventional sentiment indices in terms of their explanatory power. With the application of explainable AI, it was shown that sentiment indices may perform as significant predictors for NFT sales and that the NFT hype index works best among all sentiment indices considered.
Details
Keywords
This study aimed to understand the agri-entrepreneurial traits of undergraduate university students using machine learning (ML) algorithms.
Abstract
Purpose
This study aimed to understand the agri-entrepreneurial traits of undergraduate university students using machine learning (ML) algorithms.
Design/methodology/approach
This study used a conceptual framework of individual-level determinants of entrepreneurship and ML. The Google Survey instrument was prepared on a 5-point scale and administered to 656 students in different sections of the same class during regular virtual classrooms in 2021. The datasets were analyzed and compared using ML.
Findings
Entrepreneurial traits existed among students before attending undergraduate entrepreneurship courses. Establishing strong partnerships (0.359), learning (0.347) and people-organizing ability (0.341) were promising correlated entrepreneurial traits. Female students exhibited fewer entrepreneurial traits than male students. The random forest model exhibited 60% accuracy in trait prediction against gradient boosting (58.4%), linear regression (56.8%), ridge (56.7%) and lasso regression (56.0%). Thus, the ML model appeared to be unsuitable to predict entrepreneurial traits. Quality data are important for accurate trait predictions.
Research limitations/implications
Further studies can validate K-nearest neighbors (KNN) and support vector machine (SVM) models against random forest to support the statement that the ML model cannot be used for entrepreneurial trait prediction.
Originality/value
This research is unique because ML models, such as random forest, gradient boosting and lasso regression, are used for entrepreneurial trait prediction by agricultural domain students.
Details
Keywords
Michael Mayer, Steven C. Bourassa, Martin Hoesli and Donato Scognamiglio
The purpose of this paper is to investigate the accuracy and volatility of different methods for estimating and updating hedonic valuation models.
Abstract
Purpose
The purpose of this paper is to investigate the accuracy and volatility of different methods for estimating and updating hedonic valuation models.
Design/methodology/approach
The authors apply six estimation methods (linear least squares, robust regression, mixed-effects regression, random forests, gradient boosting and neural networks) and two updating methods (moving and extending windows). They use a large and rich data set consisting of over 123,000 single-family houses sold in Switzerland between 2005 and 2017.
Findings
The gradient boosting method yields the greatest accuracy, while the robust method provides the least volatile predictions. There is a clear trade-off across methods depending on whether the goal is to improve accuracy or avoid volatility. The choice between moving and extending windows has only a modest effect on the results.
Originality/value
This paper compares a range of linear and machine learning techniques in the context of moving or extending window scenarios that are used in practice but which have not been considered in prior research. The techniques include robust regression, which has not previously been used in this context. The data updating allows for analysis of the volatility in addition to the accuracy of predictions. The results should prove useful in improving hedonic models used by property tax assessors, mortgage underwriters, valuation firms and regulatory authorities.
Details
Keywords
Nader Asadi Ejgerdi and Mehrdad Kazerooni
With the growth of organizations and businesses, customer acquisition and retention processes have become more complex in the long run. That is why customer lifetime value (CLV…
Abstract
Purpose
With the growth of organizations and businesses, customer acquisition and retention processes have become more complex in the long run. That is why customer lifetime value (CLV) has become crucial to sales managers. Predicting the CLV is a strategic weapon and competitive advantage in increasing profitability and identifying customers with more splendid profitability and is one of the essential key performance indicators (KPI) used in customer segmentation. Thus, this paper proposes a stacked ensemble learning method, a combination of multiple machine learning methods, for CLV prediction.
Design/methodology/approach
In order to utilize customers’ behavioral features for predicting the value of each customer’s CLV, the data of a textile sales company was used as a case study. The proposed stacked ensemble learning method is compared with several popular predictive methods named deep neural networks, bagging support vector regression, light gradient boosting machine, random forest and extreme gradient boosting.
Findings
Empirical results indicate that the regression performance of the stacked ensemble learning method outperformed other methods in terms of normalized rooted mean squared error, normalized mean absolute error and coefficient of determination, at 0.248, 0.364 and 0.848, respectively. In addition, the prediction capability of the proposed method improved significantly after optimizing its hyperparameters.
Originality/value
This paper proposes a stacked ensemble learning method as a new method for accurate CLV prediction. The results and comparisons support the robustness and efficiency of the proposed method for CLV prediction.
Details
Keywords
This study aims to examine the association between board gender diversity (BGD) and workplace diversity and the relative importance of various board and firm characteristics in…
Abstract
Purpose
This study aims to examine the association between board gender diversity (BGD) and workplace diversity and the relative importance of various board and firm characteristics in predicting diversity.
Design/methodology/approach
With a novel machine learning (ML) approach, this study models the association between three workplace diversity variables and BGD using a social media data set of approximately 250,000 employee reviews. Using the tools of explainable artificial intelligence, the authors interpret the results of the ML model.
Findings
The results show that BGD has a strong positive association with the gender equality and inclusiveness dimensions of corporate diversity culture. However, BGD is found to have a weak negative association with age diversity in a company. Furthermore, the authors find that workplace diversity is an important predictor of firm value, indicating a possible channel on how BGD affects firm performance.
Originality/value
The effects of BGD on workplace diversity below management levels are mainly omitted in the current corporate governance literature. Furthermore, existing research has not considered different dimensions of this diversity and has mainly focused on its gender aspects. In this study, the authors address this research problem and examine how BGD affects different dimensions of diversity at the overall company level. This study reveals important associations and identifies key variables that should be included as a part of theoretical causal models in future research.
Details
Keywords
Amirhessam Tahmassebi, Mehrtash Motamedi, Amir H. Alavi and Amir H. Gandomi
Engineering design and operational decisions depend largely on deep understanding of applications that requires assumptions for simplification of the problems in order to find…
Abstract
Purpose
Engineering design and operational decisions depend largely on deep understanding of applications that requires assumptions for simplification of the problems in order to find proper solutions. Cutting-edge machine learning algorithms can be used as one of the emerging tools to simplify this process. In this paper, we propose a novel scalable and interpretable machine learning framework to automate this process and fill the current gap.
Design/methodology/approach
The essential principles of the proposed pipeline are mainly (1) scalability, (2) interpretibility and (3) robust probabilistic performance across engineering problems. The lack of interpretibility of complex machine learning models prevents their use in various problems including engineering computation assessments. Many consumers of machine learning models would not trust the results if they cannot understand the method. Thus, the SHapley Additive exPlanations (SHAP) approach is employed to interpret the developed machine learning models.
Findings
The proposed framework can be applied to a variety of engineering problems including seismic damage assessment of structures. The performance of the proposed framework is investigated using two case studies of failure identification in reinforcement concrete (RC) columns and shear walls. In addition, the reproducibility, reliability and generalizability of the results were validated and the results of the framework were compared to the benchmark studies. The results of the proposed framework outperformed the benchmark results with high statistical significance.
Originality/value
Although, the current study reveals that the geometric input features and reinforcement indices are the most important variables in failure modes detection, better model can be achieved with employing more robust strategies to establish proper database to decrease the errors in some of the failure modes identification.
Details
Keywords
Yoonjae Hwang, Sungwon Jung and Eun Joo Park
Initiator crimes, also known as near-repeat crimes, occur in places with known risk factors and vulnerabilities based on prior crime-related experiences or information…
Abstract
Purpose
Initiator crimes, also known as near-repeat crimes, occur in places with known risk factors and vulnerabilities based on prior crime-related experiences or information. Consequently, the environment in which initiator crimes occur might be different from more general crime environments. This study aimed to analyse the differences between the environments of initiator crimes and general crimes, confirming the need for predicting initiator crimes.
Design/methodology/approach
We compared predictive models using data corresponding to initiator crimes and all residential burglaries without considering repetitive crime patterns as dependent variables. Using random forest and gradient boosting, representative ensemble models and predictive models were compared utilising various environmental factor data. Subsequently, we evaluated the performance of each predictive model to derive feature importance and partial dependence based on a highly predictive model.
Findings
By analysing environmental factors affecting overall residential burglary and initiator crimes, we observed notable differences in high-importance variables. Further analysis of the partial dependence of total residential burglary and initiator crimes based on these variables revealed distinct impacts on each crime. Moreover, initiator crimes took place in environments consistent with well-known theories in the field of environmental criminology.
Originality/value
Our findings indicate the possibility that results that do not appear through the existing theft crime prediction method will be identified in the initiator crime prediction model. Emphasising the importance of investigating the environments in which initiator crimes occur, this study underscores the potential of artificial intelligence (AI)-based approaches in creating a safe urban environment. By effectively preventing potential crimes, AI-driven prediction of initiator crimes can significantly contribute to enhancing urban safety.
Details