Search results

1 – 10 of 797
Article
Publication date: 17 March 2023

Stewart Jones

Abstract

Purpose

This study updates the literature review of Jones (1987) published in this journal. The study pays particular attention to two important themes that have shaped the field over the past 35 years: (1) the development of a range of innovative new statistical learning methods, particularly advanced machine learning methods such as stochastic gradient boosting, adaptive boosting, random forests and deep learning, and (2) the emergence of a wide variety of bankruptcy predictor variables extending beyond traditional financial ratios, including market-based variables, earnings management proxies, auditor going concern opinions (GCOs) and corporate governance attributes. Several directions for future research are discussed.

Design/methodology/approach

This study provides a systematic review of the corporate failure literature over the past 35 years with a particular focus on the emergence of new statistical learning methodologies and predictor variables. This synthesis of the literature evaluates the strengths and limitations of different modelling approaches under different circumstances and provides an overall evaluation of the relative contribution of alternative predictor variables. The study aims to provide a transparent, reproducible and interpretable review of the literature. The literature review also takes a theme-centric rather than author-centric approach and focuses on structured themes that have dominated the literature since 1987.

Findings

There are several major findings of this study. First, advanced machine learning methods appear to have the most promise for future firm failure research. Not only do these methods predict significantly better than conventional models, but they also possess many appealing statistical properties. Second, there is now a much wider range of variables being used to model and predict firm failure. However, the literature needs to be interpreted with some caution given the many mixed findings. Finally, there are still a number of unresolved methodological issues arising from the Jones (1987) study that require research attention.

Originality/value

The study explains the connections and derivations between a wide range of firm failure models, from simpler linear models to advanced machine learning methods such as gradient boosting, random forests, adaptive boosting and deep learning. The paper highlights the most promising models for future research, particularly in terms of their predictive power, underlying statistical properties and issues of practical implementation. The study also draws together an extensive literature on alternative predictor variables and provides insights into the role and behaviour of alternative predictor variables in firm failure research.

Details

Journal of Accounting Literature, vol. 45 no. 2
Type: Research Article
ISSN: 0737-4607

Open Access
Article
Publication date: 3 February 2020

Wen Li, Wei Wang and Wenjun Huo

Abstract

Purpose

Inspired by the basic idea of gradient boosting, this study aims to design a novel multivariate regression ensemble algorithm RegBoost by using multivariate linear regression as a weak predictor.

Design/methodology/approach

To achieve nonlinearity after combining all linear regression predictors, the training data is divided into two branches according to the prediction results using the current weak predictor. The linear regression modeling is recursively executed in two branches. In the test phase, test data is distributed to a specific branch to continue with the next weak predictor. The final result is the sum of all weak predictors across the entire path.
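
The branching scheme described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the two branches are chosen or what each later predictor is fitted to, so the sketch assumes a split at the median of the current predictor's training outputs and, following the gradient boosting idea, fits each subsequent linear regression to the residuals. The names `fit_node`, `predict_node`, `depth` and `min_samples` are hypothetical.

```python
import numpy as np


def _fit_linear(X, y):
    # Ordinary least squares with an intercept column appended to X.
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def _predict_linear(coef, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ coef


def fit_node(X, y, depth, min_samples=10):
    """Fit one linear weak predictor, then recurse into two branches."""
    coef = _fit_linear(X, y)
    pred = _predict_linear(coef, X)
    node = {"coef": coef, "threshold": None, "low": None, "high": None}
    if depth == 0 or len(X) < 2 * min_samples:
        return node
    # Assumption: branch on whether the prediction exceeds its training median.
    threshold = float(np.median(pred))
    high = pred > threshold
    if high.sum() < min_samples or (~high).sum() < min_samples:
        return node
    # Assumption: later predictors model what the current one missed.
    residual = y - pred
    node["threshold"] = threshold
    node["low"] = fit_node(X[~high], residual[~high], depth - 1, min_samples)
    node["high"] = fit_node(X[high], residual[high], depth - 1, min_samples)
    return node


def predict_node(node, X):
    """Route each row down one path and sum the weak predictors on it."""
    pred = _predict_linear(node["coef"], X)
    if node["threshold"] is None:
        return pred
    high = pred > node["threshold"]
    out = pred.copy()
    if high.any():
        out[high] += predict_node(node["high"], X[high])
    if (~high).any():
        out[~high] += predict_node(node["low"], X[~high])
    return out
```

With `depth=0` the sketch reduces to plain linear regression, which gives a simple baseline for checking whether the extra branches help on a given data set.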

Findings

Through comparison experiments, it is found that the algorithm RegBoost can achieve similar performance to the gradient boosted decision tree (GBDT). The algorithm is very effective compared to linear regression.

Originality/value

This paper attempts to design a novel regression algorithm, RegBoost, with reference to GBDT. To the best of the authors' knowledge, RegBoost is the first to use linear regression as a weak predictor and combine it with gradient boosting to build an ensemble algorithm.

Details

International Journal of Crowd Science, vol. 4 no. 1
Type: Research Article
ISSN: 2398-7294

Article
Publication date: 30 December 2020

Suraj Kulkarni, Suhas Suresh Ambekar and Manoj Hudnurkar

Abstract

Purpose

Increasing health-care costs are a major concern, especially in the USA. The purpose of this paper is to predict the hospital charges of a patient before being admitted. This will help a patient who is being admitted electively to plan his or her finances. It can also be used as a tool by payers (insurance companies) to better forecast the amount that a patient might claim.

Design/methodology/approach

This research method involves secondary data collected from New York state’s patient discharges of 2017. A stratified sampling technique is used to sample the data from the population, and feature engineering is done on categorical variables. Different regression techniques are used to predict the target value “total charges.”

Findings

Total cost varies linearly with the length of stay. Among all the machine learning algorithms considered, namely, random forest, stochastic gradient descent (SGD) regressor, K nearest neighbors regressor, extreme gradient boosting regressor and gradient boosting regressor, the random forest regressor had the best accuracy, with an R² value of 0.7753. “Age group” was the most important predictor among all the features.

Practical implications

This model can be helpful for patients who want to compare the cost at different hospitals and can plan their finances accordingly in case of “elective” admission. Insurance companies can predict how much a patient with a particular medical condition might claim by getting admitted to the hospital.

Originality/value

Health care can be a costly affair if not planned properly. This research gives patients and insurance companies a better prediction of the total cost that they might incur.

Details

International Journal of Innovation Science, vol. 13 no. 1
Type: Research Article
ISSN: 1757-2223

Article
Publication date: 5 December 2023

Valeriia Baklanova, Aleksei Kurkin and Tamara Teplova

Abstract

Purpose

The primary objective of this research is to provide a precise interpretation of the constructed machine learning model and produce definitive summaries that can evaluate the influence of investor sentiment on the overall sales of non-fungible token (NFT) assets. To achieve this objective, the NFT hype index was constructed, and several explainable AI (XAI) approaches were employed to interpret black-box models and assess the magnitude and direction of the impact of the features used.

Design/methodology/approach

The research paper involved the construction of a sentiment index termed the NFT hype index, which aims to measure the influence of market actors within the NFT industry. This index was created by analyzing written content posted by 62 high-profile individuals and opinion leaders on the social media platform Twitter. The authors collected posts from the Twitter accounts, which were afterward classified by tonality with the help of the natural language processing model VADER. Then machine learning methods and XAI approaches (feature importance, permutation importance and SHAP) were applied to explain the obtained results.
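
Of the XAI approaches listed, permutation importance is the simplest to illustrate. The sketch below is a generic, model-agnostic version and not the authors' code: it assumes a fitted model exposed as a `predict` callable and scores each feature by how much the mean squared error rises when that feature's column is shuffled. The function name, the `n_repeats` parameter and the choice of mean squared error are assumptions made for illustration.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((y - predict(X)) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            # Shuffling breaks the link between feature j and the target
            # while keeping the feature's marginal distribution intact.
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            error = np.mean((y - predict(X_shuffled)) ** 2)
            increases.append(error - baseline)
        scores[j] = np.mean(increases)
    return scores
```

A feature the model ignores scores zero, while a feature the model relies on scores positively. The score gives magnitude only, which is why it is usually paired with a method such as SHAP when the direction of an effect matters.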

Findings

The built index was subjected to rigorous analysis using the gradient boosting regressor model and explainable AI techniques, which confirmed its significant explanatory power. Remarkably, the NFT hype index exhibited a higher degree of predictive accuracy compared to the well-known sentiment indices.

Practical implications

The NFT hype index, constructed from Twitter textual data, functions as an innovative, sentiment-based indicator for investment decision-making in the NFT market. It offers investors unique insights into the market sentiment that can be used alongside conventional financial analysis techniques to enhance risk management, portfolio optimization and overall investment outcomes within the rapidly evolving NFT ecosystem. Thus, the index plays a crucial role in facilitating well-informed, data-driven investment decisions and ensuring a competitive edge in the digital assets market.

Originality/value

The authors developed a novel index of investor interest for NFT assets (NFT hype index) based on text messages posted by market influencers and compared it to conventional sentiment indices in terms of their explanatory power. With the application of explainable AI, it was shown that sentiment indices may perform as significant predictors for NFT sales and that the NFT hype index works best among all sentiment indices considered.

Details

China Finance Review International, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2044-1398

Article
Publication date: 14 February 2023

Sapna Jarial and Jayant Verma

Abstract

Purpose

This study aimed to understand the agri-entrepreneurial traits of undergraduate university students using machine learning (ML) algorithms.

Design/methodology/approach

This study used a conceptual framework of individual-level determinants of entrepreneurship and ML. The Google Survey instrument was prepared on a 5-point scale and administered to 656 students in different sections of the same class during regular virtual classrooms in 2021. The datasets were analyzed and compared using ML.

Findings

Entrepreneurial traits existed among students before attending undergraduate entrepreneurship courses. Establishing strong partnerships (0.359), learning (0.347) and people-organizing ability (0.341) were promising correlated entrepreneurial traits. Female students exhibited fewer entrepreneurial traits than male students. The random forest model exhibited 60% accuracy in trait prediction against gradient boosting (58.4%), linear regression (56.8%), ridge (56.7%) and lasso regression (56.0%). Thus, the ML model appeared to be unsuitable to predict entrepreneurial traits. Quality data are important for accurate trait predictions.

Research limitations/implications

Further studies can validate K-nearest neighbors (KNN) and support vector machine (SVM) models against random forest to support the statement that the ML model cannot be used for entrepreneurial trait prediction.

Originality/value

This research is unique because ML models, such as random forest, gradient boosting and lasso regression, are used for entrepreneurial trait prediction by agricultural domain students.

Details

Journal of Agribusiness in Developing and Emerging Economies, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2044-0839

Article
Publication date: 3 April 2019

Michael Mayer, Steven C. Bourassa, Martin Hoesli and Donato Scognamiglio

Abstract

Purpose

The purpose of this paper is to investigate the accuracy and volatility of different methods for estimating and updating hedonic valuation models.

Design/methodology/approach

The authors apply six estimation methods (linear least squares, robust regression, mixed-effects regression, random forests, gradient boosting and neural networks) and two updating methods (moving and extending windows). They use a large and rich data set consisting of over 123,000 single-family houses sold in Switzerland between 2005 and 2017.
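
The difference between the two updating methods can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the abstract does not give the window length or the updating frequency, so the sketch uses whole periods and a hypothetical `train_size` parameter. A moving window keeps a fixed number of the most recent periods, while an extending window always starts from the first period.

```python
def window_splits(periods, train_size, scheme="moving"):
    """Yield (training_periods, test_period) pairs for model updating."""
    if scheme not in ("moving", "extending"):
        raise ValueError("scheme must be 'moving' or 'extending'")
    for i in range(train_size, len(periods)):
        # Moving: drop the oldest period each time a new one is added.
        # Extending: keep every period seen so far.
        start = i - train_size if scheme == "moving" else 0
        yield periods[start:i], periods[i]
```

For example, with `periods = [2005, 2006, 2007, 2008, 2009]` and `train_size=3`, the moving scheme trains on 2006 to 2008 to predict 2009, whereas the extending scheme trains on 2005 to 2008.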

Findings

The gradient boosting method yields the greatest accuracy, while the robust method provides the least volatile predictions. There is a clear trade-off across methods depending on whether the goal is to improve accuracy or avoid volatility. The choice between moving and extending windows has only a modest effect on the results.

Originality/value

This paper compares a range of linear and machine learning techniques in the context of moving or extending window scenarios that are used in practice but which have not been considered in prior research. The techniques include robust regression, which has not previously been used in this context. The data updating allows for analysis of the volatility in addition to the accuracy of predictions. The results should prove useful in improving hedonic models used by property tax assessors, mortgage underwriters, valuation firms and regulatory authorities.

Details

Journal of European Real Estate Research, vol. 12 no. 1
Type: Research Article
ISSN: 1753-9269

Article
Publication date: 30 March 2023

Nader Asadi Ejgerdi and Mehrdad Kazerooni

Abstract

Purpose

With the growth of organizations and businesses, customer acquisition and retention processes have become more complex in the long run. That is why customer lifetime value (CLV) has become crucial to sales managers. Predicting the CLV is a strategic weapon and competitive advantage in increasing profitability and identifying customers with greater profitability, and it is one of the essential key performance indicators (KPIs) used in customer segmentation. Thus, this paper proposes a stacked ensemble learning method, a combination of multiple machine learning methods, for CLV prediction.

Design/methodology/approach

In order to utilize customers’ behavioral features for predicting the value of each customer’s CLV, the data of a textile sales company was used as a case study. The proposed stacked ensemble learning method is compared with several popular predictive methods named deep neural networks, bagging support vector regression, light gradient boosting machine, random forest and extreme gradient boosting.
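
The general idea of stacking can be sketched as below. This is a minimal illustration, not the method proposed in the paper: the abstract does not name the base learners or the meta-learner, so a linear regression and a k-nearest-neighbours average stand in as base learners, and a linear regression trained on out-of-fold predictions serves as the meta-learner. All function names and parameters here are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    # Least-squares linear model with intercept; returns a predict function.
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ coef


def fit_knn(X, y, k=5):
    # k-nearest-neighbours regressor; returns a predict function.
    def predict(Z):
        dist = ((Z[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(dist, axis=1)[:, :k]
        return y[nearest].mean(axis=1)
    return predict


def fit_stack(X, y, base_fitters, n_folds=5, seed=0):
    """Train base learners, then a meta-learner on out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    oof = np.zeros((len(X), len(base_fitters)))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(X)), test_idx)
        for j, fit in enumerate(base_fitters):
            # Each row is predicted by a model that never saw it,
            # so the meta-learner is not trained on leaked targets.
            oof[test_idx, j] = fit(X[train_idx], y[train_idx])(X[test_idx])
    meta = fit_linear(oof, y)
    bases = [fit(X, y) for fit in base_fitters]  # refit on all data
    return lambda Z: meta(np.column_stack([b(Z) for b in bases]))
```

Calling `fit_stack(X, y, [fit_linear, fit_knn])` returns a single predict function. The out-of-fold step is the design choice that distinguishes stacking from simply averaging models fitted on the same data.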

Findings

Empirical results indicate that the stacked ensemble learning method outperformed the other methods in terms of normalized root mean squared error, normalized mean absolute error and coefficient of determination, at 0.248, 0.364 and 0.848, respectively. In addition, the prediction capability of the proposed method improved significantly after optimizing its hyperparameters.

Originality/value

This paper proposes a stacked ensemble learning method as a new method for accurate CLV prediction. The results and comparisons support the robustness and efficiency of the proposed method for CLV prediction.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Open Access
Article
Publication date: 25 January 2023

Mikko Ranta and Mika Ylinen

Abstract

Purpose

This study aims to examine the association between board gender diversity (BGD) and workplace diversity and the relative importance of various board and firm characteristics in predicting diversity.

Design/methodology/approach

With a novel machine learning (ML) approach, this study models the association between three workplace diversity variables and BGD using a social media data set of approximately 250,000 employee reviews. Using the tools of explainable artificial intelligence, the authors interpret the results of the ML model.

Findings

The results show that BGD has a strong positive association with the gender equality and inclusiveness dimensions of corporate diversity culture. However, BGD is found to have a weak negative association with age diversity in a company. Furthermore, the authors find that workplace diversity is an important predictor of firm value, indicating a possible channel through which BGD affects firm performance.

Originality/value

The effects of BGD on workplace diversity below management levels are mainly omitted in the current corporate governance literature. Furthermore, existing research has not considered different dimensions of this diversity and has mainly focused on its gender aspects. In this study, the authors address this research problem and examine how BGD affects different dimensions of diversity at the overall company level. This study reveals important associations and identifies key variables that should be included as a part of theoretical causal models in future research.

Details

Corporate Governance: The International Journal of Business in Society, vol. 23 no. 5
Type: Research Article
ISSN: 1472-0701

Article
Publication date: 7 July 2021

Amirhessam Tahmassebi, Mehrtash Motamedi, Amir H. Alavi and Amir H. Gandomi

Abstract

Purpose

Engineering design and operational decisions depend largely on a deep understanding of applications, which requires simplifying assumptions in order to find proper solutions. Cutting-edge machine learning algorithms can be used as one of the emerging tools to simplify this process. In this paper, we propose a novel scalable and interpretable machine learning framework to automate this process and fill the current gap.

Design/methodology/approach

The essential principles of the proposed pipeline are mainly (1) scalability, (2) interpretability and (3) robust probabilistic performance across engineering problems. The lack of interpretability of complex machine learning models prevents their use in various problems including engineering computation assessments. Many consumers of machine learning models would not trust the results if they cannot understand the method. Thus, the SHapley Additive exPlanations (SHAP) approach is employed to interpret the developed machine learning models.

Findings

The proposed framework can be applied to a variety of engineering problems including seismic damage assessment of structures. The performance of the proposed framework is investigated using two case studies of failure identification in reinforced concrete (RC) columns and shear walls. In addition, the reproducibility, reliability and generalizability of the results were validated and the results of the framework were compared to the benchmark studies. The results of the proposed framework outperformed the benchmark results with high statistical significance.

Originality/value

Although the current study reveals that the geometric input features and reinforcement indices are the most important variables in failure mode detection, a better model can be achieved by employing more robust strategies to establish a proper database and so decrease the errors in identifying some of the failure modes.

Details

Engineering Computations, vol. 39 no. 2
Type: Research Article
ISSN: 0264-4401

Article
Publication date: 28 February 2024

Yoonjae Hwang, Sungwon Jung and Eun Joo Park

Abstract

Purpose

Initiator crimes, also known as near-repeat crimes, occur in places with known risk factors and vulnerabilities based on prior crime-related experiences or information. Consequently, the environment in which initiator crimes occur might be different from more general crime environments. This study aimed to analyse the differences between the environments of initiator crimes and general crimes, confirming the need for predicting initiator crimes.

Design/methodology/approach

We compared predictive models using, as dependent variables, data corresponding to initiator crimes and to all residential burglaries without considering repetitive crime patterns. The models were built with random forest and gradient boosting, two representative ensemble methods, using various environmental factor data. Subsequently, we evaluated the performance of each predictive model and derived feature importance and partial dependence from a highly predictive model.

Findings

By analysing environmental factors affecting overall residential burglary and initiator crimes, we observed notable differences in high-importance variables. Further analysis of the partial dependence of total residential burglary and initiator crimes based on these variables revealed distinct impacts on each crime. Moreover, initiator crimes took place in environments consistent with well-known theories in the field of environmental criminology.

Originality/value

Our findings indicate that the initiator crime prediction model may identify results that do not appear with the existing theft crime prediction method. Emphasising the importance of investigating the environments in which initiator crimes occur, this study underscores the potential of artificial intelligence (AI)-based approaches in creating a safe urban environment. By effectively preventing potential crimes, AI-driven prediction of initiator crimes can significantly contribute to enhancing urban safety.

Details

Archnet-IJAR: International Journal of Architectural Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2631-6862
