Search results

1 – 10 of 629
Article
Publication date: 17 March 2023

Stewart Jones

Abstract

Purpose

This study updates the literature review of Jones (1987) published in this journal. The study pays particular attention to two important themes that have shaped the field over the past 35 years: (1) the development of a range of innovative new statistical learning methods, particularly advanced machine learning methods such as stochastic gradient boosting, adaptive boosting, random forests and deep learning, and (2) the emergence of a wide variety of bankruptcy predictor variables extending beyond traditional financial ratios, including market-based variables, earnings management proxies, auditor going concern opinions (GCOs) and corporate governance attributes. Several directions for future research are discussed.

Design/methodology/approach

This study provides a systematic review of the corporate failure literature over the past 35 years with a particular focus on the emergence of new statistical learning methodologies and predictor variables. This synthesis of the literature evaluates the strengths and limitations of different modelling approaches under different circumstances and provides an overall evaluation of the relative contribution of alternative predictor variables. The study aims to provide a transparent, reproducible and interpretable review of the literature. The literature review also takes a theme-centric rather than author-centric approach and focuses on structured themes that have dominated the literature since 1987.

Findings

There are several major findings of this study. First, advanced machine learning methods appear to have the most promise for future firm failure research. Not only do these methods predict significantly better than conventional models, but they also possess many appealing statistical properties. Second, there is now a much wider range of variables being used to model and predict firm failure. However, the literature needs to be interpreted with some caution given the many mixed findings. Finally, there are still a number of unresolved methodological issues arising from the Jones (1987) study that require research attention.

Originality/value

The study explains the connections and derivations between a wide range of firm failure models, from simpler linear models to advanced machine learning methods such as gradient boosting, random forests, adaptive boosting and deep learning. The paper highlights the most promising models for future research, particularly in terms of their predictive power, underlying statistical properties and issues of practical implementation. The study also draws together an extensive literature on alternative predictor variables and provides insights into the role and behaviour of alternative predictor variables in firm failure research.

Details

Journal of Accounting Literature, vol. 45 no. 2
Type: Research Article
ISSN: 0737-4607

Open Access
Article
Publication date: 3 February 2020

Wen Li, Wei Wang and Wenjun Huo

Abstract

Purpose

Inspired by the basic idea of gradient boosting, this study aims to design a novel multivariate regression ensemble algorithm RegBoost by using multivariate linear regression as a weak predictor.

Design/methodology/approach

To achieve nonlinearity after combining all linear regression predictors, the training data is divided into two branches according to the prediction results using the current weak predictor. The linear regression modeling is recursively executed in two branches. In the test phase, test data is distributed to a specific branch to continue with the next weak predictor. The final result is the sum of all weak predictors across the entire path.
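The branching procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single input feature for brevity, splits each node at the median of the current predictions, and fits each new weak predictor to the residuals left by the previous one. The split rule, the residual fitting and all names (`fit_line`, `build`, `predict`, `min_size`) are hypothetical, since the abstract does not specify them.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (one feature, for brevity)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def build(xs, targets, depth, min_size=4):
    """Recursively fit weak linear predictors; returns a nested dict or None."""
    if depth == 0 or len(xs) < min_size:
        return None
    a, b = fit_line(xs, targets)
    preds = [a + b * x for x in xs]
    threshold = median(preds)  # assumed split rule: median prediction
    # Each branch continues on what the current predictor left unexplained.
    residuals = [t - p for t, p in zip(targets, preds)]
    low = [(x, r) for x, r, p in zip(xs, residuals, preds) if p <= threshold]
    high = [(x, r) for x, r, p in zip(xs, residuals, preds) if p > threshold]
    node = {"a": a, "b": b, "threshold": threshold, "low": None, "high": None}
    if low and high:
        node["low"] = build([x for x, _ in low], [r for _, r in low],
                            depth - 1, min_size)
        node["high"] = build([x for x, _ in high], [r for _, r in high],
                             depth - 1, min_size)
    return node


def predict(node, x):
    """Route x down one path and sum the weak predictions along it."""
    total = 0.0
    while node is not None:
        p = node["a"] + node["b"] * x
        total += p
        node = node["low"] if p <= node["threshold"] else node["high"]
    return total
```

With `depth=1` the sketch reduces to a single linear regression; larger depths yield a piecewise-linear fit, which is one way the combination of linear predictors can become nonlinear.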

Findings

Through comparison experiments, it is found that the algorithm RegBoost can achieve similar performance to the gradient boosted decision tree (GBDT). The algorithm is very effective compared to linear regression.

Originality/value

This paper attempts to design a novel regression algorithm RegBoost with reference to GBDT. To the best of the authors' knowledge, RegBoost is the first algorithm to use linear regression as a weak predictor and combine it with gradient boosting to build an ensemble algorithm.

Details

International Journal of Crowd Science, vol. 4 no. 1
Type: Research Article
ISSN: 2398-7294

Article
Publication date: 30 December 2020

Suraj Kulkarni, Suhas Suresh Ambekar and Manoj Hudnurkar

Abstract

Purpose

Increasing health-care costs are a major concern, especially in the USA. The purpose of this paper is to predict the hospital charges of a patient before being admitted. This will help a patient who is being admitted “electively” to plan his/her finances. Also, this can be used as a tool by payers (insurance companies) to better forecast the amount that a patient might claim.

Design/methodology/approach

This research method involves secondary data collected from New York state’s patient discharges of 2017. A stratified sampling technique is used to sample the data from the population, and feature engineering is done on categorical variables. Different regression techniques are used to predict the target value “total charges.”

Findings

Total cost varies linearly with the length of stay. Among all the machine learning algorithms considered, namely, random forest, stochastic gradient descent (SGD) regressor, K nearest neighbors regressor, extreme gradient boosting regressor and gradient boosting regressor, the random forest regressor had the best accuracy, with an R² value of 0.7753. “Age group” was the most important predictor among all the features.

Practical implications

This model can be helpful for patients who want to compare the cost at different hospitals and can plan their finances accordingly in case of “elective” admission. Insurance companies can predict how much a patient with a particular medical condition might claim by getting admitted to the hospital.

Originality/value

Health care can be a costly affair if not planned properly. This research gives patients and insurance companies a better prediction of the total cost that they might incur.

Details

International Journal of Innovation Science, vol. 13 no. 1
Type: Research Article
ISSN: 1757-2223

Article
Publication date: 3 April 2019

Michael Mayer, Steven C. Bourassa, Martin Hoesli and Donato Scognamiglio

Abstract

Purpose

The purpose of this paper is to investigate the accuracy and volatility of different methods for estimating and updating hedonic valuation models.

Design/methodology/approach

The authors apply six estimation methods (linear least squares, robust regression, mixed-effects regression, random forests, gradient boosting and neural networks) and two updating methods (moving and extending windows). They use a large and rich data set consisting of over 123,000 single-family houses sold in Switzerland between 2005 and 2017.
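The difference between the two updating methods can be illustrated with a small sketch. It assumes that data are indexed by consecutive periods (for example quarters) and that a model is re-estimated before each new period; the function names and the `min_train` and `width` parameters are hypothetical and not taken from the paper.

```python
def extending_windows(n_periods, min_train):
    """Yield (train_periods, test_period); the training set grows from period 0."""
    for t in range(min_train, n_periods):
        yield list(range(0, t)), t


def moving_windows(n_periods, width):
    """Yield (train_periods, test_period); only the last `width` periods are used."""
    for t in range(width, n_periods):
        yield list(range(t - width, t)), t
```

With five periods and two initial training periods, the extending scheme trains on `[0, 1]`, `[0, 1, 2]` and `[0, 1, 2, 3]`, whereas the moving scheme trains on `[0, 1]`, `[1, 2]` and `[2, 3]`, so older observations drop out as newer ones arrive.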

Findings

The gradient boosting method yields the greatest accuracy, while the robust method provides the least volatile predictions. There is a clear trade-off across methods depending on whether the goal is to improve accuracy or avoid volatility. The choice between moving and extending windows has only a modest effect on the results.

Originality/value

This paper compares a range of linear and machine learning techniques in the context of moving or extending window scenarios that are used in practice but which have not been considered in prior research. The techniques include robust regression, which has not previously been used in this context. The data updating allows for analysis of the volatility in addition to the accuracy of predictions. The results should prove useful in improving hedonic models used by property tax assessors, mortgage underwriters, valuation firms and regulatory authorities.

Details

Journal of European Real Estate Research, vol. 12 no. 1
Type: Research Article
ISSN: 1753-9269

Open Access
Article
Publication date: 25 January 2023

Mikko Ranta and Mika Ylinen

Abstract

Purpose

This study aims to examine the association between board gender diversity (BGD) and workplace diversity and the relative importance of various board and firm characteristics in predicting diversity.

Design/methodology/approach

With a novel machine learning (ML) approach, this study models the association between three workplace diversity variables and BGD using a social media data set of approximately 250,000 employee reviews. Using the tools of explainable artificial intelligence, the authors interpret the results of the ML model.

Findings

The results show that BGD has a strong positive association with the gender equality and inclusiveness dimensions of corporate diversity culture. However, BGD is found to have a weak negative association with age diversity in a company. Furthermore, the authors find that workplace diversity is an important predictor of firm value, indicating a possible channel through which BGD affects firm performance.

Originality/value

The effects of BGD on workplace diversity below management levels are mainly omitted in the current corporate governance literature. Furthermore, existing research has not considered different dimensions of this diversity and has mainly focused on its gender aspects. In this study, the authors address this research problem and examine how BGD affects different dimensions of diversity at the overall company level. This study reveals important associations and identifies key variables that should be included as a part of theoretical causal models in future research.

Details

Corporate Governance: The International Journal of Business in Society, vol. 23 no. 5
Type: Research Article
ISSN: 1472-0701

Article
Publication date: 7 July 2021

Amirhessam Tahmassebi, Mehrtash Motamedi, Amir H. Alavi and Amir H. Gandomi

Abstract

Purpose

Engineering design and operational decisions depend largely on a deep understanding of applications, which requires simplifying assumptions in order to find proper solutions. Cutting-edge machine learning algorithms can be used as one of the emerging tools to simplify this process. In this paper, we propose a novel scalable and interpretable machine learning framework to automate this process and fill the current gap.

Design/methodology/approach

The essential principles of the proposed pipeline are mainly (1) scalability, (2) interpretability and (3) robust probabilistic performance across engineering problems. The lack of interpretability of complex machine learning models prevents their use in various problems including engineering computation assessments. Many consumers of machine learning models would not trust the results if they cannot understand the method. Thus, the SHapley Additive exPlanations (SHAP) approach is employed to interpret the developed machine learning models.

Findings

The proposed framework can be applied to a variety of engineering problems including seismic damage assessment of structures. The performance of the proposed framework is investigated using two case studies of failure identification in reinforced concrete (RC) columns and shear walls. In addition, the reproducibility, reliability and generalizability of the results were validated and the results of the framework were compared to the benchmark studies. The results of the proposed framework outperformed the benchmark results with high statistical significance.

Originality/value

Although the current study reveals that the geometric input features and reinforcement indices are the most important variables in failure mode detection, a better model can be achieved by employing more robust strategies to establish a proper database and thereby decrease the errors in the identification of some failure modes.

Details

Engineering Computations, vol. 39 no. 2
Type: Research Article
ISSN: 0264-4401

Article
Publication date: 9 November 2022

Meryem Uluskan and Merve Gizem Karşı

Abstract

Purpose

This study aims to emphasize utilization of Predictive Six Sigma to achieve process improvements based on machine learning (ML) techniques embedded in define, measure, analyze, improve, control (DMAIC). With this aim, this study presents selection and utilization of ML techniques, including multiple linear regression (MLR), artificial neural network (ANN), random forests (RF), gradient boosting machines (GBM) and k-nearest neighbors (k-NN) in the analyze and improve phases of Six Sigma DMAIC.

Design/methodology/approach

A data set containing 320 observations with nine input variables and one output variable is used. To achieve the objective, which was to decrease the number of fabric defects, five ML techniques were compared in terms of prediction performance and the best tools were selected. Next, the most important causes of defects were determined via these tools. Finally, parameter optimization was conducted to minimize the number of defects.

Findings

Among five ML tools, ANN, GBM and RF are found to be the best predictors. Out of nine potential causes, “machine speed” and “fabric width” are determined as the most important variables by using these tools. Then, optimum values for “machine speed” and “fabric width” for fabric defect minimization are determined both via regression response optimizer and ANN surface optimization. Ultimately, average defect number was decreased from 13/roll to 3/roll, which is a considerable decrease attained through utilization of ML techniques in Six Sigma.

Originality/value

Addressing an important gap in Six Sigma literature, in this study, certain ML techniques (i.e. MLR, ANN, RF, GBM and k-NN) are compared and the ones possessing best performances are used in the analyze and improve phases of Six Sigma DMAIC.

Article
Publication date: 4 October 2019

Rahul Priyadarshi, Akash Panigrahi, Srikanta Routroy and Girish Kant Garg

Abstract

Purpose

The purpose of this study is to select the appropriate forecasting model at the retail stage for selected vegetables on the basis of performance analysis.

Design/methodology/approach

Various forecasting models such as the Box–Jenkins-based auto-regressive integrated moving average model and machine learning-based algorithms such as long short-term memory (LSTM) networks, support vector regression (SVR), random forest regression, gradient boosting regression (GBR) and extreme GBR (XGBoost/XGBR) were proposed and applied (i.e. modeling, training, testing and predicting) at the retail stage for selected vegetables to forecast demand. The performance analysis (i.e. forecasting error analysis) was carried out to select the appropriate forecasting model at the retail stage for selected vegetables.

Findings

From the results obtained for a case environment, it was observed that the machine learning algorithms LSTM and SVR produced better results than the other demand forecasting models.

Research limitations/implications

The results obtained from the case environment cannot be generalized. However, the approach may be used to forecast different agricultural produce at the retail stage, capturing the respective demand environment.

Practical implications

The implementation of LSTM and SVR for the case situation at the retail stage will reduce the forecast error, daily retail inventory and fresh produce wastage and will increase the daily revenue.

Originality/value

The demand forecasting model selection for agricultural produce at the retail stage on the basis of performance analysis is a unique study where both traditional and non-traditional models were analyzed and compared.

Article
Publication date: 15 January 2018

Péter Martinek and Oliver Krammer

Abstract

Purpose

This paper aims to present a robust prediction method for estimating the quality of electronic products assembled with pin-in-paste soldering technology. A specific board quality factor was also defined which describes the expected yield of the board assembly.

Design/methodology/approach

Experiments were performed to obtain the required input data for developing a prediction method based on decision tree learning techniques. A Type 4 lead-free solder paste (particle size 20–38 µm) was deposited by stencil printing with different printing speeds (from 20 mm/s to 70 mm/s) into the through-holes (0.8 mm, 1 mm, 1.1 mm, 1.4 mm) of an FR4 board. Hole-filling was investigated with X-ray analyses. Three test cases were evaluated.

Findings

The optimal parameters of the algorithm were determined to be a subsample of 0.5, a learning rate of 0.001, a maximum tree depth of 6 and 10,000 boosting iterations. After optimisation, the mean absolute error, root mean square error and mean absolute percentage error were, on average, 0.024, 0.03 and 3.5, respectively, for the prediction of the hole-filling value based on the printing speed and hole diameter. Our method is able to predict the hole-filling in pin-in-paste technology for different through-hole diameters.
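For readers unfamiliar with the three error measures reported above, the following sketch shows how they are conventionally computed. It uses the standard textbook definitions; whether the authors computed them in exactly this way is an assumption, and the function names are hypothetical.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

For example, with actual values `[2.0, 4.0]` and predictions `[1.0, 5.0]`, both `mae` and `rmse` equal 1.0, while `mape` is 37.5 because the same absolute error weighs more heavily on the smaller actual value.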

Originality/value

No research works are available in current literature regarding machine learning techniques for pin-in-paste technology. Therefore, we decided to develop a method using decision tree learning techniques for supporting the design of the stencil printing process for through-hole components and pin-in-paste technology. The first pass yield of the assembly can be enhanced, and the reflow soldering failures of pin-in-paste technology can be significantly reduced.

Details

Soldering & Surface Mount Technology, vol. 30 no. 3
Type: Research Article
ISSN: 0954-0911

Article
Publication date: 16 December 2022

Fatemeh Mozaffari, Marzieh Rahimi, Hamidreza Yazdani and Babak Sohrabi

Abstract

Purpose

This research intends to develop a model for predicting employees at high risk of attrition and to identify the most important factors affecting them.

Design/methodology/approach

In this study, using the triangulation technique of a mixed research method, the employee attrition problem is investigated by identifying the factors affecting it. For this purpose, data from the human resources department of a pharmaceutical company in Iran are used. To achieve the intended goal, advanced data mining algorithms and interviews with human resource managers are applied.

Findings

A model for predicting employees at high risk of attrition is presented based on the gradient boosting machine algorithm, with 89% accuracy. The use of the mixed research approach shows that combining qualitative and quantitative methods can be more effective in identifying the factors affecting employee churn or loss of staff. The results also reflect a new situation arising out of the COVID-19 pandemic, in which remote working scenarios have an impact on employee attrition. Finally, human resource policies are presented based on variables related to each of the identified factors.

Originality/value

The novel contributions of this study include real data related to a leading pharmaceutical company as well as a combination of quantitative and qualitative methods. The hybrid approach can identify the reasons for attrition and, consequently, retention policies, benefiting from the advantages of both approaches. Data mining can be useful to identify the factors that are usually not mentioned in termination interviews, such as direct managers. On the other hand, the results obtained from termination interviews can also include features that the authors cannot identify through data mining, which are specifically related to the characteristics of the pharmaceutical industry, such as building a more professional career path. From a practical perspective, since this company specializes in pharmaceutical marketing in a new way and is primarily composed of graduates, it is important to note that the churn of specialized people disperses organizational and technological know-how. On the other hand, the pharmacist community in Iran is small, and their attrition might adversely affect not only the reputation of an organization but the employer's brand as well. So, this research would help other similar firms in retaining their valuable human capital.

Details

Benchmarking: An International Journal, vol. 30 no. 10
Type: Research Article
ISSN: 1463-5771
