Predicting zombie firms after the COVID-19 pandemic using explainable artificial intelligence

Dongwook Seo (Department of Business and Finance Education, Kongju National University, Gongju, South Korea)
Hyeong Joon Kim (Dongguk Business School, Dongguk University, Seoul, South Korea)
Seongjae Mun (Global Business School, Soonchunhyang University, Asan, South Korea)

Journal of Derivatives and Quantitative Studies: 선물연구

ISSN: 1229-988X

Article publication date: 7 November 2024

Issue publication date: 22 November 2024

Abstract

This study examines various artificial intelligence (AI) models for predicting financially distressed firms with poor profitability (“Zombie firms”). In particular, we adopt the Explainable AI (“XAI”) approach to overcome a well-known limitation of previous AI models, the black-box problem, by utilizing the Local Interpretable Model-agnostic Explanations (LIME) and the Shapley Additive Explanations (SHAP). This XAI approach thus enables us to interpret the prediction results of the AI models. This study focuses on the Korean sample from 2019 to 2023, as the COVID-19 pandemic is expected to have increased the number of zombie firms. We find that the XGBoost model, which is based on a boosting technique, has the best predictive performance among several AI models, including the traditional ones (e.g. the logistic regression). In addition, by using the XAI approach, we provide visualized interpretations for the prediction results from the XGBoost model. The analysis further reveals that the return on sales and the selling, general and administrative costs are the most impactful variables for predicting zombie firms. Overall, this study not only shows that AI models improve the prediction of zombie firms relative to the traditional models but also increases the reliability of the prediction results by adopting the XAI approach, providing several implications for market participants, such as financial institutions and investors.

Citation

Seo, D., Kim, H.J. and Mun, S. (2024), "Predicting zombie firms after the COVID-19 pandemic using explainable artificial intelligence", Journal of Derivatives and Quantitative Studies: 선물연구, Vol. 32 No. 4, pp. 266-285. https://doi.org/10.1108/JDQS-08-2024-0035

Publisher

Emerald Publishing Limited

Copyright © 2024, Dongwook Seo, Hyeong Joon Kim and Seongjae Mun

License

Published in Journal of Derivatives and Quantitative Studies: 선물연구. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


1. Introduction

In recent years, prediction models based on artificial intelligence (AI) have been proposed at a rapid pace. They generally outperform classical statistical methods (e.g. linear regression analysis, logistic regression analysis) in terms of predictive power. In finance, they may therefore be better than traditional approaches at detecting a firm that is likely to face trouble soon, such as bankruptcy.

However, conventional AI models have a critical limitation, the black-box problem: we cannot observe how the prediction results are produced. This problem has prevented their active application in financial research, because corporate management, employees of financial institutions (e.g. banks), and policymakers typically have little insight into the AI-based prediction mechanism.

To overcome this limitation, we use a more recent approach in the AI field, eXplainable Artificial Intelligence (hereafter, XAI), and propose a prediction model for financially distressed firms with poor profitability (hereafter, zombie firms). To do this, we first estimate various AI models, along with traditional models, and compare their predictive power for zombie firms. We then focus on the XGBoost algorithm, as it has the best performance in our analysis. Next, and more importantly, to interpret the prediction results of the XGBoost model, we conduct a global analysis of the whole sample and a local analysis of specific data using two XAI methods, Local Interpretable Model-agnostic Explanations (LIME) and Shapley Additive Explanations (SHAP).

In the wake of the COVID-19 pandemic, the economic shock intensified, raising concerns about cascading failures across industries. Kim et al. (2020) argue that the real shocks to both supply and demand caused by COVID-19 could potentially spread to the financial sector, worsening corporate risks. Moreover, these raise concerns over unintended side effects — the prolonged survival of zombie firms — which may arise from large-scale fiscal expenditures intended to support firms’ short-term responses to abrupt economic shocks like the pandemic. The increase in the number of zombie firms in the capital market eventually leads to low economic growth and decreased market competition (Banerjee and Hofmann, 2018; Caballero et al., 2008). Therefore, predicting zombie firms within the economy is essential for devising effective policy measures to mitigate the economic impact of COVID-19.

In this study, we specifically define a zombie firm as “a firm with an interest coverage ratio (earnings before interest and taxes/interest expense) of less than 1 for three consecutive years” following the Bank of Korea (2015). Figure 1 presents the number of zombie firms and their share of all firms by year, showing a continuous upward trend in the number of zombie firms since 2019. In 2019, the number of zombie firms among listed companies in South Korea was 331 (16.4% of the total), subsequently increasing sharply to 411 (19.61% of the total) by 2023. This represents a rapid rise of approximately 24% in the number of zombie firms. In this context, Song et al. (2021a, b) emphasized the urgent need for policy interventions to address the significant surge in the proportion of zombie firms in the Korean financial market. Zombie firms’ reliance on external funding leads to inefficient resource allocation and weakens economic resilience.

Cho et al. (2020) illustrate how zombie firms are often prioritized for exit from financial markets, which subsequently leads to economic recovery based on lower levels of corporate distress during financial crisis. They highlight the necessity of proactively predicting zombie firms to support economic recovery following the COVID-19 pandemic. However, existing research on zombie firms in South Korea is limited to descriptive analyses, focusing on the current status of marginal firms based on debt, assets, employment, industry-specific distribution by size, or the economic impact of marginal firms’ productivity (Song et al., 2021a, b).

According to El Ghoul et al. (2020), the global average proportion of zombie firms increased sharply from 4.6% in 2005 to 8.6% in 2016, with a more rapid rise observed in developed compared to developing countries. Similarly, Banerjee and Hofmann (2022) report that the average proportion of marginal firms in 14 OECD member countries rose from 4% in the mid-1980s to 15% in 2017. This indicates that zombie firms have steadily increased globally during each financial crisis, underscoring the growing importance of research on this issue.

The main results show that, when predicting zombie firms in the Korean sample during 2019–2023, the firm’s net profit margin is the most important determinant in the XGBoost model, followed by the ratio of selling, general, and administrative costs to sales (SG&A ratio), gross profit margin, return on equity (ROE), log of revenue, operating income growth rate, net income growth rate, business age, total capital turnover, and debt-to-equity ratio. Note that XGBoost is a black-box model, so the importance of each variable cannot be observed directly; we overcome this limitation by using LIME and SHAP, which are XAI models. The local analysis of individual data using LIME and SHAP visually explains why the black-box model (XGBoost) predicts each data point as a zombie firm.

This study makes several academic contributions. First, our analysis based on AI techniques predicts the likelihood of a firm being a zombie firm better than traditional models such as the logistic regression. Second, we adopt a more recent technique, XAI, to overcome the limitations of conventional black-box AI models in the finance literature. Taken together, while the logistic regression indicates which variables in the model are statistically significant in predicting zombie firms, our XAI approach not only outperforms it but also visualizes the importance of each variable.

Furthermore, this study provides the following practical implications to a wide range of stakeholders, including financial institutions, policymakers, and corporate executives. First, prediction of zombie firms would help financial institutions appropriately manage non-performing loans and credit risks. Second, appropriate policy response strategies can be established through monitoring and risk management by predicting zombie firms at an early stage. Finally, through the early prediction of zombie firms, investors would be provided with accurate information to facilitate better decision-making.

2. Literature review

2.1 Zombie firms

A zombie company is “a firm with an interest coverage ratio (operating income/interest expense) of less than 1 for three consecutive years,” according to The Bank of Korea (2015). Naturally, zombie firms require extensive restructuring, including asset sales, debt restructuring and management innovation. Thus, it is necessary to identify zombie firms accurately when conducting research or establishing policies related to zombie firms or distressed firms (Cho and Ryu, 2007).

Banerjee and Hofmann (2018) attribute the emergence and increasing prevalence of zombie firms worldwide since the 1980s to low interest rates, non-performing bank loans, and inefficient bankruptcy procedures after the financial crisis. Moreover, zombie firms are found to distort the allocation of resources across the economy, resulting in a decline in investment and an economic slowdown (Banerjee and Hofmann, 2018; Caballero et al., 2008).

Kim and Choi (2017) estimate that a percentage point increase in the proportion of marginal firms within an industry leads to a 0.23% reduction in total factor productivity. In addition, Song (2020a, b) reports that the labor productivity of marginal firms in South Korea is less than half that of normal firms, with inefficiencies caused by new marginal firms having a negative impact on overall labor productivity. Song (2020a, b) also demonstrates that among low-performing firms in South Korea, marginal firms are more likely to remain low performers in the long term, indicating that corporate distress significantly influences firm productivity.

2.2 Predicting zombie firms using traditional models

Prior studies have actively explored the ex-ante prediction of zombie firms or the degree of a firm’s insolvency. For instance, Beaver (1966) identifies the following three basic financial ratios as the most important predictors: liquidity, profitability, and debt-to-equity ratio. Altman’s Z-score (Altman, 1968)–the most widely used technique across all fields of business studies–is a measure to indicate default probability [1]. Ohlson (1980) proposes a measure of default probability called the O-score through a logistic regression model [2]. Merton (1974) introduces a firm’s Distance-to-Default (DD) and the expected default probability (EDF) [3]. More recently, some related studies have attempted to predict zombie or distressed firms in advance by also considering qualitative factors. These include management capabilities, macroeconomic indicators (e.g., GDP, inflation, interest rates, etc.), financial market data (e.g., stock return volatility) and industry trends (Jeong and Kook, 2012; Nam et al., 2008; Park et al., 2019). Kim (2009b) analyzes correlated jump risks on default correlations using Merton's (1974) diffusion model.

However, the traditional method has several limitations. First, predictions based on financial statements focus on historical data; they do not sufficiently reflect rapidly changing market environments or non-financial factors. Second, accurate prediction is difficult because financial metrics can be volatile and can be manipulated during the accounting process. Third, universal prediction models do not adequately account for industry-specific characteristics or the firm’s inherent factors, limiting their ability to reflect the circumstances of individual firms. Finally, traditional financial analysis focuses primarily on short-term performance, and therefore, it is insufficient to predict the sustainability of firms from a long-term perspective. These limitations suggest that more sophisticated and comprehensive prediction models have to be developed. Owing to these limitations, the conventional statistics-based zombie firm prediction models applied in previous studies have shown rather low prediction accuracy.

2.3 Predicting zombie firms using AI

With the advancement of AI technologies, including machine learning and deep learning, studies have attempted to employ AI in zombie and distressed firm prediction research to overcome the limitations of traditional methods. The methods proposed in these studies have outperformed the traditional statistical methods (Yoon, 2019).

Odom and Sharda (1990) employ an artificial neural network (ANN) to predict default firms and show that it outperforms the traditional multiple discriminant analysis method. Lee (1993) compares multiple discriminant analysis, inductive reasoning, and ANN models in the context of prediction of default firms and finds that the ANN method has the highest predictive power. Shin et al. (2005) demonstrate that support vector machines (SVMs) outperform ANNs in predicting default firms. Le and Viviani (2012) also confirm that ANN and SVM are more accurate than traditional statistical methods in predicting bank bankruptcy. Oh and Kim (2017) compare the performance of traditional statistical methods, ANN, and a decision tree model to predict default firms, identifying the decision tree model as the best. In contrast, Lee et al. (2020) apply various machine-learning methods to predict zombie firms and suggest that the K-Nearest Neighbor method is the best. Kim and Ahn (2016) apply a corporate default prediction method to predict corporate credit ratings and report that the Random Forest model performs the best.

Recently, a few studies have applied ensemble methods, such as bagging and boosting, and deep learning models to predict distressed firms. An ensemble method refers to the combination of multiple models to achieve better performance than individual classifier models. Generally, it outperforms individual models (Min, 2016). Min (2014) demonstrates that when predicting default risk, the bagging model, among the ensemble methods, has better predictive power than the SVMs applied in previous studies. Kim (2009a) proposes the GM boosting model among boosting ensemble models to address the issue of data imbalance in predicting corporate default; it performs consistently.

Deep learning, a class of machine learning models, refers to algorithms that analyze large amounts of data and extract key features through a layered structure loosely modeled on human neurons, based on ANN theory (Bengio et al., 2013). Many studies have verified that deep learning models outperform traditional models in predicting distressed firms. Kwon et al. (2017) apply a recurrent neural network (RNN) model to predict corporate defaults, showing that the RNN model performs the best. Cha and Kang (2018) apply traditional statistical methods, machine learning, and deep learning models to predict corporate defaults, and they show that the RNN model with the Long Short-Term Memory (LSTM) method outperforms the other models. Hosaka (2019) employs a convolutional neural network (CNN) model to predict corporate defaults and demonstrates that CNN is superior to traditional statistical methods. Vochozka (2020) applies the LSTM method to a deep neural network (DNN) model to predict corporate default and shows that the optimal model can be identified by adjusting the activation function and hyper-parameters. Jo et al. (2021) predict default firms using data for 2017–2019 and find that deep learning models such as RNN-LSTM, RNN-GRU, and CNN show excellent performance based on recall.

However, even though AI models such as machine learning, ensemble methods, and deep learning outperform traditional models, there remains the black-box problem. Particularly, with the advancement of AI-based techniques such as the ensemble methods and deep learning, the complexity of AI models has significantly increased. There are occasions when users are unable to ascertain the reason for obtaining the results produced by AI.

In the financial sector, it is crucial to explain how each variable contributed to the prediction result. Suppose a financial authority establishes a policy, or a financial institution identifies zombie firms, based on the output of a black-box AI model. If a problem then arises ex-post, it cannot be explained. Policymakers and institutional officials will therefore find it difficult to use the AI model, even if its performance is excellent. For this reason, the AI models applied in previous studies are difficult to employ in practice to predict distressed firms.

To overcome this limitation, attempts have been made to interpret complex prediction results of AI using XAI. Bae (2023) analyzes the applicability of the XAI method to different industries and shows that XAI can be applied to interpret black-box models for credit rating prediction, loan decision-making, stock index prediction, and exchange rate volatility prediction in the financial sector. Park et al. (2023) apply the integrated gradients model among the XAI models to predict corporate defaults by analyzing the data of listed companies for the period 2001–2021. Building on these studies, this study applies LIME and SHAP among the XAI models to interpret the prediction results of the black-box model and determine the importance of each variable.

3. Theoretical background

3.1 Concept of explainable AI

To use AI more actively, users must be able to review and understand ex-post how the AI arrived at a decision. XAI refers to AI that can conveniently express the prediction result of an AI model, which is a black box, in a manner intelligible to humans (Gunning, 2016). XAI can provide humans with a rationale for the results derived using AI, enabling humans to correct AI models that make incorrect decisions and improve AI models by examining the rationale behind the decisions (Adadi and Berrada, 2018).

A few models, such as traditional logistic regression, make the decision-making process easy for humans to follow; however, simple, intelligible models have the disadvantage of poor performance. In other words, there is an inverse relationship between an AI model’s interpretability and its performance (Gunning, 2016). Therefore, in recent years, researchers have focused on XAI that produces intelligible results while maintaining high performance (Adadi and Berrada, 2018). Such research can enhance the transparency and reliability of AI models and expand the possibilities of using AI in key areas, such as finance.

XAI determines the importance of variables in the input data to the prediction made by AI. Among the various methods of assessing the contribution of variables, the surrogate-model approach approximates a complex AI model with a simpler surrogate model that users can understand. This method is used mainly in the field of XAI. For example, when a linear regression model or a decision tree model is applied as a surrogate model, it can be conveniently trained because of its simple structure, and the results are intelligible for humans. This enables the interpretation of the prediction result of a complex model and the explicit determination of each variable’s contribution.

There are two types of analysis methods that use surrogate models: global surrogate analysis and local surrogate analysis. The former approximates a complex model using all the input data to interpret the importance of a variable on average, and the latter approximates a complex model for only specific input data to interpret the value of a variable for judging that data. The global analysis method employs the entire dataset to approximate a complex model with a simple model, which can lead to an overfitting problem for a given dataset (Molnar, 2020). The local analysis method utilizes data around a specific data point to approximate a complex model, which can mitigate the overfitting problem, providing a more sophisticated interpretation. These methods are valuable tools for understanding and interpreting the prediction results of complex AI models.

3.2 Explainable AI: LIME

LIME, a widely applied local analysis method, facilitates assessing the contribution of variables to prediction results, regardless of the AI model type. According to Ribeiro et al. (2016), deriving an XAI model through LIME can be viewed as an optimization problem. They define it as follows. $f: \mathbb{R}^d \to \mathbb{R}$ is defined as an AI model, a black box that predicts data; $G$ as the set of explainable surrogate models; $g: X \to \mathbb{R}$ as an explainable surrogate model, which is an element of the set $G$; $\Omega(g)$ as the complexity of the model; $\pi_x(z)$ as the similarity between data $x$ and data $z$; and $L(f, g, \pi_x)$ as a loss function that measures the difference (loss) between the black-box AI model $f$ and the surrogate model $g$. Here, the optimization problem is to derive an explanation $\varepsilon(x)$ that minimizes the loss function $L(f, g, \pi_x)$ while maintaining the complexity $\Omega(g)$ of the surrogate model $g$ at a level that the user can understand, which is summarized in Eq. (1).

Eq. (1): $\varepsilon(x) = \operatorname{argmin}_{g \in G} \; L(f, g, \pi_x) + \Omega(g)$

In the equation, we intend to minimize the loss function without making any assumptions about the AI model $f$, regardless of the type of conventional AI model (model agnostic). Here, let $z' \in \{0,1\}^{d'}$ be the perturbed data uniformly sampled in the vicinity of $x' \in \{0,1\}^{d'}$, and convert $z'$ into the data $z$ to be input to the AI model, assigning each sample a weight according to its similarity to the existing sample. Then, the loss function $L(f, g, \pi_x)$ is defined and calculated as in Eq. (2).

Eq. (2): $L(f, g, \pi_x) = \sum_{z, z'} \pi_x(z) \left( f(z) - g(z') \right)^2$

Then, after performing a LASSO fit to select a specific variable considering the optimization problem, a local analysis of the prediction of the black-box model can be performed by finally training the model with a coefficient that can be explained via a simple surrogate model, such as linear regression.
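The locality-weighted fit behind Eq. (2) can be illustrated with a deliberately simplified sketch. It is not the LIME implementation: it uses a single continuous feature, a deterministic grid of perturbations instead of random sampling, no binary interpretable representation, and no LASSO selection step. The function names, the stand-in black box f(z) = z², and the kernel width are all assumptions made for illustration.

```python
import math


def black_box(z):
    """Stand-in for the complex model f (assumed); any callable works."""
    return z * z


def similarity(x, z, width=0.5):
    """Exponential kernel pi_x(z): perturbations near x get larger weights."""
    return math.exp(-((x - z) ** 2) / (width ** 2))


def fit_local_surrogate(f, x, n_samples=201, radius=1.0, width=0.5):
    """Fit the line g(z) = a + b*z that minimises
    sum_z pi_x(z) * (f(z) - g(z))**2 over perturbations around x."""
    step = 2 * radius / (n_samples - 1)
    zs = [x - radius + i * step for i in range(n_samples)]
    ws = [similarity(x, z, width) for z in zs]
    ys = [f(z) for z in zs]

    # Closed-form weighted least squares for one feature plus intercept.
    w_sum = sum(ws)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / w_sum
    y_bar = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (z - z_bar) * (y - y_bar) for w, z, y in zip(ws, zs, ys))
    var = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    slope = cov / var
    intercept = y_bar - slope * z_bar
    return intercept, slope


# Explain the black box around the point x = 3.
intercept, slope = fit_local_surrogate(black_box, 3.0)
```

The slope of the surrogate plays the role of the explanation: it states how strongly the black-box output responds to the feature in the neighborhood of the chosen data point, even though the model is nonlinear globally.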

3.3 Explainable AI: SHAP

SHAP (SHapley Additive exPlanation) is an XAI model developed based on the theory proposed by Lloyd Stowell Shapley, an American mathematician and economist (Lundberg and Lee, 2017). SHAP adds up Shapley values to express the importance of each variable used by AI in an intelligible linear model. A Shapley value, which is based on game theory, is a numerical representation of the degree of contribution of each variable to the prediction result. The degree of the contribution of each variable is the degree to which the prediction result changes when the variable is excluded, as in Eq. (3).

Eq. (3): $\phi_1 = E[f(z)] - E[f(z) \mid z_1 = x_1]$

In Figure 2, $\phi_0$, $\phi_1$, $\phi_2$, and $\phi_3$ are marked with blue arrows, indicating that these variables have a positive (+) impact on the prediction result of AI, and $\phi_4$ is marked with a red arrow, indicating that it has a negative (−) impact. This implies that SHAP enables verifying why AI identified a zombie firm based on a single piece of data.
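As a rough illustration of how Shapley values split a single prediction among variables, the sketch below computes them exactly by averaging marginal contributions over every ordering of the features. It is a toy under stated assumptions, not the SHAP library: the linear scoring function, the background rows, and the data point are hypothetical, and the expectation over unknown features is approximated by substituting background values, which treats features as independent. The sketch reports each value as the change in the expected prediction when a variable's value becomes known, so a positive value pushes the prediction above the baseline $E[f(z)]$ and a negative value pushes it below.

```python
from itertools import permutations


def shapley_values(f, x, background):
    """Exact Shapley values of f at x, plus the baseline E[f(z)]."""
    n = len(x)

    def value(known):
        # Expected prediction when only the features in `known` are fixed
        # to their values in x; the rest come from each background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in known else row[i] for i in range(n)]
            total += f(mixed)
        return total / len(background)

    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        known = set()
        for i in order:
            before = value(known)
            known.add(i)
            phi[i] += value(known) - before  # marginal contribution of i
    return [p / len(orderings) for p in phi], value(set())


def score(v):
    """Hypothetical scoring model with two inputs."""
    return 2.0 * v[0] + 3.0 * v[1]


background = [(0.0, 1.0), (2.0, 3.0)]  # hypothetical reference rows
x = (3.0, 1.0)                         # hypothetical data point to explain
phi, base = shapley_values(score, x, background)
```

By construction the contributions add up to the gap between the prediction for the data point and the baseline, which is what allows them to be drawn as a chain of arrows. The number of orderings grows factorially with the number of variables, so this brute-force form is only practical for a handful of features.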

In this study, LIME and SHAP, which together support both local and global analysis, are adopted as XAI models and applied to interpret the results predicted by machine learning and deep learning models, which are black-box models. Based on this, we analyze the characteristics of zombie firms that have been increasing in number after the COVID-19 pandemic and identify the key characteristics for predicting zombie firms. Thus, the significance of this study lies not in comparing the determinants before and after the COVID-19 pandemic, but in addressing the limitations of traditional AI models, specifically their lack of interpretability (the black-box problem), by employing XAI to enhance model transparency.

4. Data and method

4.1 Data collection

The financial accounting data utilized in this study were collected through FnGuide’s DataGuide and based on the International Financial Reporting Standards (IFRS) consolidated financial statement standards. The purpose of this study is to predict zombie firms based on firm characteristic data. Therefore, we exclude the data from the financial industry (“K” based on the main categories of the 10th version of the Korean Standard Industrial Classification Code (KSIC)), which have different characteristics from general manufacturing firms, and data from the electricity, gas, steam, and air-conditioning supply industry (“D” based on the main categories of the 10th version of the KSIC), which are not suitable for financial analysis because they are managed by the government. Moreover, the analysis is restricted to firms with fiscal years ending in December, and samples with zero sales are excluded. For all variables of firm characteristics, we exclude the data with missing values from the analysis and apply winsorization to the top and bottom 1% of values to control outliers.
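The winsorization step can be sketched as follows. This is only an illustration: the nearest-rank percentile rule and the sample values are assumptions, since the text does not state which percentile convention is used.

```python
def winsorize(values, lower=0.01, upper=0.99):
    """Clip values below the 1st percentile and above the 99th percentile
    to those percentile values (nearest-rank convention, assumed)."""
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[round(lower * (n - 1))]
    high_cut = ordered[round(upper * (n - 1))]
    return [min(max(v, low_cut), high_cut) for v in values]


# Hypothetical ratio values with one extreme outlier at the end.
raw = [float(v) for v in range(1, 101)] + [10_000.0]
clipped = winsorize(raw)
```

Unlike trimming, this keeps every observation in the sample and only limits the influence of extreme values.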

4.2 Sample selection

We focus on the Korean sample after the COVID-19 pandemic as corporate debts in Korea have been rising sharply and the number of zombie firms has grown rapidly, particularly in the post-COVID-19 period [4]. In the early days of the pandemic, large-scale lockdowns and social distancing restricted consumers’ movement. Consequently, demand plummeted, especially in service industries such as travel, hospitality, and food and beverage, leading to a decline in sales. Furthermore, the supply of raw materials and components was disrupted because of the disorder in global supply chains, severely hampering production and sales activities. Many companies took out loans to cope with the sales decline, which significantly increased their debt burden.

At the same time, governments worldwide, including Korea’s, reduced interest rates and aggressively provided liquidity during the COVID-19 pandemic to stimulate the economy. However, more recently (typically, from 2022 onward), central banks have raised interest rates because of high inflation. The sharp rise in base interest rates has increased the cost of capital financing for companies. In addition, the decline in consumer purchasing power and rise in commodity prices have decreased companies’ profitability, reducing their ability to repay their debts. The rising cost of capital due to higher interest rates has threatened the sustainability of companies.

In short, the sample period of this study is from 2019 to 2023, and the sample consists of the KOSPI and KOSDAQ firms. Listed companies are required to disclose their financial statements, which provide highly reliable, externally audited corporate data. As listed companies account for a significant share of the South Korean economy in terms of sales and number of employees, they are often used in research related to the economy, as they are in this study. The sample contains 10,353 firm-year observations, of which 1,851 are for zombie firms and 8,502 are for normal firms. Thus, approximately 17.9% of the observations are for zombie firms, in line with Figure 1.

4.3 Firm characteristic variables

In this study, the dependent variable is “zombie firm status,” which takes the value of 1 for zombie firms with an interest coverage ratio of less than 1 for three consecutive years, and the value of 0 for normal firms. As for the firm characteristic variables used in the model for predicting distressed firms and defaults, we use the following variables, which are used in the Z-score by Altman (1968): growth (total asset growth rate, current asset growth rate, sales growth rate, net income growth rate, and operating income growth rate), profitability (net profit margin, gross profit margin, and return on equity), activity (accounts receivable turnover ratio, inventory turnover ratio, total capital turnover ratio, tangible asset turnover ratio, ratio of cost of sales to sales, ratio of selling, general, and administrative costs to sales), stability (debt-to-equity ratio, current ratio, equity to total asset ratio, quick ratio, fixed assets to net worth ratio, net working capital ratio, leverage level, and cash ratio), and firm characteristics (log of revenue, log of total assets, and business age). Table 1 presents the definition of each variable.
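The construction of the dependent variable can be sketched as below. The interest coverage ratios are hypothetical, and attaching the flag to the last year of each three-year window is an assumption; the text does not say which year carries the label.

```python
def zombie_flags(icr_by_year, window=3):
    """Return {year: 1 or 0}. A year is flagged 1 when the interest coverage
    ratio is below 1 in `window` consecutive years ending in that year."""
    flags = {}
    years = sorted(icr_by_year)
    for i, year in enumerate(years):
        recent = years[max(0, i - window + 1): i + 1]
        # Require a full window with no gaps between calendar years.
        consecutive = (len(recent) == window
                       and recent[-1] - recent[0] == window - 1)
        below_one = all(icr_by_year[y] < 1 for y in recent)
        flags[year] = int(consecutive and below_one)
    return flags


# Hypothetical interest coverage ratios (EBIT / interest expense) for one firm.
icr = {2019: 0.8, 2020: 0.6, 2021: 0.9, 2022: 1.4, 2023: 0.7}
flags = zombie_flags(icr)
```

In this example only 2021 is flagged, because it is the only year that closes three consecutive years with a ratio below 1; the recovery in 2022 resets the count.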

4.4 Preprocessing for AI models

In this study, the data are divided into a training set and a test set in a 7:3 ratio. The model is then estimated using the training set, and the estimated model is evaluated using the test set. Both the training and test sets are imbalanced, with fewer zombie firms than normal firms. In the analysis, we apply the balancing method only to the training set used to estimate the zombie firm prediction model; the test set is left unbalanced so that it reflects the actual distribution of zombie firms. As previous studies emphasize that the model should be tested using a dataset with the same distribution as the real world, we follow this method to address the imbalanced data issue (Zhou, 2013; Song et al., 2021a, b).

This study aims to predict the rapidly increasing number of zombie firms following COVID-19 more accurately and propose a methodology for interpreting the AI model’s predictions. Thus, we assume that the factors identifying zombie firms have changed pre- and post-COVID-19. Accordingly, we deem it more appropriate to split and train/evaluate the post-COVID-19 data, rather than training on pre-COVID-19 data and then predicting post-COVID-19 outcomes.

While it is possible to train the AI model using the entire post-COVID-19 dataset, this can lead to overfitting. Overfitting occurs when a model becomes overly specialized to the training data, resulting in excellent performance on the training data but poor generalizability to new data (Kuhn et al., 2013). To address this, AI research typically divides the dataset into a training set and a test set, often using a 7:3 or 8:2 split, for training and test, respectively. Specifically, the data, including both zombie firms and normal firms, are split randomly from the entire dataset. This approach mitigates overfitting and ensures that the AI model can maintain stable performance even with additional data in the future. Accordingly, this study adopts the same methodology.
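A random 7:3 split of this kind can be sketched in a few lines. The seed and the stand-in row identifiers are assumptions for illustration; the text does not report how the random split was seeded.

```python
import random


def train_test_split(rows, test_share=0.3, seed=42):
    """Shuffle the rows once and hold out `test_share` of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_share)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(1000))  # stand-ins for firm-year observations
train, test = train_test_split(rows)
```

Because the held-out rows are never seen during estimation, a large gap between training and test performance is the signal that a model has overfitted.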

To address the imbalance issue, we apply the Synthetic Minority Oversampling Technique (SMOTE)—an oversampling method that synthesizes low-frequency data. SMOTE uses the K-Nearest Neighbor (KNN) technique to extract other zombie firm data around the low-frequency zombie firm data and generates new zombie firm data via random linear combinations.
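The interpolation idea behind SMOTE can be illustrated with a small sketch. It is a simplified version written for this explanation, not the reference implementation: the function name, the value of k, the seed, and the four minority points are assumptions, and the minority class must contain at least two points.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen minority point and one of its k nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between the two points
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical zombie-firm observations described by two scaled variables.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote(minority, n_new=6, k=2)
```

Every synthetic point lies on a line segment between two existing minority points, so the new data stay inside the region already occupied by zombie firms. In keeping with the text, such oversampling would be applied to the training set only.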

The zombie firm prediction models estimated with the training set include the following: traditional prediction models such as Logistic Regression (LR), Linear Discriminant Analysis (LDA), and KNN, as well as Decision Tree (DT), Random Forest (RF), SVM, and tree-based boosting methods such as AdaBoost, LightGBM, and XGBoost. Additionally, among the deep learning methods, we employ Multi-Layer Perceptron (MLP), DNN, CNN, RNN, and LSTM-applied RNN. In summary, we identify the method that shows the best performance among the 14 models applied in previous studies. XGBoost, a tree-based boosting model, shows the best performance. The results predicted by this model are then interpreted using the XAI methods LIME and SHAP.

4.5 Evaluation metrics

We apply the 14 models selected from previous studies to predict the zombie firms in the test set and compare the predictions with the actual zombie firm status. Each model outputs a probability between 0 and 1, which is converted into a predicted zombie firm status using a threshold. The confusion matrix in Table 2 shows the four possible outcomes of comparing the predicted and actual values. True positive (TP) refers to the case where the model accurately predicts that an actual zombie firm is a zombie firm. False negative (FN) refers to the case where the model inaccurately predicts that an actual zombie firm is a normal firm. False positive (FP) refers to the case where the model inaccurately predicts that an actual normal firm is a zombie firm. True negative (TN) refers to the case where the model accurately predicts that an actual normal firm is a normal firm.

For the comparison between models, we employ four metrics: accuracy, precision, recall, and F1 score. Table 3 presents the evaluation metrics and their formulas. Accuracy is the ratio of correctly predicted zombie firms (TP) and correctly predicted normal firms (TN) to the total data. Precision is the ratio of correctly predicted zombie firms (TP) to all firms predicted as zombie firms (TP + FP). Recall is the ratio of correctly predicted zombie firms (TP) to all actual zombie firms (TP + FN). F1 score is the harmonic mean of precision and recall, and a high F1 score indicates that the model has a high ability to predict zombie firms.
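As a worked illustration of these definitions, the sketch below computes the four metrics from confusion-matrix counts. The counts are hypothetical; the final lines only check that the precision and recall reported for XGBoost in Table 4 are consistent with its reported F1 score.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, not taken from the study.
metrics = classification_metrics(tp=70, fp=20, fn=30, tn=380)
print(metrics)

# Consistency check using the XGBoost row of Table 4.
f1_xgboost = 2 * 0.767 * 0.711 / (0.767 + 0.711)
print(round(f1_xgboost, 3))  # 0.738
```

With an imbalanced sample, accuracy can stay high even when recall is low, which is why the F1 score is reported alongside it.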

5. Main results

5.1 Prediction for zombie firms

The performances of the 14 models applied to predict zombie firms in previous studies are evaluated using accuracy, precision, recall, and F1 scores. Each model employs the same training and test sets. Table 4 compares the evaluation metrics across the models. XGBoost has the highest F1 score (0.738) and the highest accuracy (0.913, tied with LightGBM). XGBoost is a tree-based ensemble method built on boosting, and it is widely applied owing to its high predictive performance and excellent learning speed (Chen and Guestrin, 2016). Therefore, from here on, we use the predictions of XGBoost, the best-performing model, for the interpretation based on XAI.

Jo et al. (2021) predict zombie firms through machine learning and deep learning using characteristics data of firms and evaluate the performance of AI models in terms of accuracy, precision, recall, and F1 scores. These performance evaluation measures are crucial because they can indicate the reliability of the prediction results; however, they are not informative about the variables that contributed to the AI’s prediction, and by how much. Moreover, even if the AI model predicts a data point as a zombie firm, the reason remains unknown.

Therefore, this study not only compares the 14 models applied in previous studies for predicting zombie firms, but also applies XAI methods to interpret the zombie firm prediction results. This study adopts LIME and SHAP among the various XAI methods to apply both global and local analysis. The global analysis identifies the variables that are important for predicting zombie firms across the entire data. The local analysis interprets the reasons for judging an individual observation to be a zombie firm.

With LIME, we can calculate the contribution of variables to the prediction result, regardless of AI model type. SHAP determines the importance of each independent variable by comparing the model’s output with and without that variable, and it combines these contributions linearly to approximate the result produced by the AI model. For the analysis based on the XAI methods, we use the original data of the zombie and normal firms without applying the SMOTE oversampling technique.
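The idea behind Shapley values can be made concrete with a small, exact computation. The sketch below is illustrative only: `toy_score` is a hypothetical scoring function with three variables, not the XGBoost model of this study, and the SHAP library uses faster algorithms than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: the average marginal contribution of each
    feature over all orders in which the features can be added."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            totals[f] += value_fn(frozenset(present)) - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(known):
    """Hypothetical score given the set of variables that are 'known'."""
    score = 0.0
    if "net_profit_margin" in known:
        score += 2.0
    if "sga_to_sales" in known:
        score += 1.0
    if {"net_profit_margin", "sga_to_sales"} <= known:
        score += 0.5   # interaction, shared equally between the two
    if "debt_to_equity" in known:
        score -= 0.4
    return score


features = ["net_profit_margin", "sga_to_sales", "debt_to_equity"]
phi = shapley_values(toy_score, features)
print(phi)
# The contributions add up to the score with all variables minus the
# score with none, which is the additive property SHAP relies on.
print(sum(phi.values()), toy_score(frozenset(features)) - toy_score(frozenset()))
```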

5.2 Feature attribution analysis through LIME

First, a global analysis is performed on the entire data to determine the variables that affect the zombie firm status, which is the dependent variable. Figure 3 provides a bar plot of the results of the global analysis using LIME. Using LIME, we compute the average feature importance across all 10,353 firm-year observations and thereby identify the characteristics of zombie firms. The average importance is calculated over all samples, with higher values indicating that the pertinent variable played a consistently important role in the model’s predictions.

Based on the average feature importance calculated through LIME, the most important variable is net profit margin, followed by the ratio of SG&A to sales, gross profit margin, return on equity, log of revenue, operating income growth rate, net income growth rate, business age, total capital turnover rate, and debt-to-equity ratio. The most important factor in predicting zombie firms is the net profit margin: when it is lower than −4.53%, the firm is highly likely to be classified as a zombie firm. The second most important factor is the ratio of SG&A to sales: when it is 33.66% or higher, the firm is highly likely to be classified as a zombie firm.

Next, we perform a local analysis to interpret the impact of each variable on the zombie firm prediction for an individual observation. One of the firms in the sample is indeed a zombie firm, and the XGBoost model assigns it a 97% probability of being a zombie firm. To interpret this prediction, Figure 4 provides a visual representation, obtained using LIME, of the contribution of each variable to the prediction for this firm.

In local analysis, the model’s response to small changes around a specific observation is analyzed to identify important variables and their contributions. In Figure 4, the orange parts indicate each variable’s contribution toward a zombie firm prediction, while the blue parts indicate each variable’s contribution toward a normal firm prediction. The black-box model, XGBoost, predicts this observation to be a zombie firm with 97% probability, and the most important variable is net profit margin, followed by gross profit margin, log of revenue, return on equity, debt-to-equity ratio, current asset growth rate, and sales growth rate. In particular, the most important factor is the net profit margin of −26.62%, which pushes the prediction toward a zombie firm because it is below the global analysis threshold of −4.53%. The second most important factor is the gross profit margin of −1.31%, which is below the global analysis threshold of 11.63%.
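The local analysis can be sketched as fitting a proximity-weighted linear model to the black box around one observation. The code below illustrates that idea only and is not the LIME library: the perturbation scheme, the kernel, the function names, and the `black_box` model with two standardized features are all assumptions made for the example.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_surrogate(predict, instance, n_samples=500, scale=1.0,
                    width=1.0, seed=0):
    """Weighted least squares fit of predict() around `instance`.
    Returns [intercept, coefficient_1, ..., coefficient_d]."""
    rng = random.Random(seed)
    d = len(instance)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [v + rng.gauss(0.0, scale) for v in instance]  # perturbed copy
        dist2 = sum((p - q) ** 2 for p, q in zip(z, instance))
        w = math.exp(-dist2 / width ** 2)   # nearer samples weigh more
        row = [1.0] + z
        y = predict(z)
        for i in range(d + 1):
            xtwy[i] += w * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * row[i] * row[j]
    return solve(xtwx, xtwy)


def black_box(x):
    """Hypothetical classifier: P(zombie) from two standardized features,
    x[0] = net profit margin, x[1] = ratio of SG&A to sales."""
    return 1.0 / (1.0 + math.exp(-(-1.2 * x[0] + 0.8 * x[1])))


coefficients = local_surrogate(black_box, instance=[-1.5, 0.8])
print(coefficients)
```

The signs of the fitted coefficients play the role of the orange and blue bars: a negative coefficient on net profit margin means that lowering it raises the predicted zombie probability near this observation.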

5.3 Feature attribution analysis through SHAP

This study also uses SHAP to analyze the prediction results of XGBoost, a black-box model. First, we analyze the importance of each variable employed to identify zombie firms in the entire data based on a global method using SHAP. Figure 5 shows, for the top-10 variables, the average of the absolute Shapley values of each independent variable across the entire data. The size of the bar indicates the absolute importance and does not specify the sign of the influence (positive or negative). The analysis shows that the most important variable for identifying zombie firms is net profit margin, followed by the ratio of SG&A to sales, gross profit margin, log of revenue, return on equity, net income growth rate, operating income growth rate, debt-to-equity ratio, leverage level, and business age.
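The global ranking described above is the mean of the absolute Shapley values per variable. A minimal sketch follows, using a small matrix of hypothetical Shapley values (one row per observation) rather than values from the study.

```python
def global_importance(shap_rows, feature_names):
    """Rank features by the mean absolute Shapley value across rows."""
    n = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)


feature_names = ["net_profit_margin", "sga_to_sales", "debt_to_equity"]
# Hypothetical Shapley values for three observations.
shap_rows = [
    [1.2, -0.5, 0.1],
    [-0.8, 0.7, -0.2],
    [1.0, 0.3, 0.0],
]
ranking = global_importance(shap_rows, feature_names)
for name, value in ranking:
    print(name, round(value, 3))
```

Taking absolute values before averaging is what removes the sign, so a variable that pushes some firms up and others down still ranks as important.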

To examine the importance and direction of influence of each variable in detail, we use SHAP’s summary plot. The vertical axis in Figure 6 lists the variables in order of importance, while the horizontal axis shows the distribution of the Shapley values. For example, the points in the row for net profit margin, the most important variable, represent the Shapley value of each observation, and points with the same value are shown overlapping. The color of each dot encodes the normalized value of the variable: blue indicates the smallest values and red the largest. Thus, the distribution of Shapley values and the color of the dots together provide a visual representation of the actual value and importance of each variable. A positive Shapley value implies that the value of the independent variable pushed XGBoost toward predicting a zombie firm, while a negative Shapley value implies that it pushed the prediction toward a normal firm. For example, in Figure 6, the lower the net profit margin (shown in blue), the higher the Shapley value, indicating that these firms are more likely to be identified as zombie firms. In contrast, for the ratio of SG&A to sales, the higher the value, the higher the Shapley value, that is, the higher the probability that the pertinent firms will be identified as zombie firms.

Next, we perform a local analysis using SHAP to interpret the impact of each variable on the zombie firm prediction for an individual observation. Figure 7 shows the SHAP interpretation of a specific observation predicted as a zombie firm by the XGBoost model. The base value marked in the figure is the reference value against which the prediction is judged. The score f(xA) produced by the XGBoost model for this observation is 3.64, which is greater than the base value, so the firm is predicted to be a zombie firm. The red shapes represent the Shapley values of the independent variables that increase the value of f(xA). For this observation, the independent variable with the largest Shapley value is “net profit margin,” which is consistent with the result of the global analysis. This company has a net profit margin of −26.62%, and the AI model predicts it to be a zombie firm largely on the basis of this value. The blue shapes, such as those representing the debt-to-equity ratio in Figure 7, show the Shapley values of the independent variables that decrease the value of f(xA).
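The text does not state the scale of the score f(xA). If we assume it is a log-odds (margin) score, which is a common output scale when SHAP is applied to tree-based binary classifiers, it can be converted to a probability with the logistic function. Under that assumption, a score of 3.64 corresponds to a probability of about 0.974, which is in line with the 97% probability reported for the firm with the same net profit margin in the LIME analysis. This is only a consistency check under the stated assumption.

```python
import math


def sigmoid(z):
    """Logistic function mapping a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


score = 3.64  # f(xA) from the text, ASSUMED here to be on the log-odds scale
probability = sigmoid(score)
print(round(probability, 3))  # 0.974
```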

Interpretations based on XAI have several advantages over those in traditional AI research. The first key advantage is the ability to identify the relative importance of each variable in the global analysis. For instance, while logistic regression can indicate which variables are statistically significant in predicting zombie firms, it is difficult to compare the relative importance of each variable. In contrast, XAI methods provide visualizations of the importance of each variable, making the results easier for users to grasp. Additionally, due to their nature, tree-based AI classifiers can provide thresholds for identifying zombie firms. For example, Figure 3 not only shows that net profit margin is the most important variable, but also provides a threshold (net profit margin = −4.53%), below which the likelihood of being classified as a zombie firm increases.

The second key advantage is the ability to explain, in the local analysis, why a sample is classified as a zombie firm by using the specific variables and thresholds for that particular observation. For example, in Figure 4, the sample predicted as a zombie firm has a net profit margin of −26.62%, which is below the threshold of −4.53%, thereby providing a detailed explanation for its classification as a zombie firm.

Although prior research on predicting zombie firms before COVID-19 focused primarily on growth, profitability, activity, and stability, Song et al. (2021a, b) show that in the context of the Korean economy, firm age and size are also significant variables. In this study, XAI analysis indicates that in addition to the traditional findings in the financial field (low profitability and high dependence on external capital), firm age and smaller size also contribute to firms experiencing greater shocks in the post-COVID-19 period in Korea. The relationship between firm age and zombie firms is consistent with the findings of Dai et al. (2019) and McGowan et al. (2018), while the relationship between firm size and marginal firms aligns with the results of Banerjee and Hofmann (2022) and El Ghoul et al. (2020). This suggests the need to prioritize the investigation and management of long-established firms and small-scale firms when predicting zombie firms in the context of post-COVID-19 policymaking.

Based on the analysis using the LIME and SHAP models, we examine and interpret the reasoning behind how XGBoost—a high-performing but black-box model—predicts each data point as a zombie firm. We identify that the most important variables for predicting post-COVID-19 zombie firms are net profit margin, gross profit margin, and the ratio of SG&A to sales. This implies that the deterioration of the business environment of zombie firms after the COVID-19 pandemic led to a decrease in gross as well as net profit margins, and an increase in SG&A expenses, resulting in higher costs.

6. Conclusion

This study compares 14 AI algorithms from previous studies and uses the best-performing XGBoost model to predict post-COVID-19 zombie firms. Using accuracy, precision, recall, and F1 scores as evaluation metrics, we find that XGBoost achieves an accuracy of 0.913 and an F1 score of 0.738. This indicates its superiority over the traditional logistic regression model and the other machine learning and deep learning models.

Furthermore, using the prediction results of XGBoost, which shows excellent performance but is a black-box model that cannot be interpreted, we propose a method to develop and interpret a post-COVID-19 zombie firm prediction model using SHAP and LIME among the XAI methods. First, we analyze the average importance of each independent variable through a global analysis using the XAI model on the entire data. The analysis shows that the most important variable for AI to predict zombie firms after the COVID-19 pandemic is net profit margin, followed by the ratio of SG&A to sales, gross profit margin, return on equity, log of revenue, operating income growth rate, net income growth rate, business age, total capital turnover rate, and debt-to-equity ratio. Furthermore, based on the local analysis, which interprets the prediction results of individual data using an XAI model, we have interpreted why the XGBoost model identified certain data as zombie firms, separated the variables that have positive and negative impacts on the AI’s prediction, and confirmed their impacts.

This study presents an AI algorithm that can predict zombie firms more accurately than traditional methods (e.g. logistic regression). Moreover, it provides interpretations of the prediction results, which traditional AI research has been unable to offer. Based on these interpretations, we make the following policy recommendations. First, when identifying and supporting zombie firms post-COVID-19, it is important to conduct a more in-depth selection process based on firm size. Additionally, for long-established firms, which may face difficulties in business transformation or structural improvement, the focus should be on policies targeting management innovation and restructuring. Furthermore, measures should be implemented to enhance the capital structure of these firms, such as improving operational profitability and adjusting debt ratios to increase financial stability.

By using XAI methods, this study has improved the interpretability of AI models that are otherwise difficult for humans to understand. Our method is expected to enable financial institutions and investors to make better-informed decisions. Most of all, this study suggests which factors to focus on to predict zombie firms in advance, while more accurately predicting zombie firms that may cause the entire South Korean economy to decline after the COVID-19 era.

Figures

Figure 1. Number of zombie firms by year

Figure 2. Example of decomposing Shapley values

Figure 3. Global analysis results based on LIME

Figure 4. Local analysis results based on LIME

Figure 5. Global analysis results (force plot) based on SHAP

Figure 6. Global analysis results (summary plot) based on SHAP

Figure 7. Local analysis results (force plot) based on SHAP

Table 1. Firm characteristic variables used

| Category | Variable name | Description |
|---|---|---|
| Growth | Total asset growth rate | (Total assets − Prior-term total assets)/Prior-term total assets × 100 |
| | Current asset growth rate | (Current assets − Prior-term current assets)/Prior-term current assets × 100 |
| | Sales growth rate | (Current period sales − Prior period sales)/Prior period sales × 100 |
| | Net income growth rate | (Current period net income − Prior period net income)/Prior period net income × 100 |
| | Operating income growth rate | (Current period operating income − Prior period operating income)/Prior period operating income × 100 |
| Profitability | Net profit margin | Net profit/Revenue × 100 |
| | Gross profit margin | Gross profit/Revenue × 100 |
| | Return on equity | Net income/Shareholder’s equity × 100 |
| Activity | Accounts receivable turnover ratio | Sales/Accounts receivable |
| | Inventory turnover ratio | Cost of sales/Inventory assets |
| | Total capital turnover ratio | Sales/Total capital |
| | Tangible asset turnover ratio | Sales/Total assets |
| | Ratio of cost of sales to sales | Cost of sales/Sales × 100 |
| | Ratio of selling, general, and administrative costs (SG&A) to sales | SG&A/Sales × 100 |
| Stability | Debt-to-equity ratio | Debt/Shareholder’s equity × 100 |
| | Current ratio | Current assets/Current liabilities × 100 |
| | Equity to total asset ratio | Equity/Total asset × 100 |
| | Quick ratio | Quick assets/Current liabilities × 100 |
| | Fixed assets to net worth ratio | Fixed assets/Total equity × 100 |
| | Net working capital ratio | Net working capital/Total equity × 100 |
| | Leverage level | (Current and non-current loans + bonds)/Total equity × 100 |
| | Cash ratio | Cash and cash equivalents/Current liabilities × 100 |
| Firm characteristics | Ln(sales) | Natural log value of sales |
| | Ln(total assets) | Natural log value of total assets |
| | Business age | Number of years since founded |

Source(s): Table by authors

Table 2. Evaluation metrics: confusion matrix

| Actual | Predicted: True (positive) | Predicted: False (negative) |
|---|---|---|
| True (positive) | True Positive (TP) | False Negative (FN) |
| False (negative) | False Positive (FP) | True Negative (TN) |

Source(s): Table by authors

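Table 3. Evaluation metrics and formulas

| Evaluation metric | Formula |
|---|---|
| Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ |
| Precision | $\frac{TP}{TP + FP}$ |
| Recall | $\frac{TP}{TP + FN}$ |
| F1 score | $\frac{2(\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}}$ |

Source(s): Table by authors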

Table 4. Comparison of evaluation metrics by model

| Model | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| XGBoost | 0.913 | 0.767 | 0.711 | 0.738 |
| LightGBM | 0.913 | 0.774 | 0.695 | 0.733 |
| RF | 0.912 | 0.783 | 0.673 | 0.724 |
| RNN | 0.898 | 0.733 | 0.661 | 0.695 |
| DT | 0.896 | 0.713 | 0.686 | 0.699 |
| LSTM | 0.896 | 0.733 | 0.645 | 0.686 |
| AdaBoost | 0.895 | 0.732 | 0.642 | 0.684 |
| CNN | 0.894 | 0.721 | 0.650 | 0.684 |
| SVM | 0.891 | 0.782 | 0.530 | 0.632 |
| LR | 0.880 | 0.766 | 0.464 | 0.578 |
| MLP | 0.879 | 0.669 | 0.623 | 0.645 |
| LDA | 0.875 | 0.741 | 0.454 | 0.563 |
| DNN | 0.862 | 0.625 | 0.552 | 0.586 |
| KNN | 0.857 | 0.857 | 0.230 | 0.362 |

Note(s): This table presents the comparison of evaluation metrics for the 14 AI models. Detailed explanations of accuracy, precision, recall, and F1 score are given in Table 3

Source(s): Table by authors

Notes

1.

The specific Z-score is calculated as follows.

$$\text{Altman } Z = 1.2X_1 + 1.4X_2 + 3.3X_3 + 0.6X_4 + 0.99X_5$$

where $X_1$ = working capital/total assets, $X_2$ = retained earnings/total assets, $X_3$ = EBIT/total assets, $X_4$ = market value of total equity/total liabilities, and $X_5$ = sales/total assets.

2.

The specific O-score is calculated as follows.

$$\text{Ohlson } O = -1.32 - 0.407 \times \mathrm{SIZE} + 6.03 \times \mathrm{TLTA} - 1.43 \times \mathrm{WCTA} + 0.0757 \times \mathrm{CLCA} - 1.72 \times \mathrm{OENEG} - 2.37 \times \mathrm{NITA} - 1.83 \times \mathrm{FUTL} + 0.285 \times \mathrm{INTWO} - 0.521 \times \mathrm{CHIN}$$

where SIZE = log(total assets/GNP price-level index), TLTA = total liabilities/total assets, WCTA = working capital/total assets, CLCA = current liabilities/current assets, OENEG = 1 if total liabilities > total assets; 0 if otherwise, NITA = net income/total assets, FUTL = operating cash flow/total liabilities, INTWO = 1 if net income < 0 in the last two years; 0 if otherwise, and CHIN = $(NI_t - NI_{t-1})/(|NI_t| + |NI_{t-1}|)$ (NI = net income).

3.

Assuming that the value of a firm’s equity is a kind of call option, Merton (1974) defines default as the situation where the value of equity is less than the value of debt at the debt maturity date. Bharath and Shumway (2008) construct the KMV-Merton model by enhancing the traditional Merton model with Moody’s KMV model. According to the Merton model, the expected default probability (EDF) is defined as EDF = N(−DD) based on the standard normal cumulative distribution, where the firm’s Distance-to-Default (DD) is calculated as follows.

$$DD = \frac{\ln(V/F) + (\mu - 0.5\sigma^2)T}{\sigma\sqrt{T}}$$

where V = the market value of the firm’s assets, F = the market value of its liabilities, μ = the expected return on assets, σ = the volatility of asset values, and T = the maturity of the liabilities.
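As a worked example of the DD and EDF definitions, the sketch below evaluates them for one set of inputs. All input values are hypothetical, and the function names are chosen for this illustration only.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def distance_to_default(v, f, mu, sigma, t):
    """DD = [ln(V/F) + (mu - 0.5 * sigma^2) * T] / (sigma * sqrt(T))."""
    return (math.log(v / f) + (mu - 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))


# Hypothetical inputs: assets 120, liabilities 100, 5% expected return,
# 25% asset volatility, one-year horizon.
dd = distance_to_default(v=120.0, f=100.0, mu=0.05, sigma=0.25, t=1.0)
edf = normal_cdf(-dd)  # EDF = N(-DD)
print(round(dd, 3), round(edf, 3))
```

A larger DD means the asset value sits further above the liabilities relative to its volatility, so the EDF falls as DD rises.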

4.

Ryu et al. (2024) report that corporate debt in South Korea reached 2,734 trillion won at the end of 2023, an increase of 1,036 trillion won since 2018. They note that this growth rate, averaging 8.3%, significantly exceeds the nominal growth rate of 3.4%, indicating a very rapid increase. They also argue that the impact of COVID-19 has further accelerated the rise in corporate debt levels.

References

Adadi, A. and Berrada, M. (2018), “Peeking inside the black-box: a survey on explainable artificial intelligence (XAI)”, IEEE Access, Vol. 6, pp. 52138-52160, doi: 10.1109/access.2018.2870052.

Altman, E.I. (1968), “Financial ratios, discriminant analysis and the prediction of corporate bankruptcy”, The Journal of Finance, Vol. 23 No. 4, pp. 589-609, doi: 10.1111/j.1540-6261.1968.tb00843.x.

Bae, J.K. (2023), “A study on the applicability of eXplainable artificial intelligence (XAI) methodology by industrial district”, International Business Education Review, Vol. 20 No. 2, pp. 195-208, doi: 10.38115/asgba.2023.20.2.195.

Banerjee, R. and Hofmann, B. (2018), “The rise of zombie firms: causes and consequences”, BIS Quarterly Review September.

Banerjee, R. and Hofmann, B. (2022), “Corporate zombies: anatomy and life cycle”, Economic Policy, Vol. 37 No. 112, pp. 757-803, doi: 10.1093/epolic/eiac027.

Beaver, W.H. (1966), “Financial ratios as predictors of failure”, Journal of Accounting Research, Vol. 4, pp. 71-111, doi: 10.2307/2490171.

Bengio, Y., Courville, A. and Vincent, P. (2013), “Representation learning: a review and new perspectives”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35 No. 8, pp. 1798-1828, doi: 10.1109/tpami.2013.50.

Bharath, S.T. and Shumway, T. (2008), “Forecasting default with the Merton distance to default model”, Review of Financial Studies, Vol. 21 No. 3, pp. 1339-1369, doi: 10.1093/rfs/hhn044.

Caballero, R.J., Hoshi, T. and Kashyap, A.K. (2008), “Zombie lending and depression restructuring in Japan”, The American Economic Review, Vol. 98 No. 5, pp. 1943-1977, doi: 10.1257/aer.98.5.1943.

Cha, S.J. and Kang, J.S. (2018), “Corporate default prediction model using deep learning time series algorithm, RNN and LSTM”, Journal of Intelligence and Information Systems, Vol. 24 No. 4, pp. 1-32.

Chen, T. and Guestrin, C. (2016), “XGBoost: a scalable tree boosting system”, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Vol. 11, pp. 785-794, doi: 10.1145/2939672.2939785.

Cho, S.P. and Ryu, I.K. (2007), “Accounting information and prediction of corporate failure during a recession”, Journal of Business Research, Vol. 22 No. 1, pp. 1-32, doi: 10.22903/jbr.2007.22.1.1.

Cho, J.H., Song, D.B. and Kim, I.C. (2020), “Policy tasks for economic recovery after COVID-19–analysis of productivity changes in domestic companies before and after the 2009 global financial crisis and implications”, Korea Institute for Industrial Economics and Trade, i-KIET Industrial Economics Issue, Vol. 86.

Dai, X., Qiao, X. and Song, L. (2019), “Zombie firms in China's coal mining sector: identification, transition determinants and policy implications”, Resources Policy, Vol. 62, pp. 664-673, doi: 10.1016/j.resourpol.2018.11.016.

El Ghoul, S., Fu, Z. and Guedhami, O. (2020), “Zombie firms: prevalence, determinants, and corporate policies”, Finance Research Letters, Vol. 41, 101876, doi: 10.1016/j.frl.2020.101876.

Gunning, D. (2016), “Broad agency announcement explainable artificial intelligence (XAI)”, Defense Advanced Research Projects Agency (DARPA), Tech. Rep.

Hosaka, T. (2019), “Bankruptcy prediction using imaged financial ratios and convolutional neural networks”, Expert Systems with Applications, Vol. 117, pp. 287-299, doi: 10.1016/j.eswa.2018.09.039.

Jeong, W.H. and Kook, C.P. (2012), “Stock return volatility and corporate credit risk”, Journal of Derivatives and Quantitative Studies, Vol. 20 No. 1, pp. 1-40.

Jo, J.H., Ahn, E.J. and Kim, S.S. (2021), “A study on the prediction model for insolvent companies based on deep learning”, Journal of Business Research, Vol. 36 No. 1, pp. 99-113.

Kim, M.J. (2009a), “Ensemble learning for solving data imbalance in bankruptcy prediction”, Journal of Intelligence and Information Systems, Vol. 15 No. 3, pp. 1-15.

Kim, H.S. (2009b), “The effects of correlated jump risks on default correlation”, Journal of Derivatives and Quantitative Studies, Vol. 17 No. 1, pp. 1-20, doi: 10.1108/JDQS-01-2009-B0001.

Kim, S.J. and Ahn, H.C. (2016), “Application of random forests to corporate credit rating prediction”, Journal of Business Economics, Vol. 32 No. 1, pp. 187-211.

Kim, W.K. and Choi, H.K. (2017), Increasing Proportion of Marginal Companies and Slowing Productivity, Korea Institute for Industrial Economics & Trade, i-KIET Industrial Economics Issue, pp. 2017-2022.

Kim, G.L., Kwak, H.K. and Choi, Y.K. (2020), “Economic alarm bells: debt risks and the Corona shock”, Samjong KPMG ERI COVID-19 Business Report, available at: https://assets.kpmg.com/content/dam/kpmg/kr/pdf/2020/kr-covid-19-debt-crisis_20200713.pdf

Kuhn, M. and Johnson, K. (2013), “Over-fitting and model tuning”, Applied Predictive Modeling, pp. 61-92, doi: 10.1007/978-1-4614-6849-3_4.

Kwon, H.K., Lee, D.K. and Shin, M.S. (2017), “Dynamic forecasts of bankruptcy with recurrent neural network model”, Journal of Intelligence and Information Systems, Vol. 23 No. 3, pp. 139-153.

Le, H.H. and Viviani, J.L. (2012), “Predicting bank failure: an improvement by implementing a machine-learning approach to classical financial ratios”, Research in International Business and Finance, Vol. 44, pp. 16-25, doi: 10.1016/j.ribaf.2017.07.104.

Lee, K.C. (1993), “A comparative study on the bankruptcy prediction power of statistical model and AI models: MDA, Inductive, Neural Network”, Journal of the Korean Operations Research and Management Science Society, Vol. 18 No. 2, pp. 57-81.

Lee, C.H., Choi, J.H., Kim, M.S., Choi, J.H. and Sung, T.E. (2020), “A study on the preemptive prediction model of marginal companies for restructuring innovation”, Journal of Korea Technology Innovation Society, Vol. 23 No. 4, pp. 637-667, doi: 10.35978/jktis.2020.8.23.4.637.

Lundberg, S.M. and Lee, S.I. (2017), “A unified approach to interpreting model predictions”, Advances in Neural Information Processing Systems, Vol. 30.

McGowan, M.A., Andrews, D. and Millot, V. (2018), “The Walking Dead? Zombie firms and productivity performance in OECD countries”, Economic Policy, Vol. 33 No. 96, pp. 685-736, doi: 10.1093/epolic/eiy012.

Merton, R.C. (1974), “On the pricing of corporate debt: the risk structure of interest rates”, The Journal of Finance, Vol. 29 No. 2, pp. 449-470, doi: 10.1111/j.1540-6261.1974.tb03058.x.

Min, S.H. (2014), “Bankruptcy prediction using an improved bagging ensemble”, Journal of Intelligence and Information Systems, Vol. 20 No. 4, pp. 121-139, doi: 10.13088/jiis.2014.20.4.121.

Min, S.H. (2016), “Investigating dynamic mutation process of issues using unstructured text analysis”, Journal of Intelligence and Information Systems, Vol. 22 No. 1, pp. 139-157, doi: 10.13088/jiis.2016.22.1.139.

Molnar, C. (2020), Interpretable Machine Learning, Lulu.com.

Nam, C.W., Kim, T.S., Park, N.J. and Lee, H.K. (2008), “Bankruptcy prediction using a discrete-time duration model incorporating temporal and macroeconomic dependencies”, Journal of Forecasting, Vol. 27 No. 6, pp. 493-506, doi: 10.1002/for.985.

Odom, M.D. and Sharda, R. (1990), “A neural network model for bankruptcy prediction”, 1990 IJCNN International Joint Conference on Neural Networks, pp. 163-168.

Oh, W.S. and Kim, J.H. (2017), “Forecasting corporate bankruptcy with artificial intelligence”, Journal of Industrial Convergence, Vol. 15 No. 1, pp. 17-32.

Ohlson, J.A. (1980), “Financial ratios and the probabilistic prediction of bankruptcy”, Journal of Accounting Research, Vol. 18 No. 1, pp. 109-131, doi: 10.2307/2490395.

Park, J.J., Hong, J. and Na, S. (2019), “Decomposition of the business cycle shock and the default rate of SMEs in Korea”, Journal of Derivatives and Quantitative Studies, Vol. 27 No. 4, pp. 401-423, doi: 10.1108/JDQS-04-2019-B0002.

Park, J.H., Kim, G.Y., Ju, J.H., Lee, H. and Choi, H.J. (2023), “Factors affecting corporate insolvency prediction based on explainable artificial intelligence”, Journal of Digital Contents Society, Vol. 24 No. 9, pp. 2093-2105, doi: 10.9728/dcs.2023.24.9.2093.

Ribeiro, M.T., Singh, S. and Guestrin, C. (2016), “‘Why should I trust you?’ Explaining the predictions of any classifier”, Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135-1144, doi: 10.1145/2939672.2939778.

Ryu, C.H., Choi, S. and Kwon, K.B. (2024), “Current status of corporate debt in Korea and implications”, The Bank of Korea Financial Markets Department, BOK Issue Note, pp. 2024-2112.

Shin, K.S., Lee, T.S. and Kim, H.J. (2005), “An application of support vector machines in bankruptcy prediction model”, Expert Systems with Applications, Vol. 28 No. 1, pp. 127-135, doi: 10.1016/j.eswa.2004.08.009.

Song, D.B. (2020a), Analysis of Determinants of Laggard Firms, Korea Institute for Industrial Economics & Trade, Research Material, pp. 2020-2024, available at: https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE11115964

Song, S.Y. (2020b), “The impact of marginal firms on labor productivity in manufacturing industry”, The Bank of Korea, BOK Issue Note, pp. 2020-2027.

Song, D.B., Cho, J.H., Kim, H.H. and Kim, I.C. (2021a), “Analysis of domestic marginal firms' decision factors and implications”, Korea Institute for Industrial Economics and Trade, Issue Paper 2021-03, available at: https://kdevelopedia.org/asset/99202104220160590/1619050355444.pdf

Song, H.J., Park, D.J. and Lee, Z.K. (2021b), “An empirical comparison of bankruptcy prediction of external auditing and non-external auditing companies using machine learning methods”, Journal of The Korea Society of Information Technology Policy and Management, Vol. 13 No. 3, pp. 2521-2527.

The Bank of Korea (2015), “Financial stability report”, available at: https://www.bok.or.kr/portal/bbs/P0000593/view.do?nttId=214939&menuNo=200068

Vochozka, M., Vrbka, J. and Suler, P. (2020), “Bankruptcy or success? The effective prediction of a company's financial development using LSTM”, Sustainability, Vol. 12 No. 18, pp. 1-17, doi: 10.3390/su12187529.

Yoon, H. (2019), “The study on the prediction of insolvency of Korean sports industry using machine learning”, The Korean Journal of Physical Education, Vol. 58 No. 6, pp. 165-176, doi: 10.23949/kjpe.2019.07.58.6.14.

Zhou, L. (2013), “Performance of corporate bankruptcy prediction models on imbalanced dataset: the effect of sampling methods”, Knowledge-Based Systems, Vol. 41, pp. 16-25, doi: 10.1016/j.knosys.2012.12.007.

Acknowledgements

The authors would like to acknowledge the financial support provided by the Kongju National University. This work was supported by the research grant of Kongju National University in 2024, the Dongguk University Research Fund and the Soonchunhyang University Research Fund.

Corresponding author

Seongjae Mun can be contacted at: forbelld@sch.ac.kr
