The implication of machine learning for financial solvency prediction: an empirical analysis on public listed companies of Bangladesh

Purpose – The financial health of a corporation is a great concern for investors at every level and for decision-makers. For many years, financial solvency prediction has been a significant issue throughout academia, particularly in finance. This requirement leads this study to check whether machine learning can be implemented in financial solvency prediction.
Design/methodology/approach – This study analyzed 244 Dhaka Stock Exchange public-listed companies over the 2015-2019 period, and two subsets of data are also developed as training and testing datasets. For machine learning model building, samples are classified as secure, healthy and insolvent by the Altman Z-score. R statistical software is used to make predictive models of five classifiers, and all model performances are measured with different performance metrics such as logarithmic loss (logLoss), area under the curve (AUC), precision recall AUC (prAUC), accuracy, kappa, sensitivity and specificity.
Findings – This study found that the artificial neural network classifier has 88% accuracy and sensitivity rate; also, AUC for this model is 96%. However, the ensemble classifier outperforms all other models by considering logLoss and other metrics.
Research limitations/implications – The major result of this study can be applied by financial institutions for credit scoring, credit rating, loan classification, etc. Other companies can implement machine learning models in their enterprise resource planning software to trace their financial solvency.
Practical implications – Finally, a predictive application is developed by training a model with 1,200 observations and is made available for all rational and novice investors (Abdullah, 2020).
Originality/value – To the best of the author's knowledge, no prior study on machine learning for financial solvency prediction in Bangladesh examines a dataset of comparable size with all of these models.


Introduction
In recent years, the number of insolvent firms and the probability of insolvency have increased significantly. As a result, financial solvency prediction is a significant concern throughout academia, primarily in finance. A firm's solvency is extremely important for investors, creditors, stockholders, insurance policyholders, tender suppliers, investment managers, financiers, governments and capital market investors. Thus, a well-developed guideline to effectively assess the probability of financial insolvency is highly desired. Over the last several decades, two models have been developed for predicting corporate failure, using discriminant analysis and logistic regression, respectively (Altman, 1968; Ohlson, 1980). The forecasting of financial distress is a vital and interesting agenda that has served as the inspiration for numerous academic studies over the last five decades. Due to the high impact of corporate failure events, researchers in artificial intelligence have been applying intelligent methods to forecast bankruptcy.
Machine learning algorithms build models by training on previous experience and then predicting the output for new data. In this study, machine learning models are trained to distinguish solvent and insolvent firms based on different characteristics of the business, including growth, profitability, leverage, market value and liquidity measures. A corporation becomes insolvent due to different quantifiable and nonquantifiable factors. Using quantifiable factors, many researchers have been trying to predict financial distress for many decades. From the previous literature, the author found that ratio analysis is a significant predictor of financial solvency (Altman, 1968; Beaver, 1966).
To the knowledge of the author, after analyzing the previous literature, there is no application of machine learning to financial solvency prediction in Bangladesh. Bangladesh has shifted from an agriculture-based economy to an industrialized economy. In addition, the country's irrational investors are becoming rational with sound knowledge of investment. Against this background, machine learning algorithms can be applied to real-world problems, for example, (1) lender organizations that can leverage their investment decisions, (2) policymakers that could better recognize and examine strategic investment decisions, (3) individual-level investors who want to buy an insurance policy from a solvent company and (4) advance payment of goods for tender in a solvent company. Based on the outcomes of machine learning models, all of these decision-makers will be able to make a decision confidently. Therefore, a systematic solution for financial solvency prediction has to be constructed. This research aims to "predict financial solvency by using various machine learning models and linking them together to compare their accuracy and develop a simple model for rational investment decision-makers for Bangladesh".
The trained machine learning models can be easily replicated for future prediction by inputting raw data, not only by scholars but also by finance professionals and rational investors. This study is a modest attempt to implement machine learning for predicting financial solvency, and it also develops a simple approach to make it more user-friendly for all levels of rational investment decision-makers in Bangladesh. The objective of this study is to develop machine learning models to predict the financial solvency of businesses for rational investment decision-makers. The evaluation of different algorithms and making the resulting model available to the end user can be considered the main contributions of this study. The outcome of this research will be beneficial to industry practitioners as well as rational investors in their decision-making process.

Theoretical background

Machine learning and financial solvency prediction
The early studies regarding solvency prediction were conducted within different streams of research. Those studies concentrated on a specific set of ratios and occasionally contrasted ratios of failed corporations with those of successful corporations. Altman (1968), Beaver (1966) and Ohlson (1980) have developed the foundation of bankruptcy prediction models, which have been used for prospective financial solvency prediction and model development. Machine learning enables computers to extract the essential information from big data automatically (Michie et al., 1994; Shapiro, 2001). A recent bibliometric analysis of machine learning models for intelligent bankruptcy prediction of corporate firms finds that the first machine learning models for predicting bankruptcy were published in 1991 according to Web of Science. Although the number of publications increased gradually up to 2008, considerable growth in publication counts can be seen from 2008 to 2009.
This emphasizes the growth of this topic following the financial crisis of the 2007-2008 period, and it shows the great demand for machine learning techniques and their applications in financial distress prediction (Shi and Li, 2009). The same analysis also found that cooperation between scholars is weak, particularly at the global level. As a result, the majority of publications are from the USA, China, Taiwan, Spain and South Korea. The concept of machine learning to predict financial solvency was studied early on by Tam using neural network models (Tam, 1991). Jabeur et al. (2020) found, by analyzing classifiers, that machine learning models can be used more effectively for bond rating than conventional methods. Table 1 lists different studies that emphasized bankruptcy prediction, summarized by sample size, time, models used and significant findings.
Financial solvency prediction and Altman Z Score
Altman (1968) developed a multivariate discriminant analysis model for financial distress prediction, also known as the Z-Score model. After four decades with many arguments and adjustments, recent models are demonstrated as in models 1 and 2, where model 1 is for nonmanufacturing firms and model 2 is used for other companies. The variables are $x_1$ = working capital/total assets, $x_2$ = retained earnings/total assets, $x_3$ = earnings before interest and taxes/total assets, $x_4$ = market value of equity/total liabilities and $x_5$ = total sales/total assets.

$$Z = 6.56x_1 + 3.26x_2 + 6.72x_3 + 1.05x_4 \quad (1)$$

$$Z' = 1.26x_1 + 1.4x_2 + 3.3x_3 + 0.6x_4 + 1.0x_5 \quad (2)$$

The functions used in the Z-score are demonstrated in equations (1) and (2). Based on the value of the Z-Score, companies can be categorized into different zones: the red zone, the gray zone and the safe zone (Altman, 1968). In this study, the Altman Z-score will be used for company classification.
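The two scoring functions and the zone assignment can be sketched in a few lines of Python. This is only an illustration, not the author's R code: the coefficients are copied from equations (1) and (2) as printed, the firm ratios are made-up numbers, and the cut-off values (1.10 and 2.60) are an assumption based on the thresholds commonly quoted for the four-variable model, since the classification table is not reproduced in this text.

```python
def z_score_four(x1, x2, x3, x4):
    """Equation (1): four-variable score, coefficients as printed in the text."""
    return 6.56 * x1 + 3.26 * x2 + 6.72 * x3 + 1.05 * x4


def z_score_five(x1, x2, x3, x4, x5):
    """Equation (2): five-variable score, coefficients as printed in the text."""
    return 1.26 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def classify(z, lower, upper):
    """Map a score to the three class labels used in the study.

    `lower` and `upper` are the zone cut-offs. They are not given in this
    section, so the caller has to supply them.
    """
    if z < lower:
        return "insolvent"  # red (distress) zone
    if z > upper:
        return "secure"     # safe zone
    return "healthy"        # gray zone


# Hypothetical ratios for one firm-year (made-up numbers).
z = z_score_four(x1=0.10, x2=0.20, x3=0.05, x4=1.00)

# Assumed cut-offs for the four-variable model; not taken from the paper.
label = classify(z, lower=1.10, upper=2.60)
```

Keeping the cut-offs as explicit arguments makes it easy to swap in the thresholds from the paper's own classification table once they are known.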

Data collection and methodology
This study is based on the context of Bangladesh. There are 323 companies on the Dhaka Stock Exchange whose stocks are traded daily. This study tries to cover all public listed companies on the Dhaka Stock Exchange in the dataset. However, information for a few companies is not available. As a result, 244 companies have been selected as samples. This study used data mining techniques to build a large dataset for the 2015-2019 period from the annual reports of Bangladeshi companies and Dhaka Stock Exchange reports. For data analysis, all 244 companies' data in the 2015-2019 period are classified into three categories by implementing the Altman Z-score. Equation (1) is applied for manufacturing-related businesses such as textile, pharmaceuticals, engineering, tannery, paper and printing, food, ceramics sector, etc. Then, equation (2) is used for service industries such as banks, financial institutions, insurance, telecommunication, etc. Classification criteria from Table 1 are used for classification (Altman, 1968): (1) if a company is "secure", then there is low risk; (2) if a company is categorized as "healthy", it is solvent but might head to insolvency and (3) if a company is categorized as "insolvent", it denotes that the company's financial condition is in the distress zone. After categorizing the company's solvency, two subsets of data covering the 2016-2017 period are extracted to develop the training sample. Another subset of data is then created as the testing dataset for the 2018-2019 period. R Statistical Computing Software is used for data mining, data analysis and machine learning model training. Five machine learning algorithms are applied for the training and testing of models; they are demonstrated in Table 3. For the development of models, 11 features have been extracted, and Table 4 presents the features.
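The period-based split described above can be illustrated as follows. This is a minimal sketch with made-up firm-year records and a hypothetical record layout; only the two year ranges come from the text.

```python
# Hypothetical firm-year records: (company, year, class label).
records = [
    ("A", 2016, "secure"),
    ("A", 2017, "secure"),
    ("A", 2018, "healthy"),
    ("B", 2017, "insolvent"),
    ("B", 2019, "insolvent"),
]

TRAIN_YEARS = {2016, 2017}  # training period stated in the text
TEST_YEARS = {2018, 2019}   # testing period stated in the text

# Split by year so that every test observation is later than every
# training observation.
train = [r for r in records if r[1] in TRAIN_YEARS]
test = [r for r in records if r[1] in TEST_YEARS]
```

Splitting by period rather than at random means the models are always evaluated on years they have not seen during training.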
X1 to X5 are used for the discriminant zone classification, and six more features have been extracted as features for the machine learning models. Previous studies indicate that all of these features, also called predictor variables, have significant relationships with financial solvency and can explain it (Barboza et al., 2017; Mai et al., 2019; Son et al., 2019; Kim et al., 2016). There are many performance indicators for machine learning models and hyperparameter tuning methods. In this study, prediction performance will be measured by logarithmic loss (logLoss), area under the curve (AUC), precision recall AUC (prAUC), accuracy, kappa, sensitivity and specificity.
In equation (3), NTP is the number of true positive classifications, that is, insolvent firms classified correctly, and NTN is the number of true negative classifications, that is, secure firms classified correctly. In addition, in equation (4), NFN is the number of false negative classifications, which denotes insolvent firms classified incorrectly, while NFP is the number of false positive classifications, which denotes secure firms classified incorrectly. The values of sensitivity and specificity will be close to 1 if there is low classification error. After the model training and prediction, all performance metrics will be analyzed and compared across all models' results.
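The equations themselves are not reproduced in this text. The standard definitions of sensitivity and specificity, written with the symbols described above, are given here as a reference sketch; the numbering and exact form used in the original are not known.

```latex
\text{Sensitivity} = \frac{N_{TP}}{N_{TP} + N_{FN}},
\qquad
\text{Specificity} = \frac{N_{TN}}{N_{TN} + N_{FP}}
```

Both ratios equal 1 when $N_{FN} = 0$ and $N_{FP} = 0$, which matches the statement that values close to 1 indicate low classification error.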

Results and discussion
To examine the prediction of solvency by machine learning models, it is necessary to classify the samples into three classes, which are defined in Table 1.

Financial solvency prediction
Classes are defined by Z-score value as per the discriminant zones. The industry-wide data over the 2015-2019 period are demonstrated in Figure 1. Only a small number of companies in the sample are "secure" over the given period. Also, nonbank financial institutions are facing major insolvency problems, as most of them are categorized as "insolvent". Most interestingly, almost every insurance company performs well. The textile industry is also in a good position, but its performance worsens yearly. A previously conducted study found that only Sharia-based banks are performing better than conventional banks, and those results are consistent with our findings (Abdullah, 2015).
The next step is descriptive statistics. In Table 5, descriptive statistics of the three datasets are presented, and all values are rounded to two decimals. The data are collected from 244 companies of 18 different industries. This study measures the performance of different models for solvency prediction; no data transformation and no normalization were conducted (Barboza et al., 2017). The current ratio (CR) and X4 both deviate highly; for X4, this indicates that the market values of companies are much higher than their liabilities. The skewness and kurtosis indicate that the data are highly skewed. Because market value, profitability and company size differ across industries, this dataset is well suited for training and testing models intended to predict for all companies. Also, this full dataset can be used for future model training and prediction for all Bangladeshi companies.

JABES
It is vital to check the correlation between all variables of every dataset before moving to predictive machine learning model training. Pearson correlation has been analyzed to check the correlation between all predictor variables. The correlation matrices of all datasets are plotted in the heatmap of Figure 2, and results are presented in Table 6. For the full dataset, results indicate that almost all correlations are significant except those of the CR. The CR has a significant correlation with current liabilities/total liabilities (CLTL) and X1 only. For the training dataset, the results indicate that X1 has a significant correlation with most of the variables except X2, X5, NIS and ROE. However, X2 has a significant correlation with most of the variables. Results are similar to the full dataset, and only the CR has less correlation. Therefore, the training sample is consistent with the full data. Finally, the correlation analysis of the testing dataset gives the same results as the full dataset and the training dataset: most of the variables' correlations are significant. In summary, most of the predictor variables are significantly correlated, and the dataset can be used for building machine learning models.
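As an illustration of the correlation step, the Pearson coefficient between every pair of predictors can be computed as below. This is a sketch with made-up values for three of the feature names mentioned above; it does not reproduce the significance tests reported in Table 6.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up observations; the real study uses 11 features.
features = {
    "X1": [0.1, 0.2, 0.3, 0.4],
    "CR": [1.0, 2.0, 3.0, 4.0],
    "ROE": [0.4, 0.1, 0.3, 0.2],
}

# Full correlation matrix as a nested dictionary.
matrix = {
    a: {b: pearson(features[a], features[b]) for b in features}
    for a in features
}
```

A matrix of this form is what a heatmap such as the one in Figure 2 visualizes.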

Artificial neural network classifier results
For ANNC training, an averaged neural network base learner is used for financial solvency prediction. In this research, ten-fold cross-validation and five neurons are used for model training. The confusion matrix of this trained model is demonstrated in Figure 3. In the middle of each tile, the normalized count of classes is plotted and the column percentages are given below. Within the testing dataset, 89.3% was predicted to be secure class, 0.6% insolvent and 10% healthy. At the right portion of each tile, the row percentage of all observations is plotted: 91% was "secure", while 0.7% was "insolvent" and 8.2% "healthy". The overall results of the confusion matrix indicate that the ANNC model has significant predictive power. The results of model training and testing are demonstrated in Table 7 as ANNC. The accuracy of this model at the stages of training and testing is 94 and 82%, respectively. In addition, the AUC of training and testing is 96 and 94%, respectively. The sensitivity and specificity of training are 84 and 92%, and the testing stage results are 88 and 94%. Lastly, prAUC, logLoss and kappa have significant results in both training and testing models, and previous studies support the results (Raghupathi et al., 1991; Geng et al., 2015; Barboza et al., 2017; Hosaka, 2019).

Table 6. Correlations between all predictor variables of all datasets. Note(s): ** Correlation is significant at the 0.01 level and * Correlation is significant at the 0.05 level

Figure 3. Confusion matrix of ANNC and SVM
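The quantities shown in each tile of such a confusion matrix plot (normalized count, column percentage, row percentage) can be derived as in the sketch below. The counts are made-up, and the orientation (rows as actual classes, columns as predicted classes) is an assumption, since the figure is not reproduced here.

```python
labels = ["secure", "healthy", "insolvent"]

# counts[i][j]: observations whose actual class is labels[i] and whose
# predicted class is labels[j] (made-up numbers, assumed orientation).
counts = [
    [50, 4, 1],
    [5, 30, 5],
    [0, 3, 22],
]
k = len(labels)
total = sum(sum(row) for row in counts)

# Share of all observations falling in each tile.
normalized = [[c / total for c in row] for row in counts]

# Row percentages: share of each actual class assigned to each prediction.
row_pct = [[c / sum(row) for c in row] for row in counts]

# Column percentages: share of each predicted class that is actually class i.
col_totals = [sum(counts[i][j] for i in range(k)) for j in range(k)]
col_pct = [[counts[i][j] / col_totals[j] for j in range(k)] for i in range(k)]

# Overall accuracy is the share of observations on the diagonal.
accuracy = sum(counts[i][i] for i in range(k)) / total
```

With this orientation, the diagonal of `row_pct` gives the per-class sensitivity.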

Support vector machine results
For SVM model training, a support vector machine with a linear kernel is used as the base learner, and the results summary is shown in Table 7. The accuracy and AUC of the SVM model are 94 and 84% for model training and 96 and 88% for model testing. The sensitivity and specificity are 84 and 92% for model training and 88 and 94% for model testing. The confusion matrix of the SVM model is demonstrated in Figure 3; at the bottom of every tile the column percentage is shown, where, for example, 84.9% is the proportion classified as secure. The right side of every tile indicates the row percentage. The previous studies and these results indicate that support vector machines can predict financial solvency significantly (Chandra et al., 2009; Cecchini et al., 2010; Kim and Kang, 2010; Xiao et al., 2016).

Naive Bayes classifier results
For training the naive Bayes classifier (NBC), a naive Bayes base learner is used with ten-fold validation. The performance metrics are presented in Table 7. The accuracy of the NBC model for training and testing is 60 and 74%, and the AUC is 86 and 89% for training and testing, respectively. Also, the sensitivity and specificity are 55 and 79% for model training and 85 and 92% for model testing. The confusion matrix of the NBC model is plotted in Figure 4. All these results indicate that the naive Bayes classifier can predict the financial solvency of corporations, and the previous literature also supports the results (Sun and Shenoy, 2007; Tavana et al., 2013; Masmoudi et al., 2019; Barboza et al., 2017).

K-nearest neighbor classifier results
K-nearest neighbors is used as a base learner with ten-fold cross-validation for model training. The confusion matrix of KNNC is plotted in Figure 4, and results of training and testing

Ensemble classifier results
The final model of this study is the ensemble classifier (EC), for which bagged classification and regression trees (CART) are used as the base learner with ten-fold cross-validation. The results of training and testing of the EC model are presented in Table 7, and the confusion matrix is plotted in Figure 5. The sensitivity and specificity are 89 and 95% for model training and 86 and 93% for testing. The AUC and accuracy are 98 and 89% for the training model and 96 and 86% for the testing model. Several previous studies also found consistent results (Heo, 2014; Kim and Kang, 2010; Pisula, 2020). Consequently, this indicates that the ensemble classifier can be used for financial solvency prediction.
In Table 7, all models' performance metrics are presented. logLoss measures the performance of a machine learning classification model whose predicted class can be binary or multiclass. The main goal of a machine learning model is to minimize this logLoss value; the closer the logLoss is to 0, the closer the model is to a perfect model. AUC is the area under the receiver operating characteristic (ROC) curve. The ROC curve is the graphical presentation of a classifier's performance across its discrimination thresholds; in Figure 6, all models' ROC curves are illustrated. The higher the value of AUC, the higher the predictive power of the model. Similarly, prAUC is the area under the precision-recall curve, which summarizes how precision and recall trade off for the positive class.
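A minimal sketch of the multiclass logLoss described above is given here. The predicted probabilities are made-up, and the clipping constant `eps` is an illustrative choice to avoid taking the logarithm of zero.

```python
import math


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        # Clip so that a probability of exactly 0 or 1 stays finite.
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)


truth = ["secure", "insolvent"]

# A model that puts most probability on the correct class.
confident = [
    {"secure": 0.90, "healthy": 0.05, "insolvent": 0.05},
    {"secure": 0.05, "healthy": 0.05, "insolvent": 0.90},
]

# A model that is barely better than guessing.
unsure = [
    {"secure": 0.40, "healthy": 0.30, "insolvent": 0.30},
    {"secure": 0.30, "healthy": 0.30, "insolvent": 0.40},
]

loss_confident = log_loss(truth, confident)
loss_unsure = log_loss(truth, unsure)
```

The confident model receives the lower loss, which is why a logLoss close to zero is read as better predictive performance.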
Cohen's kappa statistic is a better indicator for multi-class predictive modeling, as it indicates the degree of goodness of fit of the model. According to Landis and Koch (1977), the value of kappa can be categorized into different segments: a value of less than 0% indicates that the model is not a good fit, while 0%-20% is a low and 21%-40% is a fair significance. In addition, 41%-60% indicates moderate significance, while 61%-80% indicates considerable significance, and 81%-100% indicates perfect significance. Sensitivity is the measurement of [...] Geng et al. (2015) also found significant results by using ANNs. The value of kappa is lower for the NBC, which indicates that NBC has moderate significance in the prediction of financial solvency. In terms of logLoss, the EC outperformed all other models, as its value is close to zero. Many researchers also found consistent results by applying EC (Heo, 2014; Kim and Kang, 2010; Pisula, 2020). Also, the ROC curves in Figure 6 indicate that the nearest neighbor classifier has a lower AUC. However, ANNCs revealed better outcomes for financial solvency prediction.
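To make the kappa bands concrete, the sketch below computes Cohen's kappa from a confusion matrix and maps it to the categories listed above. The counts are made-up, and the band labels follow the wording of this text rather than the original terms of Landis and Koch (1977).

```python
def cohens_kappa(counts):
    """Cohen's kappa for a square confusion matrix of raw counts."""
    k = len(counts)
    total = sum(sum(row) for row in counts)
    # Observed agreement: share of observations on the diagonal.
    observed = sum(counts[i][i] for i in range(k)) / total
    # Agreement expected by chance from the row and column totals.
    expected = sum(
        sum(counts[i]) * sum(counts[r][i] for r in range(k))
        for i in range(k)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


def kappa_band(kappa):
    """Map kappa (as a fraction, not a percentage) to the bands in the text."""
    if kappa < 0:
        return "not a good fit"
    if kappa <= 0.20:
        return "low"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "considerable"
    return "perfect"


# Made-up three-class confusion matrix (secure, healthy, insolvent).
counts = [
    [50, 4, 1],
    [5, 30, 5],
    [0, 3, 22],
]
kappa = cohens_kappa(counts)
band = kappa_band(kappa)
```

Because kappa subtracts the agreement expected by chance, it is lower than raw accuracy (0.85 for these counts) whenever the classes are unevenly sized.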

Implication and conclusions
Financial solvency prediction is linked with credit and liquidity risk, which were derived from the financial crisis and recent financial scams. Every investor wants to know where they are investing and what the financial condition of that company is. However, there was a gap in the literature regarding machine learning for financial solvency prediction in the context of Bangladesh. The study aims to fill this gap by developing machine learning models in the context of Bangladesh for financial solvency prediction. For this reason, this study analyzes 244 publicly listed companies of the Dhaka Stock Exchange, with data collected for the 2015-2019 period. Two subsets of data have also been developed as training and testing datasets. The uniqueness of this study comes from the large dataset in the context of Bangladesh and from considering time series in the data. For machine learning model training, it is necessary to classify observations into categorized classes, so data are classified as secure, healthy and insolvent by the analysis of the Altman Z-score. The next step of machine learning model building is feature selection. By reviewing the previous literature, a total of 11 features have been selected for model training. After model training and testing, all model performances are measured. This study found that the ANNC is the best model for financial solvency prediction among all models; the EC also has consistent results (Heo, 2014; Kim and Kang, 2010; Pisula, 2020; Ioannidis et al., 2010; Yeh et al., 2010; Olson et al., 2012).
The findings of this study can be used by academicians, investors and financial practitioners. The implication of this study is to develop a systematic way to implement the machine learning model in the enterprise resource planning software of companies. By doing so, decision-makers can assess their financial position. Banks, nonbank financial institutions and microcredit organizations can implement machine learning models for client categorization. Financial intermediaries can train predictive models with their client data and use them for loan sanctioning, which will reduce the cost of loan sanctioning as well as credit card issuance. Finally, this study aims to provide a guideline for the rational investor before making any investment decision. This objective led the author to train the model with the full dataset of 1,200 observations and make the models available for future prediction in the context of Bangladesh. This predictive model is compiled with the R statistical programming language, named "Financial Solvency Prediction by Machine Learning Web App" and made available for the rational investor (Abdullah, 2020). When all features are input, it shows the prediction results with their probabilities. This application can be used by users at all levels, from decision-makers to rational investors.
The major drawback of this study is that it does not consider external effects such as GDP, inflation rates, foreign exchange volatility, etc., or internal factors such as board composition, top management compensation, etc. Another drawback of this study is that it only considers 11 features, while a larger number of features could improve the predictive power of the models. The outcome of this study could be implemented by financial institutions, as machine learning can be used for credit scoring, credit rating, loan classification, etc. As prior experience is needed for building a machine learning predictive model, an individual-level investor can use the application developed by the author for investment decision-making, including creating term deposits in a financial institution, purchasing a share or buying an insurance policy. For experts, the results of this study are interesting because no data transformations were done, yet the prediction power of the models was highly significant. Future research can address the limitations of this study by considering external variables or adding more features, and can test the prediction of credit rating, small and medium enterprise loan default classification, credit card default classification, corporate loan default prediction, etc.