Prediction of financial distress in the Spanish banking system: An application using artificial neural networks

Purpose – The purpose of this study is to construct the first short-term financial distress prediction model for the Spanish banking sector. Design/methodology/approach – The concept of financial distress covers a range of financial problems in addition to bankruptcy, which is uncommon in the sector. Financial problems were predicted with artificial neural networks, using traditional financial variables classified according to the capital, assets, management, earnings, liquidity and sensitivity (CAMELS) system, together with a series of macroeconomic variables whose impact has been shown in a number of studies. Findings – The results show that artificial neural networks are a highly suitable method for studying financial distress in Spanish credit institutions and that the model predicted all cases in which an entity had short-term financial problems. Originality/value – This is the first work to build an artificial neural network model to predict financial distress in the Spanish banking system, grouping under the concept of financial distress, in addition to bankruptcy, other financial problems that affect the viability of these entities.


Introduction
The financial crisis that began in the summer of 2007 with the bursting of the property market bubble had multiple consequences for the global economy, showing, among other issues, that the financial problems of credit institutions are a social and economic problem that affects companies around the world (Halteh et al., 2018).
In the study of the financial problems suffered by these entities, commonly known as financial distress, the capacity to predict and anticipate the consequences is therefore essential. Detecting the early signs of financial distress constitutes a key area of research for corporate finance, in which the core function is predicting financial problems (Sun et al., 2013; Inam et al., 2018).
The term financial distress has long been used to describe different financial problems that affect companies. The initial studies on financial distress (Beaver, 1966; Altman, 1968; Deakin, 1972) agree that financial difficulties include the inability to pay debts or preferred dividends and the resulting consequences: overdrawn bank accounts, liquidation in the interests of creditors and even legal bankruptcy proceedings. Carmichael (1972) defined it as a situation in which a company is unable to meet its obligations. This includes situations of insufficient liquidity, insufficient capital, failure to pay debts and insufficient liquid capital. Foster (1986) defined the term as a serious liquidity problem that cannot be resolved without a large-scale restructuring of operations or of the business entity.
However, over the years, the concept of financial distress has broadened to cover more features. Doumpos and Zopounidis (1999) went beyond these traditional perspectives and included the negative net present value of assets in their definition of financial distress. Bose (2006) considered that a company is in financial distress when the listed value of its assets is less than 10 cents on the dollar. Hua et al. (2007) claimed that financial failure occurs when a company suffers chronic or serious problems or when it becomes insolvent with liabilities that are disproportionate to its assets. Lin (2009) considered that a company is in financial distress in any of the following situations: bankruptcy, failure to pay debentures, overdrawn deposits, a significant event that does not allow debts to be paid upon maturity, entry into insolvency proceedings or a fall in the listed price of its shares below a specific minimum. Geng et al. (2015) defined financial distress as the situation in which the operating cash flow of a company cannot replace negative net assets.
With respect to the definition of financial distress in the scope of the subject matter studied, Betz et al. (2014) claimed that credit institution financial distress included bankruptcy, liquidation and failure to meet obligations. They also considered that financial distress exists when a capital injection by the government is required, as well as in asset bailout situations and forced mergers. This definition is also followed by Constantin et al. (2018). According to these studies, financial distress can be defined as a situation in which a company has solvency problems at different levels that prevent it from performing its business without external aid and reduce its value until it reaches bankruptcy and therefore has to exit the market. This is the concept of financial distress on which we base our study, which requires an analysis of credit institutions' present and future financial problems. It should be mentioned that the banking system has special features such as strict government control. Unlike in other sectors, the government often has to intervene to avoid the failure of a bank, especially when it is very large (too big to fail), which explains the limited number of entities that have actually failed. A study broader than the concept of bankruptcy alone is therefore required to measure their "state of health."
Numerous classification techniques have been used to predict financial distress. The earliest studies on corporate financial problems used descriptive methods (Fitzpatrick, 1932; Smith and Winakor, 1935; Merwin, 1942) and classified the companies analyzed into two groups (healthy and failing) using financial ratios.
Midway through the 1960s, predictive methods began to appear with Beaver (1966), who performed a univariate analysis to predict credit risk, suggesting threshold values for financial ratios measuring profitability, liquidity and solvency to classify companies as healthy or failing. Altman (1968), using multivariate discriminant analysis in his famous Z-score model, showed that the model had a significantly higher prediction capacity in the year before bankruptcy than univariate analysis models. Deakin (1972) also applied multivariate discriminant analysis using the ratios of Beaver (1966), confirming that this methodology is suitable for predicting business failure up to three years in advance. Ohlson (1980) proposed applying the logit model to predict financial distress, classifying a company as experiencing financial difficulties according to whether its logit output is below or above a cut-off probability chosen a priori. Another model typically used to predict financial distress is the probit model used by Zmijewski (1984).
In the 1990s, with the development of information sciences, artificial intelligence models became popular for predicting financial problems, with artificial neural networks being the most popular method. Bell et al. (1990) were the first to apply an artificial intelligence method to predict problems in the banking sector, with a comparative study of neural networks and statistical models that demonstrated the superiority of artificial neural networks. Odom and Sharda (1990) developed a neural network model to predict bankruptcy, which they compared to multivariate discriminant analysis, showing the superiority of the neural network. In their study, Coats and Fant (1992) concluded that the neural network approach not only offered a high degree of prediction accuracy but also overcame the limitations of multivariate discriminant analysis and improved the results. Fletcher and Goss (1993) compared the capacity of artificial neural networks and the logit model to predict financial distress and found more accurate predictions with artificial neural networks. Serrano and Martín (1993) analyzed the possibility of bankruptcy of Spanish banks, based on the work of Laffarga et al. (1985) and Pina (1989). They showed that, with the same information, neural models were more accurate than classic models and that, given their greater simplicity in interpreting conclusions compared to multivariate statistical analysis, they were suitable for decision-making. Wilson and Sharda (1994) determined that neural networks performed significantly better than multivariate discriminant analysis. More recently, Geng et al. (2015) used artificial neural networks, decision trees and support vector machines to predict financial distress. These authors showed that artificial neural networks performed more accurately than the other classifiers. Slavici et al. (2016) used artificial neural networks to forecast financial distress in eastern European companies, claiming that artificial neural networks are more productive for predicting bankruptcy and more accurate than traditional methods. Inam et al. (2018) compared multivariate discriminant analysis, logit regression and artificial neural networks for bankruptcy prediction, demonstrating that artificial neural networks were more appropriate than the other predictive techniques. Lahmiri and Bekiros (2019) used four artificial neural network models to predict business bankruptcy, demonstrating that neural networks are a robust and adequate methodology for predicting financial problems.
The variables used in the majority of studies to predict financial distress have been financial ratios, especially the ratios classified in the capital, assets, management, earnings and liquidity (CAMEL) or capital, assets, management, earnings, liquidity and sensitivity (CAMELS) system (Thomson, 1991; Cole and Gunther, 1998; Kumar and Ravi, 2007; Poghosyan and Cihak, 2009; Roman and Şargu, 2013; Betz et al., 2014; Rosa and Gartner, 2018; Constantin et al., 2018). However, an increasing number of studies include additional variables that may have a significant influence on situations of corporate stress (González-Hermosillo, 1999; Curry et al., 2007). Based on these trends, in addition to using traditional CAMELS explanatory variables, we incorporated macroeconomic variables because of their impact on a credit institution's financial problems (González-Hermosillo, 1999; Curry et al., 2007). We thus built an artificial neural network model to predict financial distress in the Spanish banking system, the first neural network model built for this country. Analyzing a total of 148 credit institutions during the 2012-2016 period, we determined that the proposed model manages to predict all cases in which an entity has short-term financial problems.
This paper is structured as follows. Section II explains the methodology chosen to predict financial distress in the Spanish banking system. Section III describes the data and variables used. Section IV presents and explains the results obtained. Finally, Section V provides the conclusions.

Methodology
Artificial neural networks were chosen as the methodology for this work because of their ability to learn from and adapt to a set of data, their capacity to capture non-linear relations between variables without the need to know functional forms a priori (Wilson and Sharda, 1994; Chen et al., 2009) and the satisfactory results obtained with them in previous studies predicting financial distress.
Artificial neural networks have features that are similar to those of the human brain such as learning from experience, the generalization of past events in relation to new events and the capacity of abstraction of the main characteristics of a series of data.
There are several types of artificial neural networks, the most commonly used being "Multilayer Perceptron" networks, which are trained with a back-propagation learning rule. This type of network was used to predict financial distress in the Spanish banking system.
Neurons are composed, in general terms, of the soma, which is where the cell nucleus is, and the axon, through which neurons connect with the dendrites of other neurons, producing the synapse. "Artificial neurons" try to replicate this neuronal biological function.
Each neuron has a certain numerical value called its value or activation state, $a_i(t)$. This activation state is transformed through an output function, $f_i$, into an output signal, $y_i$ (axon). The output signal is sent to other neurons in the network and changes according to the associated weighting, $w_{ji}$, resulting from the intensity of interaction between the neurons (synapse) according to a certain rule (dendrites). The modified signals that reach each neuron combine to generate the total input, $Net_i$. This total input is processed by an activation function, $F$, thus obtaining a new activation state, $a_i(t+1)$. These "artificial neurons" are organized into layers within the neural network. There are three types of layers:
(1) Input layer: this layer houses the neurons that receive information from the outside, $x_j$ (initial variables).
(2) Hidden layers: hidden layers are in charge of relating neurons from the input layer to the output layer neurons.
(3) Output layer: the output layer contains neurons whose output represents the prediction.
Given that the neural network used was the Multilayer Perceptron network, the information always feeds forward and learning is supervised by back-propagation. The generic neural network is shown in Figure 1.
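The behaviour of a single artificial neuron described above can be illustrated with a minimal Python sketch. It is not taken from the study: the function name `neuron_output`, the signals and the weightings are hypothetical values chosen only for illustration, and the hyperbolic tangent is assumed as the activation function.

```python
import math

def neuron_output(inputs, weights, threshold, activation=math.tanh):
    """Return the new activation state of one artificial neuron.

    The total input is the weighted sum of the incoming signals plus the
    threshold, which acts as a weight on a constant input of 1.
    """
    net = sum(w * x for w, x in zip(weights, inputs)) + threshold
    return activation(net)

# Hypothetical incoming signals and weightings (illustration only).
signals = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.1]
print(neuron_output(signals, weights, threshold=0.1))
```

In a Multilayer Perceptron, every neuron of a hidden layer applies this same computation to the outputs of the previous layer.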

AEA 28,82
The application process is divided into three stages: the functioning stage, the learning stage and the validation stage.
Functioning stage: In this stage, the input vector $X = (x_1, x_2, \ldots, x_{n_1})$ and the output vector $Y = (y_1, y_2, \ldots, y_{n_C})$ are considered, and the weightings are introduced to obtain the output of the different neurons:
1. Output of the neurons in the input layer:
$$a_i^{1} = x_i, \quad i = 1, 2, \ldots, n_1 \qquad (1)$$
2. Output of the neurons in the hidden layers, which requires the total input:
$$Net_i^{C-t+1} = \sum_{j=1}^{n_{C-t}} w_{ji}^{C-t}\, a_j^{C-t} + u_i^{C-t+1} \qquad (2)$$
where $a_j^{C-t}$ are the outputs of the neurons in layer $C-t$, $w_{ji}^{C-t}$ are the weightings of the connections of the neurons from layer $C-t$ with those of layer $C-t+1$ and $u_i^{C-t+1}$ are the thresholds of the neurons of layer $C-t+1$, which are normally treated as just another connection whose input is a constant of 1. This input is transformed by an activation function, $F$; the most common functions are the sigmoidal and hyperbolic tangent functions:
$$F(x) = \frac{1}{1 + e^{-x}}, \qquad F(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (3)$$
Thus, the output of the neurons in the hidden layers, $a_i^{c}$, will be as follows:
$$a_i^{c} = F\left(\sum_{j=1}^{n_{c-1}} w_{ji}^{c-1}\, a_j^{c-1} + u_i^{c}\right), \quad i = 1, 2, \ldots, n_c \ \text{and} \ c = 2, 3, \ldots, C-1 \qquad (4)$$
3. Output of the neurons in the output layer: in this case the output of the neurons in the output layer was subject to a different function, the Softmax function, which is suitable when the dependent variable is categorical (as in our study):
$$y_i = a_i^{C} = \frac{\exp\left(Net_i^{C}\right)}{\sum_{k=1}^{n_C} \exp\left(Net_k^{C}\right)}, \quad i = 1, 2, \ldots, n_C \qquad (5)$$
Learning stage: In this stage, the network is trained to minimize error:
$$\min_{W} E \qquad (6)$$
where $W$ is the set of weightings and thresholds of the network and $E$ is an error function that evaluates the differences between the network outputs and the desired outputs. The total cross-entropy error is as follows:
$$E = \sum_{n=1}^{N} e(n) \qquad (7)$$
where $N$ is the total number of patterns or samples and $e(n)$ is the error committed by the network for pattern $n$.
The cross-entropy error for each pattern, $e(n)$, was obtained as follows:
$$e(n) = -\sum_{i=1}^{n_C} s_i(n) \ln y_i(n) \qquad (8)$$
where $Y(n) = (y_1(n), y_2(n), \ldots, y_{n_C}(n))$ is the network output vector for pattern $n$ and $S(n) = (s_1(n), s_2(n), \ldots, s_{n_C}(n))$ is the desired network output vector for pattern $n$. A total of $m$ learning cycles or epochs were performed to minimize the total error $E$.
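The functioning stage and the per-pattern cross-entropy error can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the software used in the study: it assumes a single hidden layer with the hyperbolic tangent and a softmax output layer, and all function names, weightings and input values are hypothetical.

```python
import math

def softmax(values):
    # Subtracting the maximum keeps exp() numerically stable.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def net_input(prev_outputs, weights, thresholds):
    # weights[i][j] connects neuron j of the previous layer to neuron i.
    return [sum(w * a for w, a in zip(row, prev_outputs)) + u
            for row, u in zip(weights, thresholds)]

def forward(x, hidden_w, hidden_u, out_w, out_u):
    # Hidden layer: hyperbolic tangent; output layer: softmax.
    hidden = [math.tanh(n) for n in net_input(x, hidden_w, hidden_u)]
    return softmax(net_input(hidden, out_w, out_u))

def cross_entropy(outputs, targets):
    # Error for one pattern: minus the sum of s_i * ln(y_i).
    return -sum(s * math.log(y) for y, s in zip(outputs, targets) if s > 0)

# Hypothetical network with 2 inputs, 2 hidden neurons and 2 output categories.
y = forward([0.5, -1.2],
            hidden_w=[[0.1, -0.2], [0.4, 0.3]], hidden_u=[0.0, 0.1],
            out_w=[[0.7, -0.5], [-0.7, 0.5]], out_u=[0.0, 0.0])
print(y, cross_entropy(y, targets=[1, 0]))
```

The softmax outputs sum to one, so they can be read as pseudo-probabilities of each category; summing `cross_entropy` over all patterns gives the total error that training seeks to minimize.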
Validation stage: When the number of parameters is excessive, the model fits irrelevant particularities of the training data too closely and loses its ability to generalize (overfitting). To avoid this problem, we used a second set of data, called the validation set, to evaluate the network error after each learning cycle and determine the moment at which it begins to increase. Training is therefore stopped when the validation error increases, and the parameters from the previous learning cycle are retained (early stopping). Finally, to measure the network's capacity to generalize, a third set of data was required, the testing set, which provides an unbiased estimate of the generalization error.
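The early stopping rule can be sketched as a short loop. The sketch below is illustrative only: `train_epoch` and `validation_error` are hypothetical callbacks standing in for one learning cycle and for the error measured on the validation set, and the error sequence in the example is invented.

```python
def train_with_early_stopping(train_epoch, validation_error, max_epochs=1000):
    """Stop when the validation error increases and keep the previous parameters.

    train_epoch() runs one learning cycle and returns a snapshot (copy) of the
    parameters; validation_error(params) returns the error on the validation set.
    """
    best_params, best_error = None, float("inf")
    for _ in range(max_epochs):
        params = train_epoch()
        error = validation_error(params)
        if error > best_error:
            break  # validation error has started to increase
        best_params, best_error = params, error
    return best_params, best_error

# Toy example: the "parameters" are just the epoch index (hypothetical values).
errors = [0.9, 0.6, 0.4, 0.5, 0.3]
state = {"epoch": -1}

def fake_epoch():
    state["epoch"] += 1
    return state["epoch"]

best_params, best_error = train_with_early_stopping(
    fake_epoch, lambda p: errors[p], max_epochs=len(errors))
print(best_params, best_error)  # 2 0.4
```

Training halts at the fourth cycle, where the validation error rises from 0.4 to 0.5, and the parameters of the third cycle are retained even though a later cycle would have produced a lower error; this is the price of stopping at the first increase.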

Data and variables
To build a predictive model of financial distress in the Spanish banking system, we used information from banks, savings banks and credit cooperatives during the 2012-2016 period. We limited our study to entities that are classified as credit institutions by the Bank of Spain and did not include the credit institutions on which data was not available. Therefore, the sample used for the study comprised 148 credit institutions. Specifically, we used 59 Banks, 16 Savings Banks and 73 Credit Cooperatives.
When predicting short-term financial distress in the Spanish banking sector, a distinction must be made between the categorical variable, representing a situation of distress (dependent variable), and the variables used to explain financial distress (independent variables).
To determine when an entity was in financial distress, it was first necessary to establish the indicators. In this study, we considered that a bank was in financial distress when it faced one of the following situations:
(1) Bankruptcy. This is the most serious financial problem that may affect a banking entity and the subject of the majority of independent financial distress studies carried out (Serrano and Martín, 1993; Bongini et al., 2001; Betz et al., 2014; Chiaramonte and Casu, 2017; Constantin et al., 2018; Inam et al., 2018; Lahmiri and Bekiros, 2019).
(2) The entity has not met its coupon payment obligations or has delayed payment. Failure to pay interest on debts is a clear sign that a company has liquidity problems in meeting its obligations (Angelini et al., 2007; Curry et al., 2007; Betz et al., 2014; Constantin et al., 2018).
(3) The entity requires the intervention of the Deposit Guarantee Fund (DGF). The intervention of the DGF to return the deposits made by the clients of a bank is a clear sign of its inability to meet its commitments with its clients (Laffarga et al., 1985; Pina, 1989). Bell et al. (1990), Thomson (1991) and Cole and Gunther (1998) include insured banks that require funds from the Federal Deposit Insurance Corporation.
(4) The entity or a part of its assets is absorbed by another entity. The fact that an entity or part of its assets has been absorbed by another is an indication that it is not functioning correctly on its own or has serious liquidity problems (Pina, 1989). González-Hermosillo (1999), Bongini et al. (2001) and Chiaramonte and Casu (2017), amongst others, include banks that were absorbed by another bank or banks. Bell et al. (1990) include banks whose deposits are absorbed by others.
(5) The entity has merged, with a coverage ratio of less than 0. A good measure of whether a bank has merged because of problems is the coverage ratio (González-Hermosillo, 1999), a variable that makes it possible to differentiate between mergers forced by the financial problems of one or more of the entities taking part and mergers that take place for other reasons. According to this variable, a financial entity is in financial distress if its coverage ratio was less than 0 in the year prior to the merger, with the coverage ratio being the ratio of loan capital and reserves minus impaired loans to total assets (Betz et al., 2014; Constantin et al., 2018). Other researchers that include mergers are Bell et al. (1990), Bongini et al. (2001), Curry et al. (2007) and Chiaramonte and Casu (2017).
(6) The entity has received different forms of public aid. Public aid for restructuring (primarily through the "Fondo de Reestructuración Bancaria") or the bailing out of an entity is an obvious indication of financial problems and of the latent inability to operate independently (Bell et al., 1990; Bongini et al., 2001; Betz et al., 2014; Constantin et al., 2018; Chiaramonte and Casu, 2017).
Therefore, financial distress is understood as the financial problems faced by an entity that prevent it from independently meeting its obligations, thus resulting in the requirement for external aid to be able to continue operating, either by means of a merger, acquisition, intervention by a consumer protection authority or public aid, with the most serious case of financial distress being bankruptcy.
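The definition of the dependent variable can be expressed as a simple labelling rule. The following Python sketch is an assumption-laden illustration rather than the authors' procedure: the class `EntityEvents`, its field names and the balance-sheet figures are hypothetical, and the coverage ratio is computed as capital and reserves minus impaired loans, divided by total assets, following the description given above.

```python
from dataclasses import dataclass
from typing import Optional

def coverage_ratio(capital_and_reserves, impaired_loans, total_assets):
    # Assumed form: (capital and reserves - impaired loans) / total assets.
    return (capital_and_reserves - impaired_loans) / total_assets

@dataclass
class EntityEvents:
    # Hypothetical record of the events observed for one entity in one year.
    bankruptcy: bool = False
    missed_coupon: bool = False
    dgf_intervention: bool = False
    absorbed: bool = False
    public_aid: bool = False
    merged: bool = False
    coverage_ratio_prior_year: Optional[float] = None

def in_financial_distress(e):
    # A merger counts only if the coverage ratio was negative the year before.
    forced_merger = (e.merged
                     and e.coverage_ratio_prior_year is not None
                     and e.coverage_ratio_prior_year < 0)
    return any([e.bankruptcy, e.missed_coupon, e.dgf_intervention,
                e.absorbed, e.public_aid, forced_merger])

# Hypothetical merged entity whose impaired loans exceed capital and reserves.
ratio = coverage_ratio(capital_and_reserves=40.0, impaired_loans=55.0,
                       total_assets=1000.0)
print(ratio, in_financial_distress(
    EntityEvents(merged=True, coverage_ratio_prior_year=ratio)))
```

A merged entity with a non-negative coverage ratio and no other event would be labelled as healthy under this rule.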
To obtain the necessary data to determine the different situations of financial distress mentioned above, it was necessary to use the sources indicated in Table I.
The independent or explanatory variables chosen for the model were a series of financial ratios classified according to the CAMELS framework, the parameters of which are used to evaluate an entity's financial solvency (Roman and Şargu, 2013). The variables classified by the CAMEL or CAMELS system have been used by a number of researchers to study financial problems (Thomson, 1991; Cole and Gunther, 1998; Kumar and Ravi, 2007; Poghosyan and Cihak, 2009; Roman and Şargu, 2013; Betz et al., 2014; Constantin et al., 2018). However, in addition to these variables, we introduced several macroeconomic variables into the model because of their proven impact on banking entities' financial problems (González-Hermosillo, 1999; Curry et al., 2007). We thus obtained 52 independent variables to explain financial distress (Table II).
Given that the objective of this study is to predict short-term financial distress, the prediction model is constructed by taking the explanatory variables (CAMELS variables and macroeconomic variables) as of 31 December of the year before that in which the entity was in a situation of financial distress. In this manner, the model makes it possible to predict, from the financial information and the macroeconomic situation, whether an entity will be in financial distress over the next 12 months. However, not all entities in the study have been through a situation of distress. For entities that have not been in financial distress, the explanatory variables of the last year for which financial information was available were used.
Because some entities have had more than one situation of financial distress, the sample comprises a total of 151 observations, of which 32 show a situation of financial distress.
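The way observations are derived from entities can be sketched as follows. This is a minimal illustration assuming yearly data; the function name `build_observations` and the years in the example are hypothetical.

```python
def build_observations(distress_years, last_year_with_data):
    """Return (reference_year, label) pairs for one entity.

    Each distress event is paired with the explanatory variables of the
    previous year end; an entity without events contributes one healthy
    observation dated at its last year with available data.
    """
    if distress_years:
        return [(year - 1, 1) for year in sorted(distress_years)]
    return [(last_year_with_data, 0)]

# Hypothetical entities: one with two distress events, one with none.
print(build_observations([2015, 2013], last_year_with_data=2016))
print(build_observations([], last_year_with_data=2016))
```

An entity with several distress events therefore yields several observations, which is why the number of observations exceeds the number of entities.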

Results
Once we had determined the data and working sample, we proceeded to apply the "Multilayer Perceptron" to obtain a network that could predict Spanish credit institution financial distress in the short term.
Because missing values were found in certain independent variables, we replaced them with the mean of the variable, taken as its expected value.
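Mean imputation of this kind can be sketched in Python as below. The function name `impute_with_mean` and the use of `None` to mark a missing value are assumptions made for the example.

```python
def impute_with_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

# Hypothetical ratio with one missing value.
print(impute_with_mean([1.0, None, 3.0]))  # [1.0, 2.0, 3.0]
```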

According to the proposed model, to obtain an artificial neural network with a high prediction capacity that is not affected by overfitting, the sample was divided into three sub-samples:
(1) Training: a sub-sample in which the weightings and thresholds were established to reduce overall error.
(2) Testing: a sub-sample to monitor the errors committed during training to avoid overtraining.
(3) Reserve: a sub-sample used to evaluate the network's capacity to generalize.
We decided to assign 75 per cent of the total sample to the training phase, 15 per cent to testing and 10 per cent to the reserve, thus maintaining the approximate proportion between healthy situations and those of financial distress (Table III).
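A partition that preserves the proportion of healthy and distressed observations can be sketched as a stratified split. The sketch below is illustrative: the function `stratified_split`, the random seed and the rounding rule are assumptions, so the resulting sub-sample sizes need not coincide with those of Table III. The 119 healthy and 32 distressed observations follow from the 151 observations described earlier.

```python
import random

def stratified_split(labels, fractions=(0.75, 0.15, 0.10), seed=0):
    """Split observation indices by class so each part keeps the class mix."""
    rng = random.Random(seed)
    parts = {"training": [], "testing": [], "reserve": []}
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_test = round(fractions[1] * len(members))
        parts["training"] += members[:n_train]
        parts["testing"] += members[n_train:n_train + n_test]
        # The reserve sub-sample receives whatever remains.
        parts["reserve"] += members[n_train + n_test:]
    return parts

labels = [0] * 119 + [1] * 32  # 0 = healthy, 1 = financial distress
parts = stratified_split(labels)
for name, idx in parts.items():
    print(name, len(idx), sum(labels[i] for i in idx))
```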
The activation function used was the hyperbolic tangent for hidden layers and the softmax function for the output layer. For the architecture, we used automatic selection because it adjusts the network better in general.
Batch training was selected as the most suitable type of training for small samples because it minimizes total error. The optimization algorithm used was gradient descent, with a learning rate of 0.4 and a momentum of 0.9.
The neural network obtained had a single hidden layer and the network architecture was 53 × 5 × 2 (53 input variables, 5 nodes in the hidden layer and 2 output variables).
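The role of the learning rate and the momentum can be seen in a generic gradient descent update. The sketch below is a textbook formulation and is only assumed to resemble the one implemented in the software used; the function `momentum_step` and the quadratic error function in the example are hypothetical.

```python
def momentum_step(weights, gradients, velocity, learning_rate=0.4, momentum=0.9):
    """One batch update: the new step blends the previous step with the gradient."""
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradients)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity

# Toy error function E(w) = w^2, whose gradient is 2w (illustration only).
weights, velocity = [1.0], [0.0]
for _ in range(300):
    gradients = [2.0 * w for w in weights]
    weights, velocity = momentum_step(weights, gradients, velocity)
print(weights)
```

In batch training, the gradient used in each step is computed over the whole training sub-sample, so one update corresponds to one epoch.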
With regard to the predictions obtained using the network, Table IV shows the results of each sub-sample. The cross-entropy error in the training sub-sample was 4.850, with 1.8 per cent of incorrect predictions. In the testing sub-sample, the cross-entropy error was 1.921 and the percentage of incorrect predictions was 4.3 per cent. Finally, the reserve sub-sample showed a percentage of incorrect predictions of 6.3 per cent. The network's prediction accuracy was therefore high, with the reserve sub-sample having the lowest percentage of correct predictions (93.7 per cent). It can also be observed that the cross-entropy error was reduced for both the training and testing sub-samples when building the network, which indicates that it is not overfitted.
Breaking down the results of each sub-sample, Table V shows that in the training sub-sample, 100 per cent of financial distress situations were correctly predicted, with a success rate of 97.8 per cent in the classification of healthy companies. The model was therefore mistaken in only 2.2 per cent of these cases, which is the probability of a company without financial problems being classified as in financial distress (type II error). In the testing sub-sample, as in training, 100 per cent of financial distress situations were correctly predicted, with a type II error of 5.6 per cent. Finally, in the reserve sub-sample or model validation sample, 100 per cent of financial distress situations were correctly predicted, with a success rate of 91.7 per cent for predicting that a bank does not have problems.
It can therefore be concluded that the model's probability of overall success is 97.3 per cent.
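The success rates discussed above are all derived from a classification table. A minimal Python sketch of the calculation follows; the counts used in the example are hypothetical and are not taken from Table V, and the function name `classification_rates` is an assumption.

```python
def classification_rates(tp, fn, tn, fp):
    """Rates from a 2x2 classification table.

    tp/fn: distressed entities classified correctly/incorrectly;
    tn/fp: healthy entities classified correctly/incorrectly.
    """
    return {
        "distress_success": tp / (tp + fn),
        "healthy_success": tn / (tn + fp),
        # Healthy entity classified as distressed (type II error in this paper).
        "type_ii_error": fp / (tn + fp),
        "overall_success": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts for a small hold-out sample (illustration only).
rates = classification_rates(tp=4, fn=0, tn=11, fp=1)
for name, value in rates.items():
    print(f"{name}: {100 * value:.1f} per cent")
```

With no distressed entity missed, the only source of error is the healthy entity flagged as distressed, which is the pattern described in the text.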
To analyze the sensitivity and specificity of the network (the probability of correctly classifying a positive case and a negative case, respectively), we used receiver operating characteristic (ROC) curves based on the pseudo-probability obtained from the network. This tool evaluates the efficiency of the classification of a dependent variable by contrasting sensitivity with specificity in each dependent variable category:
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}$$
where $TP$ and $TN$ are the numbers of positive and negative cases classified correctly, $FN$ is the number of positive cases classified as negative and $FP$ is the number of negative cases classified as positive. As there are only two categories, the curves are symmetric. Figure 2 shows that the ROC curves lie close to the upper left corner, indicating that there are clear differences between the credit institutions in financial distress in the short term and those that are not in such a situation.
The probability of correctly distinguishing credit institutions in financial distress from those that are not is more accurately measured by the area under the curve. Table VI shows that there is a 99.7 per cent probability of a correct classification, which demonstrates the effectiveness of the network.
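The area under the ROC curve can be computed directly from the pseudo-probabilities as the share of (distressed, healthy) pairs in which the distressed entity receives the higher score. The Python sketch below uses this pairwise definition; the function name `area_under_curve` and the scores in the example are hypothetical.

```python
def area_under_curve(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical pseudo-probabilities of distress for four entities.
print(area_under_curve([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0]))  # 0.75
```

A value of 1 means that every distressed entity is scored above every healthy one, and a value of 0.5 corresponds to a classification no better than chance.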
It is also useful to know which model variables are the most relevant, with the relevance being determined as the change in the model when such variables are altered. This determines the variables that contribute the most to determining that a credit institution will be in financial distress or not within the next 12 months (Table VII).  The most relevant variable in the neural network, with a percentage of 100 per cent, is the ratio between earnings from interest and average gross loans, followed by that of provisions for losses on impaired loans/average gross loans (83.9 per cent). This result gives an indication of the fundamental role of the earnings and risk coverage in the financial solvency of a credit institution.
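One simple way to express relevance as the change in the model output when a variable is altered, scaled so that the most relevant variable scores 100 per cent, is sketched below. This perturbation scheme is an assumption made for illustration and is not necessarily the procedure behind Table VII; the function `normalized_importance`, the step `delta` and the toy model are hypothetical.

```python
def normalized_importance(predict, rows, delta=0.1):
    """Average absolute change in the prediction when each input is shifted.

    Returns percentages scaled so that the most influential input is 100.
    """
    n_vars = len(rows[0])
    raw = []
    for j in range(n_vars):
        change = 0.0
        for row in rows:
            shifted = list(row)
            shifted[j] += delta
            change += abs(predict(shifted) - predict(row))
        raw.append(change / len(rows))
    top = max(raw)
    if top == 0:
        return [0.0] * n_vars
    return [100.0 * r / top for r in raw]

# Toy model in which the first input matters four times as much as the second.
toy_model = lambda r: 0.8 * r[0] + 0.2 * r[1]
importance = normalized_importance(toy_model, [[0.0, 0.0], [1.0, 2.0]])
print(importance)
```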
In addition, among the macroeconomic variables used, the most relevant for the distress affecting banks is the price of housing, which highlights the property market bubble as one of the main triggers of financial distress in the Spanish banking sector, closely followed by the yield on public debt.
Having demonstrated the high capacity of artificial neural networks to predict the short-term financial distress of a credit institution, it is interesting to compare the forecasting capacity of this methodology with another. In this work, based on previous literature (Odom and Sharda, 1990; Coats and Fant, 1992; Wilson and Sharda, 1994; Inam et al., 2018), we compare artificial neural networks with multivariate discriminant analysis for predicting short-term financial distress in the Spanish banking system.
To apply the multivariate discriminant analysis, 112 observations were selected for training, the same number as for the neural network. The remaining 39 observations were not selected and were used for the contrast. Table VIII shows the results obtained by applying multivariate discriminant analysis. For the selected observations, the multivariate discriminant analysis correctly predicts 97.8 per cent of the entities that will not be in short-term financial distress and 91.3 per cent of the entities that will enter financial distress. It thus obtains a global success rate of 96.4 per cent for the selected observations. For non-selected observations, however, the success rate for entities that will not be in financial distress is 90 per cent, whereas for entities that will be in financial distress in the short term, it is 100 per cent. Thus, the overall success rate for non-selected observations is 92.3 per cent.
Comparing the results obtained with artificial neural networks and multivariate discriminant analysis, the prediction capacity for the selected sub-sample is greater with artificial neural networks. Similarly, the success rate of multivariate discriminant analysis in the sub-sample of non-selected cases is lower than the success rates obtained with artificial neural networks for the testing and reserve sub-samples.
Based on the obtained results, artificial neural networks are an effective and robust method for predicting short-term financial distress in credit institutions. This finding is consistent with that obtained by Bell et al. (1990), Odom and Sharda (1990), Coats and Fant (1992), Wilson and Sharda (1994), Rafiei et al. (2011) and Inam et al. (2018).

Conclusions
The prediction of financial distress in its multiple forms has been a key objective in the study of credit institutions throughout the world. In light of the events that have taken place in recent years, the creation of a method that enables the prediction of the consequences of credit institution insolvency is of vital interest.
Focusing on Spain and the entire spectrum of entities that operate as credit institutions, the determining factors of financial distress were defined as bankruptcy, failure to meet financial obligations, the intervention of the DGF, the absorption or acquisition of assets, mergers because of problems and government aid, with the aim of including as many situations as possible in the concept of financial distress.
In line with the established literature, we used different ratios based on the CAMELS framework as explanatory variables, as well as macroeconomic variables because of their impact on the financial situation of such entities.
Using artificial neural networks, specifically the Multilayer Perceptron network with a hidden layer, we obtained a prediction model that is capable of predicting short-term financial distress with an overall accuracy of more than 97 per cent using training, testing and reserve sub-samples. This degree of prediction accuracy is well above the average obtained by other authors in previous studies on the concept of credit institution financial distress.
We therefore consider that this research contributes to the financial distress literature by providing the first neural network model applied in Spain to predict financial distress. It should also be pointed out that this study is one of the few that has been carried out with a reserve sub-sample, which makes it possible to assess the model's capacity to generalize and to guard against overfitting, a problem that is common in this type of model. Furthermore, the only error observed was a type II (false positive) error, indicating that in the sample used, there were no cases whatsoever in which an entity in financial distress in a period of 12 months was not correctly predicted.
The reliability of the results was assessed with ROC curves, which showed major differences between entities suffering and not suffering from financial distress. Specifically, the network obtained showed a discrimination capacity of 99.7 per cent, based on the accounting and macroeconomic data recorded in the 12 months prior to the event.
Moreover, to demonstrate the robustness and adequacy of artificial neural networks compared to other methods, multivariate discriminant analysis was applied to the data used in the study, showing that neural networks have a greater predictive capacity than multivariate discriminant analysis.
Finally, bearing in mind the relevance of the different variables in the model, relations were detected between property bubbles and the financial problems suffered by credit institutions, leaving the door open to future studies in this field.