Can agricultural credit scoring for microfinance institutions be implemented and improved by weather data?

Ulf Römer (Department for Agricultural Economics and Rural Development, University of Göttingen, Göttingen, Germany)
Oliver Musshoff (Department for Agricultural Economics and Rural Development, University of Göttingen, Göttingen, Germany)

Agricultural Finance Review

ISSN: 0002-1466

Publication date: 5 February 2018

Abstract

Purpose

In recent years, credit scoring has become popular in urban microfinance institutions (MFIs), while rural MFIs, which mainly lend to agricultural clients, are hesitant to adopt it. The purpose of this paper is to explore whether microfinance credit scoring models are suitable for agricultural clients, and whether such models can be improved for agricultural clients by accounting for precipitation.

Design/methodology/approach

This study merges two data sets: 24,219 loan and client observations provided by the AccèsBanque Madagascar and daily precipitation data made available by CelsiusPro. An in- and out-of-sample splitting separates model building from model testing. Logistic regression is employed for the scoring models.

Findings

The credit scoring models perform equally well for agricultural and non-agricultural clients. Hence, credit scoring can be applied to the agricultural sector in microfinance. However, the prediction accuracy does not increase with the inclusion of precipitation in the agricultural model. Therefore, simple correlation analysis between weather events and loan repayment is insufficient for forecasting future repayment behavior.

Research limitations/implications

The results should be verified in different countries and climate contexts to enhance the robustness.

Social implications

By applying scoring models to agricultural clients as well, all clients can benefit from an improved risk assessment (e.g. faster decision making).

Originality/value

To the best of the authors’ knowledge, this is the first study investigating the potential of microfinance credit scoring for agricultural clients in general and for Madagascar in particular. Furthermore, this is the first study that incorporates a weather variable into a scoring model.

Citation

Römer, U. and Musshoff, O. (2018), "Can agricultural credit scoring for microfinance institutions be implemented and improved by weather data?", Agricultural Finance Review, Vol. 78 No. 1, pp. 83-97. https://doi.org/10.1108/AFR-11-2016-0082

Publisher: Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited


1. Introduction

The competition in urban microfinance sectors is high, and various microfinance institutions (MFIs) are often vying for the same clients (Caudill et al., 2009). This high level of competition forces MFIs into more cost-saving behaviors (Copestake, 2007). In this context, the risk assessment of loan applicants is becoming the focus of lenders (Prior and Argandoña, 2009). Compared to conventional banking, which relies mainly on collateral and business documentation, the cash flow-based approach of microfinance requires verification of client information prior to loan disbursement, which is time-consuming and thus costly (Armendáriz de Aghion and Morduch, 2000).

In order to decrease evaluation costs, MFIs introduced credit scoring (Bumacov et al., 2014). Credit scoring is a statistical method used to forecast the risk of a single client[1]. Thereby, a link between certain loan applicant characteristics and loan repayment behavior is established. This information is later used to predict the potential occurrence of a pre-defined event, such as a loan default, based on the characteristics of a new loan applicant (Schreiner, 2004). Once the probability of a loan default is estimated, clients are assigned to a certain risk category. In this way, credit scoring has the potential to lower operational costs by assisting loan officers in decision making (Bumacov et al., 2014; De Cnudde et al., 2015; Dinh and Kleimeier, 2007; Ince and Aktan, 2009; Schreiner, 2004). Credit scoring is supported by computers and thus is fully automated (De Cnudde et al., 2015). At the same time, scoring is able to handle large numbers of loans (Ince and Aktan, 2009).

Owing to these advantages, credit scoring has become popular in semi-urban and urban MFIs (e.g. Schreiner, 2004). However, in rural areas where agricultural clients predominate, MFIs are hesitant to adopt credit scoring (Wenner et al., 2007). A possible explanation for this hesitation might be a lack of knowledge. In this context, (agricultural) scoring models could contribute to improving the risk assessment of rural MFIs. Furthermore, as semi-urban and urban microfinance lenders extend their business to rural areas, their risk assessment should be adapted to granting more loans to agricultural clients, e.g. through specific scoring models. In the past, semi-urban and urban MFIs hesitated to lend to the agricultural sector because it is associated with a higher risk (de Nicola, 2015; Weber and Musshoff, 2012). With this in mind, agricultural scoring models could be beneficial for expanding MFIs as well.

One reason why lenders hesitate to lend to agricultural businesses is their exposure to external production factors (de Nicola, 2015). External factors, such as weather conditions, are found to influence the creditworthiness of a borrower (Castro and Garcia, 2014). Owing to a changing climate, the production risk of agriculture is predicted to increase further in the future (Finger and Schmid, 2008). An important weather variable for agriculture which could capture weather conditions in a credit scoring model is precipitation (Barnett and Mahul, 2007). To the best of our knowledge, the estimation of an agricultural scoring model in general and the inclusion of precipitation in a scoring model in particular are both absent in the microfinance literature.

Therefore, the main idea of this paper is to design an agriculture-specific credit scoring model. Furthermore, this study aims to contribute to the empirical literature on microfinance credit scoring by comparing the performance of the agricultural and non-agricultural models. In addition, we attempt to refine the agricultural scoring model by accounting for precipitation as an external production factor. This analysis is conducted using data from Madagascar, which has an economic and social situation typical for Africa (Minten et al., 2009). MFI and precipitation data are merged and used to estimate loan default by a logistic regression model.

There is a broad literature on agricultural scoring dating back to the last century, e.g. see Turvey and Brown (1990). However, many studies focus on credit scoring of agricultural financial institutions or their portfolio as such, e.g. see Behrens and Pederson (2007) or Gustafson et al. (2005). In this context, Castro and Garcia (2014) mention the need to take climate conditions into consideration when evaluating loan portfolios of agricultural or rural banks. In our study, the focus is instead solely on credit scoring of non-agricultural and agricultural customers. In the literature on customer credit scoring, the overall aim was to determine characteristics that can be linked to loan default, e.g. see Gallagher (2001), Savitha and Kumar (2016) and Zech and Pederson (2003). These studies have the following in common: they apply logistic regression, though multinomial and OLS regression were used in addition by Savitha and Kumar (2016) and Zech and Pederson (2003), respectively. Data sources include a questionnaire, farm and loan data with sample sizes ranging from 40 to 877 observations (Gallagher, 2001; Dixon et al., 2011). Our study also applies logistic regression using loan data from a commercial MFI containing 24,219 observations in total. However, we extend the literature by incorporating precipitation data into the scoring model. Furthermore, we shift the focus from linking single characteristics to loan default toward the ability of the scoring model to predict loan default.

Results indicate that microfinance agricultural scoring models predict repayment performance similarly well compared to non-agricultural scoring models. However, the inclusion of precipitation data into the agricultural scoring model does not improve the prediction accuracy.

The remainder of this paper is structured as follows: in the next section, a literature review leading to the research hypotheses is provided. A description of the data set is given in Section 3. In Section 4, the model building procedure is presented along with the underlying logistic regression models. This is followed by the results and discussion in Section 5. Finally, Section 6 contains a conclusion and suggestions for further research.

2. Literature review and hypotheses

Microfinance credit scoring does indeed seem to work well and improve risk management systems. First, scoring increases efficiency of loan assessment (Bumacov et al., 2014). Bumacov et al. (2014) summarized credit scoring as an option to increase the productivity of loan officers in MFIs. Second, Schreiner (2004) highlighted the advantages of credit scoring models for increasing objectivity and being able to reflect complex causal relationships despite the multiple influences on credit risk. Third, both De Cnudde et al. (2015) and Van Gool et al. (2012) mentioned the advantage of scoring as an automated process which assists the loan officer as a refinement tool in the process of lending decisions. This combination of statistical tools and human best practices diminishes credit risk and loan default (Van Gool et al., 2012). The improvement of risk management is eventually reflected in cost reduction (Ince and Aktan, 2009; Schreiner, 2004).

Perhaps these advantages are the reason why scoring seems to become more and more popular in the microfinance sector. In this context, Bumacov et al. (2014) conducted an online survey in which they estimated the prevalence of scoring. Out of 405 MFIs that participated in their survey, 403 stated that they apply some type of credit scoring, whereas only two MFIs stated that they are currently not using scoring due to bad experiences with it. A broad credit scoring literature already exists for developing countries. The geographical coverage of this literature includes Africa (Kammoun and Triki, 2016; Kinda and Achonu, 2012), Asia (Dinh and Kleimeier, 2007), Eastern Europe (Van Gool et al., 2012) and Latin America (Blanco et al., 2013; Schreiner, 2004). In all regions, scoring models are found to be suitable to either determine credit risk or support loan officers in decision making (Blanco et al., 2013; Dinh and Kleimeier, 2007; Kammoun and Triki, 2016; Kinda and Achonu, 2012; Schreiner, 2004; Van Gool et al., 2012).

However, scoring seems to be less common in the agricultural microfinance sector. Wenner et al. (2007) examined the risk management of 42 MFIs in Latin America that are engaged in agricultural lending, and revealed that only one MFI in their analysis applies a scoring model. Agricultural loans substantially differ from non-agricultural loans in their repayment capacity. Seasonality affects the agricultural sector, especially for plant growers. The time gap between capital investment (seeds, fertilizer, etc.) and revenues (harvesting time) is challenging because it does not fit standard microfinance lending products. To address farmers’ needs, some MFIs offer loan products in the form of flexible loans (Field and Pande, 2008). In addition, microfinance credit scoring cannot work as a black box. Hence, for the introduction of a scoring model, an agricultural loan officer requires training on agricultural business cycles as well as scoring itself (Swinnen and Gow, 1999). Perhaps this might be seen as an unnecessary inflation of the already complicated lending process, which has been a criticism of scoring in the past.

However, credit scoring could be a necessary innovation in order to expand microfinance in rural areas (Morvant-Roux, 2011). Outside of the agricultural sector, there is already an increasing interest in applying scoring as a risk-management tool (Bumacov et al., 2014). We argue that credit scoring will be an effective risk-management tool for agricultural loans in the microfinance sector too, given that a sufficiently high prediction accuracy can be achieved. Ultimately, this would be a decision support tool to lower lending costs in rural areas. Therefore, we investigate whether credit scoring models for agricultural loans are able to predict default risk correctly. Our first hypothesis is:

H1.

“Equality”: the prediction accuracy of microfinance credit scoring models for agricultural loans is as good as for non-agricultural loans.

In the microfinance sector, lending to agricultural and rural clients is perceived as risky (Fernando, 2007). In this context, Weber and Musshoff (2012) investigated whether agricultural lending is indeed more risky than non-agricultural lending. The research was based on the example of a Tanzanian MFI. Weber and Musshoff (2012) showed that agricultural clients do face obstacles in accessing loans, which confirms the initial perception that these clients are riskier. However, Weber and Musshoff (2012) found agricultural loans actually show a better repayment performance than non-agricultural loans. This finding is in line with Baklouti (2014) and Van Gool et al. (2012), who both reported a better repayment performance of agricultural loans.

The negative perception of agriculture, however, may originate from its seasonality and external production risk. For a crop farmer with seasonal cash flows, frequent repayments starting soon after loan disbursement are problematic (Fernando, 2007). This volatility in the business cycle is seen as a threat to repayment (de Nicola, 2015). Additionally, experiencing a low yield or a crop failure can aggravate the situation even further. The underlying reasons for unpredictable agricultural output are external factors such as pests, diseases, and extreme weather like drought and flood. De Nicola (2015) mentions that the risk structure of agricultural businesses is the reason why MFIs are reluctant to lend to them. For instance, Castellani and Cincinelli (2015) emphasized that droughts negatively affect most African MFIs and even put rural MFIs’ sustainability at risk. Collier et al. (2011) investigated the effect of the weather event El Niño on loan portfolios, and found that this weather pattern causes repayment trouble and increased loan defaults. Furthermore, de Nicola (2015) found that climatic factors explain variation in loan default. In addition, Castro and Garcia (2014) also found a significant effect of climatic factors on default of agricultural loans. In summary, even though farmers show a good repayment performance, the direct dependence of production on external weather factors leads to the perception that agriculture is riskier than other businesses.

Additionally, climate change is likely to exacerbate this situation. There is a growing concern about climate change affecting agricultural production (Khandker and Koolwal, 2016). Weather patterns such as heat waves and heavy precipitation are predicted to become even more volatile and extreme (Coumou and Rahmstorf, 2012). In general, yield levels in Sub-Saharan Africa are expected to fall (Schlenker and Lobell, 2010). In Madagascar in particular, the production of the staple crops maize and rice is predicted to decrease due to climate change (Lobell et al., 2011). In summary, weather patterns can be linked to agricultural loan default, and extreme weather events are likely to increase the rate of default.

Castro and Garcia (2014) emphasized the need of banks to manage common risks to agriculture through quantitative risk management. The weather affects financial indicators which then affect the loan default. Therefore, the effect of weather on loan repayment is rather an indirect one. However, this indirect effect on financial indicators might be delayed. A weather shock a few months ago might influence the cash flow only during the harvest season. Therefore, the weather bears the potential to predict financial key indicators which eventually can be linked to loan default. In Madagascar, precipitation is found to be a good proxy for weather-induced credit risk, which outperforms other weather measurements such as temperature (Pelka et al., 2015). Hence, we utilize precipitation as an explanatory variable in a scoring model. It is expected that by incorporating weather information, the predictive power of the scoring model increases, while the agricultural production risk covered by the MFI declines. Therefore, our second hypothesis is:

H2.

“Weather impact”: incorporating weather variables will improve credit scoring for agricultural clients.

3. Data

This study focuses on Madagascar because financial services are mainly offered in urban areas, and MFIs started expanding their business to the rural areas only in recent years. Hence, rural areas are still largely unbanked. Furthermore, the agricultural sector in Madagascar is the major source of employment and an important contributor to the country’s GDP. The situation in Madagascar is typical for other African countries (Minten et al., 2009).

Historical records of an operating MFI containing client information are a prerequisite for any credit scoring model (Bumacov et al., 2014; Mileris and Boguslauskas, 2010). These MFI records can be enriched or linked to further information, such as soft information from social networks (De Cnudde et al., 2015) or mobile phone usage (Björkegren and Grissen, 2015), in order to improve scoring models or to explain underlying mechanisms of repayment behavior. In our case, we extend the information for agricultural clients with weather data. The two underlying data sets used to investigate our hypotheses were provided by the AccessBank Madagascar (ABM) and CelsiusPro. The ABM, a commercial MFI operating in Madagascar, provided us with loan and client data, while CelsiusPro, an insurance company which offers its services globally, provided us with the necessary precipitation data for Madagascar.

The ABM started its business in 2007 in the capital Antananarivo. Currently, its network comprises 19 branches and reaches into rural and farm-based areas. The ABM offers only individual loans to clients rather than group loans. To enable a comparison, both agricultural loans and non-agricultural loans are used in the analysis. Loan and client information from the ABM was extracted from the management information system and covers the time period from November 2010 to January 2015. However, since client information, i.e. socio-economic data, is entered manually into the system, data cleaning was necessary. During the data cleaning, obvious errors, e.g. clients recorded as under 18 years of age, and observations with missing values were excluded. Additionally, unfinished loans were excluded from the analysis to achieve consistent and comparable repayment rates. The total number of loans used in the analysis is 24,219, of which 21,831 are non-agricultural and 2,388 are agricultural loans.

The socio-economic characteristics of agricultural and non-agricultural clients are shown in Table I. The age for both groups is similar. Additionally, Table I shows that male clients dominate agricultural loans, while most non-agricultural loans are disbursed to female clients. Furthermore, it is interesting that agricultural clients have a lower income but more work experience compared to non-agricultural clients.

Table II shows the loan characteristics of agricultural and non-agricultural loan products. It is noteworthy that only 8 of the 19 branches offer agricultural loans. In addition, financial indicators, e.g. applied loan amount, collateral, and income, seem to be lower on average for agricultural clients than for non-agricultural ones. By far, the majority of agricultural investments are put toward crop cultivation rather than livestock production.

Precipitation data are recorded by official weather stations or satellite-based systems. The data set contains daily precipitation which was matched to the location of each agricultural lending branch. Precipitation data were available from the year 1990 until 2015. Figure 1 shows the average annual distribution of precipitation for each branch. On average, precipitation between branches is quite homogeneous. The areas can be characterized as having a dry season from May through September and a wet season from October through April.

After merging the MFI data with the precipitation information, it was necessary to divide the data set into two independent subsamples for the purpose of building and testing the scoring model properly. The separation of the data set is not random; rather, information is sorted by the disbursement date and then divided into an older and a more recent data set. The first sample, referred to as in-sample data, is used for model building and contains 70 percent of the loans. The remaining 30 percent, referred to as out-of-sample data, are then later used for statistically testing the developed scoring models (Tasche, 2005). This procedure reflects a practical application. Under real conditions, the scoring model will always rely on already available (old) data to estimate the risk of a new loan application (Schreiner, 2004).
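The chronological split described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the record layout (a dict with a `disbursed` date), the function name `out_of_time_split`, and the rounding of the cut-off to the nearest loan are all assumptions.

```python
from datetime import date

def out_of_time_split(loans, in_sample_share=0.7):
    """Split loans chronologically: the oldest share builds the model, the rest tests it."""
    # Sort by disbursement date so that the in-sample data are the older loans.
    ordered = sorted(loans, key=lambda loan: loan["disbursed"])
    cut = round(len(ordered) * in_sample_share)
    return ordered[:cut], ordered[cut:]

# Hypothetical toy portfolio: ten loans disbursed on distinct days, in shuffled order.
loans = [{"id": i, "disbursed": date(2014, 1, 1 + (i * 7) % 10)} for i in range(10)]
in_sample, out_of_sample = out_of_time_split(loans)
```

Unlike a random split, this design never lets a loan disbursed later inform the prediction for a loan disbursed earlier, which mirrors how a scoring model would be used in practice.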

4. Empirical model building

In the literature, there is a great variety of scoring methods. Madhavi and Radhamani (2014) report that support vector machines had the highest accuracy in their study, while Baklouti (2014) advocates a classification and regression tree, which outperforms discriminant analysis and logistic regression. In contrast, Cubiles de la Vega et al. (2013) compared classification trees, ensemble methods, linear and quadratic discriminant analysis, logistic regression, multilayer perceptron, and support vector machines, and found that multilayer perceptron performs the best. This is in line with Blanco et al. (2013), who compared linear and quadratic discriminant analysis, logistic regression and multilayer perceptron. Their results also show that the multilayer perceptron performs the best. However, these results are contradicted by two other studies: Kammoun and Triki (2016) found that logistic regression outperforms multilayer perceptron, and Mileris and Boguslauskas (2010), who compared discriminant analysis, logistic regression, and multilayer perceptron, show that logistic regression outperforms the other two. In the literature scoring model recommendations vary widely. According to Abdou and Pointon (2011), who reviewed 214 studies on credit scoring, there is no single best scoring technique for all circumstances. However, logistic regression is the dominant recommendation in the field of microfinance (Kammoun and Triki, 2016; Mileris and Boguslauskas, 2010). Furthermore, logistic regression is also preferable because of its simplicity (Olagunju and Ajiboye, 2010). Therefore, this study utilizes logistic regression[2].

The aim of every scoring model is to separate good from bad borrowers (Mileris and Boguslauskas, 2010). Therefore, we need to define a good and a bad borrower. This is usually done using days in arrears for overdue loans; however, the number of days can vary. For instance, a classification of 1, 30 or 90 days in arrears is commonly applied in the developing country context (Pelka et al., 2015). For many banks in developing countries, 1 day in arrears is perceived as still signifying a reliable borrower, while 30 days in arrears is already seen as being too costly for the bank. Therefore, as a compromise, the scoring literature in developing countries mostly uses 15 days in arrears to define a loan as bad (Baklouti, 2014; Blanco et al., 2013; Cubiles de la Vega et al., 2013; Schreiner, 2004). We follow the literature and adopt the definition of 15 days in arrears for a bad loan[3].

The selection of the independent variables follows a stepwise selection process, considers expert knowledge commonly used in credit scoring, and is only based on observations from the in-sample data (Hand and Henley, 1997; Van Gool et al., 2012). This is done separately for agricultural and non-agricultural loans since some variables, e.g. “Sector of credit: livestock,” are only applicable for agricultural loans. The limiting factor for including variables in the selection process is simply their availability (Abdou and Pointon, 2011). We therefore can only consider variables collected by the MFI during the loan application process. Additionally, we apply quadratic transformation to variables. For all variables, the receiver operating characteristic (ROC) curve is estimated and variables are ranked in accordance with the area under the curve (AUC). The ROC curve is obtained by plotting the share of good loans approved (sensitivity) against the share of bad loans approved (1-specificity) in one diagram. The AUC is the area under the ROC curve; a larger area indicates a higher share of good loans approved. The AUC is a typical measurement of classification accuracy (Blanco et al., 2013; Van Gool et al., 2012). Only variables which positively affect the AUC and have a p-value below 10 percent are kept in the scoring model. Categorical variables are kept as long as one category fulfills these requirements, while quadratic terms are dismissed if they have a p-value above 10 percent.
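To make the AUC concrete, the sketch below computes it by pairwise comparison: the share of (bad loan, good loan) pairs in which the bad loan receives the higher predicted default score, with ties counted as one half. This is equivalent to the area under the ROC curve. The function name `auc` and the toy data are assumptions for illustration, and the quadratic-time loop is chosen for readability, not for use on a full loan portfolio.

```python
def auc(is_bad, scores):
    """Area under the ROC curve via pairwise comparison of bad and good loans."""
    bad = [s for y, s in zip(is_bad, scores) if y == 1]
    good = [s for y, s in zip(is_bad, scores) if y == 0]
    if not bad or not good:
        raise ValueError("AUC needs at least one bad and one good loan")
    # Count pairs in which the bad loan gets the higher score; ties count half.
    wins = sum(1.0 if b > g else 0.5 if b == g else 0.0 for b in bad for g in good)
    return wins / (len(bad) * len(good))

# Hypothetical example: four loans, two of them bad, with predicted default scores.
example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In the toy example, three of the four bad-good pairs are ordered correctly, so the AUC is 0.75. A value of 0.5 corresponds to a score that cannot separate good from bad loans, and 1.0 to perfect separation.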

Table III summarizes the selected variables for the agricultural and non-agricultural models. The majority of selected variables are similar, while the agricultural model utilizes 11 variables vs 10 variables for the non-agricultural model. However, there is no recommendation regarding the optimal number of variables (Abdou and Pointon, 2011). In this context, Abdou and Pointon (2011) report that applied scoring models use about 3-20 variables; therefore, our models appear to be typical.

To investigate our second hypothesis, we need to incorporate weather into our agricultural scoring model. As in the literature, we utilize accumulated precipitation data as a proxy for weather. Possible accumulation periods include single and multiple months (Barnett and Mahul, 2007; Berg and Schmitz, 2008; Pelka et al., 2015). This study relies on the following three types of variables, which all use the application date of the loan as their reference time: (i) The accumulated rainfall over the preceding months. The first variable is the accumulated rainfall over the last month (e.g. the accumulated rainfall in July when the application date is in August). The time horizon is then expanded to the last two months and so on, until it covers the total accumulated rainfall over the last 12 months. This produces 12 variables containing accumulated precipitation of 1 month up to 12 months. The idea behind this variable is that rainfall close to the date of loan disbursement may influence the ongoing or future production. (ii) The accumulated rainfall in a specific month (e.g. precipitation of last January, even when the application date is in August). This is done for each month of the year, resulting in 12 variables containing precipitation of a single month. The idea behind this is that seasonal production cycles may be subject to weather events taking place at a specific time of year. (iii) The accumulated rainfall in a specific quarter. This variable is similar to (ii), but considers yearly quarters instead of single months (e.g. the accumulated precipitation in the first quarter of the year is considered as a variable even when the application date is in August). This treatment produces four additional variables.
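The three variable types can be sketched as follows, assuming monthly precipitation totals are already available in a dict keyed by `(year, month)`. The function names are hypothetical, and two conventions are assumptions not stated in the text: windows are built from full calendar months before the application month, and "a specific month" or "a specific quarter" means its most recent completed occurrence.

```python
from datetime import date

def months_back(year, month, k):
    """Return the (year, month) lying k calendar months before (year, month)."""
    idx = year * 12 + (month - 1) - k
    return idx // 12, idx % 12 + 1

def trailing_rainfall(monthly_mm, application, k):
    """Type (i): rainfall summed over the k full months before the application month."""
    return sum(monthly_mm[months_back(application.year, application.month, j)]
               for j in range(1, k + 1))

def calendar_month_rainfall(monthly_mm, application, month):
    """Type (ii): rainfall in the most recent completed occurrence of a calendar month."""
    year = application.year if month < application.month else application.year - 1
    return monthly_mm[(year, month)]

def quarter_rainfall(monthly_mm, application, quarter):
    """Type (iii): rainfall in the most recent completed occurrence of a yearly quarter."""
    months = [3 * (quarter - 1) + m for m in (1, 2, 3)]
    year = application.year if months[-1] < application.month else application.year - 1
    return sum(monthly_mm[(year, m)] for m in months)

# Hypothetical monthly totals in mm, encoded so that year and month stay recognizable.
monthly_mm = {(y, m): (y - 2000) * 100 + m for y in (2013, 2014) for m in range(1, 13)}
application = date(2014, 8, 15)
type_i = [trailing_rainfall(monthly_mm, application, k) for k in range(1, 13)]
type_ii = [calendar_month_rainfall(monthly_mm, application, m) for m in range(1, 13)]
type_iii = [quarter_rainfall(monthly_mm, application, q) for q in range(1, 5)]
```

For an application in August 2014, the first type (i) variable is the July 2014 total, the January variable of type (ii) refers to January 2014, and the third-quarter variable of type (iii) falls back to 2013 because the 2014 quarter is not yet complete.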

For all three types of variables, quadratic terms are also considered in order to capture non-linear patterns. Furthermore, Barnett and Mahul (2007) describe the effect that extreme weather can have. In our study, extreme weather is defined as events which exceed the ten-year standard deviation of precipitation. Therefore, the ten-year standard deviation for all three types of the aforementioned variables is estimated. Three dummy variables then capture whether the precipitation exceeds the standard deviation due to extremely high, extremely low, or either extremely high or low precipitation. In total, 140 different weather variables are considered. The relatively high number of variables ensures that no effect remains hidden, but makes a simple presentation in a table difficult. Each variable is then tested individually for whether it increases the AUC and is statistically significant. Expert knowledge is utilized in the final variable selection to include a recurring effect of weather patterns on production. Analysis shows that the dry season seems to be a seasonal factor of importance. Therefore, the variable precipitation in the third quarter, which represents this effect best, is used as the weather variable.
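The extreme-weather dummies and the variable count can be illustrated with a short sketch. The text does not spell out the reference point for "exceeding the standard deviation"; the code assumes a deviation of more than one sample standard deviation from the ten-year mean, and the name `extreme_dummies` is hypothetical. The count at the end is one reading that reproduces the stated total of 140, not a breakdown given by the authors.

```python
from statistics import mean, stdev

def extreme_dummies(value, ten_year_history):
    """Return (high, low, either) dummies for one precipitation variable."""
    centre = mean(ten_year_history)
    spread = stdev(ten_year_history)  # sample standard deviation over the ten years
    high = int(value > centre + spread)
    low = int(value < centre - spread)
    return high, low, int(high or low)

# Hypothetical ten-year history (mm) of one precipitation variable.
history = [100, 110, 90, 105, 95, 100, 120, 80, 100, 100]
wet_year = extreme_dummies(115, history)
normal_year = extreme_dummies(100, history)

# One reading of the stated total: 12 + 12 + 4 base variables, each with a linear
# term, a quadratic term and three extreme-weather dummies.
base_variables = 12 + 12 + 4
candidate_count = base_variables * (1 + 1 + 3)
```

Under this reading, the 28 base variables yield 28 linear terms, 28 quadratic terms and 84 dummies, which adds up to 140.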

Three scoring models were then created. The agricultural model (Model 1) and non-agricultural model (Model 2) are presented in Equations (1) and (2), respectively. Equation (3) presents Model 1 with an extension to include the weather variable, and is referred to as Model 3:

$$Y_i = \beta_0 + \beta_1 c_{\mathrm{agro},i} + \beta_2 l_{\mathrm{agro},i} + u_i \tag{1}$$
$$Y_i = \beta_0 + \beta_1 c_i + \beta_2 l_i + u_i \tag{2}$$
$$Y_i = \beta_0 + \beta_1 c_{\mathrm{agro},i} + \beta_2 l_{\mathrm{agro},i} + \beta_3 w_i + u_i \tag{3}$$
where Y is a dummy variable that takes the value of 1 for a bad loan and is 0 otherwise for borrower i. The constant is denoted by β0, while β1 and β2 represent parameter vectors. The vector of borrower characteristics is represented by c which contains the socio-economic characteristics, and the vector of loan characteristics is represented by l, while the index agro indicates the agricultural variable set. β3 is a parameter for the weather variable indicated by w. The error term is denoted by u. The three estimated scoring models can be found in Table AI.
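The equations are written in linear form, while the models are estimated by logistic regression. A minimal sketch of how a fitted model of this kind turns an applicant's characteristics into a default probability, assuming the standard logistic link, is given below. The function name and all coefficient and feature values are hypothetical and are not taken from Table AI.

```python
import math

def default_probability(intercept, coefficients, features):
    """Predicted probability of a bad loan under a logistic link."""
    # Linear index: constant plus the weighted borrower, loan (and weather) variables.
    linear_index = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear_index))

# Hypothetical coefficients and applicant values, for illustration only.
p_bad = default_probability(-2.0, [0.8, -0.5], [1.0, 0.4])
```

Extending Model 1 to Model 3 then amounts to appending the weather coefficient and the weather variable to the two lists.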

Stability is assessed by comparing the in- and out-of-sample AUC values. The more similar the values, the more stable the model, and vice versa (Van Gool et al., 2012). The prediction accuracy of a model is indicated by its out-of-sample AUC value. For comparing the prediction accuracy of the three models, a χ2 test is employed. In addition to the AUC as a measure of model accuracy, some studies also report the misclassification cost of a model. The idea is that type I and type II errors cause different costs to the MFI. Most studies apply a ratio of 1:5 in adherence to the recommendation by West (2000). However, this ratio was designed for German credit data rather than microfinance; thus, the ratio does not take into account losses associated with future loans, and therefore ignores the loan cycle in microfinance (Banerjee et al., 2015; West, 2000). Hence, we solely rely on the AUC as a measure of model accuracy.
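The text does not specify which χ2 test is used to compare AUC values. The sketch below assumes a Wald-type statistic on the AUC difference with one degree of freedom, where the variances and the covariance of the two AUC estimates would come from a separate estimation step. Both function names and the input values are hypothetical.

```python
import math

def auc_difference_chi2(auc_a, var_a, auc_b, var_b, cov_ab=0.0):
    """Wald-type chi-square statistic for the difference between two AUC estimates."""
    # cov_ab is zero for models tested on different loans and non-zero when both
    # models are evaluated on the same loans.
    diff = auc_a - auc_b
    return diff * diff / (var_a + var_b - 2.0 * cov_ab)

def chi2_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical AUC estimates and variances, for illustration only.
statistic = auc_difference_chi2(0.70, 0.0004, 0.68, 0.0004)
p_value = chi2_p_value_1df(statistic)
```

Applied to the statistics reported in Section 5 (0.08 and 0.16), `chi2_p_value_1df` gives p-values of about 0.78 and 0.69, close to the reported 0.77 and 0.69. The small gap is consistent with rounding of the test statistic, which suggests, but does not confirm, that the reported tests have one degree of freedom.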

5. Results and discussion

The in- and out-of-sample ROC curves with the respective AUC values are presented for each model in Figure 2. Overall, as expected, the in-sample AUC values are always higher than the out-of-sample AUC values. However, the magnitude of the differences is in the range of those presented in the literature (Van Gool et al., 2012). Hence, we evaluate our models as stable. The out-of-sample AUC values of our models are on the lower end when compared to the literature. However, reported AUC values vary widely across different studies; therefore, reference values are to be considered with caution (e.g. Baklouti, 2014; Blanco et al., 2013; Van Gool et al., 2012).

To investigate H1 “Equality”, we compare the out-of-sample performance of Models 1 and 2 using a χ2 test. The result (χ2=0.08, p-value=0.77) suggests that the AUC values of the two models are not statistically significantly different. This shows that microfinance scoring models have similar prediction accuracy for agricultural and non-agricultural clients. Consequently, H1 “Equality” can be accepted.

This finding implies that, presuming sufficient observations, credit scoring models could also be designed for agricultural clients. Hence, credit scoring can be part of the innovation which is needed to expand microfinance lending into rural areas (Morvant-Roux, 2011). This does not change the fact that agricultural clients still need special lending products to address their seasonal business cycle (Weber et al., 2014).

The visual differences and the lower stability of Model 1 compared to Model 2 might be due to a higher number of observations in Model 2, which generally improves the scoring model (Schreiner, 2004). Considering the overall loan portfolio of an MFI, agricultural clients are usually a subgroup and consequently their share is smaller. Therefore, these results show that even under the prevalent circumstances, scoring still works well for the agricultural sector.

For examining H2 “Weather impact” we compare the out-of-sample performance of Models 1 and 3 using a χ2 test. The result (χ2=0.16, p-value=0.69) suggests that the AUC values of the two models are not statistically significantly different. This result implies that the incorporation of an additional weather variable does not statistically increase the performance of the agricultural scoring model. Thus, H2 “Weather impact” can be rejected.

This result shows that the dependency between agricultural production, precipitation, and loan repayment (Pelka et al., 2015), and the long-lasting effect of weather events (Dercon, 2004) are insufficient to increase the prediction accuracy of our agricultural scoring model. This situation might change over the coming years as yield levels in Sub-Saharan Africa are expected to fall due to climate change (Schlenker and Lobell, 2010). Furthermore, we cannot rule out the possibility that the incorporation of weather variables into a scoring model might work in a different context or region.

Theoretically, this result could be driven by an unbalanced number of weather events between the in-sample and out-of-sample data sets (e.g. multiple extreme weather events in the in-sample but none in the out-of-sample data set). However, in reviewing the precipitation data, we do not observe such a pattern.

Model 3 has the highest in-sample and lowest out-of-sample AUC. This shows that an additional variable which increases the in-sample AUC does not necessarily have a positive effect on the out-of-sample AUC. Furthermore, this implies that very careful variable selection is required. Schreiner (2004) argues that when a scoring model with few variables works, it should work even better with more variables. This statement does not appear to hold for weather variables such as precipitation.

6. Conclusion

Agricultural clients are often associated with posing a higher level of risk to banks. At the same time, credit scoring models, which have been widely applied by urban MFIs as a risk-assessment tool, are not estimated for agricultural clients specifically. In addition, rural MFIs, which mainly lend to agricultural clients, are hesitating to adopt credit scoring. Therefore, this paper aims to investigate whether credit scoring models can also be applied to agricultural clients. Furthermore, this paper examines whether such agricultural scoring models can be improved by incorporating weather patterns.

For our analysis, we utilize loan and client data provided by the ABM, and precipitation data from CelsiusPro. Data were divided chronologically into an older in-sample and a more recent out-of-sample data set. The in-sample data set was used for model building, while the out-of-sample data set was used for testing the models.
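
The chronological split described above can be sketched as follows. The record layout, the field name `disbursed`, and the cutoff date are hypothetical and serve only to illustrate the idea that all loans before the cutoff are used for model building and all later loans for testing.

```python
from datetime import date


def chronological_split(loans, cutoff):
    """Split loans into an older in-sample set (before the cutoff) and a
    more recent out-of-sample set (on or after the cutoff)."""
    in_sample = [loan for loan in loans if loan["disbursed"] < cutoff]
    out_sample = [loan for loan in loans if loan["disbursed"] >= cutoff]
    return in_sample, out_sample


# Hypothetical loan records (1 = bad loan); not taken from the paper's data.
loans = [
    {"id": 1, "disbursed": date(2012, 3, 1), "bad": 0},
    {"id": 2, "disbursed": date(2013, 7, 15), "bad": 1},
    {"id": 3, "disbursed": date(2014, 1, 1), "bad": 0},
    {"id": 4, "disbursed": date(2014, 9, 30), "bad": 0},
]

in_sample, out_sample = chronological_split(loans, cutoff=date(2014, 1, 1))
print(len(in_sample), "loans for model building;", len(out_sample), "loans for testing")
```

Unlike a random split, a split by date ensures that the model is tested only on loans disbursed after those it was built on, which mirrors how a scoring model would be used in practice.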

The AUC value is applied as a measure of model accuracy. Our results indicate that credit scoring models work equally well for agricultural and non-agricultural clients. This holds true even though the number of observations of agricultural clients was modest compared to the overall loan portfolio. Therefore, this paper supports the implementation of credit scoring models for rural MFIs, presuming a successful test is conducted first. Furthermore, the incorporation of precipitation into the scoring model does not improve its performance significantly in our case. However, we cannot exclude the possibility that weather variables under different circumstances can contribute to credit scoring accuracy in general.

These results are interesting for agricultural lenders as well as for scientists. On the one hand, our study demonstrates the usefulness of credit scoring for agricultural clients. On the other hand, it also shows the current limitations. Further research is therefore necessary to clarify whether these findings hold true for different geographical areas and under different climatic conditions. In addition, future research could investigate the effect of extreme weather events like severe droughts and floods. It might also be interesting to research whether, rather than precipitation, an evaporation index can contribute to improved model accuracy.

Figures

Figure 1. Average precipitation for branches providing agricultural loans in the years 1990 to 2015

Figure 2. In- and out-of-sample results of the ROC curves and AUC values for Models 1-3

Socio-economic characteristics of clients

Variable | Description | Agricultural mean | Non-agricultural mean
Age | Age of applicant in years | 43.74 (10.59) | 41.34 (10.14)
Gender | 1 if applicant is female; 0 otherwise | 0.26 | 0.57
Income | Monthly business and household income in thousands of Malagasy Ariary | 9,792 (20,200) | 51,500 (131,000)
Marital status | | |
  Single | 1 if applicant is single; 0 otherwise | 0.05 | 0.08
  Married | 1 if applicant has a spouse; 0 otherwise | 0.91 | 0.85
  Divorced | 1 if applicant is divorced; 0 otherwise | 0.02 | 0.03
  Other | 1 if marital status is unknown; 0 otherwise | 0.03 | 0.04
No. family members | Number of family members | 4.96 (1.98) | 3.96 (1.65)
Resident | 1 if applicant is a resident; 0 otherwise | 0.99 | 0.99
Working experience | Working experience in current profession in years | 16.42 (8.66) | 9.90 (9.90)
Number of observations | | 2,388 | 21,831

Note: For respective variables, SD given in parentheses

Loan characteristics of clients

Variable | Description | Agricultural mean | Non-agricultural mean
Applied loan amount | Applied loan amount in thousands of Malagasy Ariary | 1,481 (1,543) | 2,683 (3,181)
Assets | Assets in thousands of Malagasy Ariary | 3,011 (3,998) | 5,862 (19,200)
Branch | | |
  1 | 1 if applicant is from branch 1; 0 otherwise | | 0.06
  2 | 1 if applicant is from branch 2; 0 otherwise | | 0.15
  3 | 1 if applicant is from branch 3; 0 otherwise | | 7.41e-03
  4 | 1 if applicant is from branch 4; 0 otherwise | | 0.07
  5 | 1 if applicant is from branch 5; 0 otherwise | | 0.01
  6 | 1 if applicant is from branch 6; 0 otherwise | | 0.08
  7 | 1 if applicant is from branch 7; 0 otherwise | 0.02 | 0.09
  8 | 1 if applicant is from branch 8; 0 otherwise | 0.24 | 0.09
  9 | 1 if applicant is from branch 9; 0 otherwise | 0.18 | 0.04
  10 | 1 if applicant is from branch 10; 0 otherwise | 0.22 | 0.04
  11 | 1 if applicant is from branch 11; 0 otherwise | | 0.08
  12 | 1 if applicant is from branch 12; 0 otherwise | | 0.04
  13 | 1 if applicant is from branch 13; 0 otherwise | 0.20 | 0.02
  14 | 1 if applicant is from branch 14; 0 otherwise | 2.93e-03 | 0.04
  15 | 1 if applicant is from branch 15; 0 otherwise | | 0.04
  16 | 1 if applicant is from branch 16; 0 otherwise | | 0.03
  17 | 1 if applicant is from branch 17; 0 otherwise | 0.14 | 0.02
  18 | 1 if applicant is from branch 18; 0 otherwise | | 1.94e-03
  19 | 1 if applicant is from branch 19; 0 otherwise | 8.38e-04 | 4.31e-05
Collateral | Collateral in thousands of Malagasy Ariary | 2,945 (3,652) | 5,998 (10,400)
Debt | Debt to other bank in thousands of Malagasy Ariary | 57 (294) | 162 (1,194)
Deposit | Deposit in the bank account in thousands of Malagasy Ariary | 9 (107) | 85 (1,055)
Disbursed loan amount | Granted loan amount in thousands of Malagasy Ariary | 1,061 (1,031) | 1,981 (2,680)
No. installments | Number of loan installments | 11.53 (2.31) | 13.23 (3.95)
Purpose of credit | | |
  Liquidity | 1 if loan purpose is liquidity; 0 otherwise | 0.87 | 0.73
  Investment | 1 if loan purpose is investment; 0 otherwise | 0.05 | 0.12
  Liquidity and investment | 1 if loan purpose is liquidity and investment; 0 otherwise | 0.08 | 0.13
  Others | 1 if loan purpose is unknown; 0 otherwise | 2.93e-03 | 0.03
Repayment capacity | Applicant's repayment capacity in thousands of Malagasy Ariary | 3,545 (7,440) | 5,133 (14,200)
Repeat | 1 if applicant had a loan before; 0 otherwise | 0.35 | 0.51
Sector of credit | | |
  Crops | 1 if specialized in plant cultivation; 0 otherwise | 0.95 |
  Livestock | 1 if specialized in animal production; 0 otherwise | 0.04 |
  Others | 1 if specialization is unknown; 0 otherwise | 0.01 |
Number of observations | | 2,388 | 21,831

Note: For respective variables, SD given in parentheses

Selected variables for the agricultural and non-agricultural models

Variable | Agricultural | Non-agricultural
Age (squared) | Yes (no) | Yes (yes)
Applied loan amount (squared) | Yes (yes) | Yes (yes)
Assets (squared) | Yes (no) | Yes (yes)
Branches [a] | Yes | Yes
Collateral (squared) | No (no) | Yes (yes)
Debt (squared) | Yes (no) | Yes (yes)
Deposit (squared) | Yes (no) | Yes (yes)
Gender | Yes | No
Marital status | Yes | Yes
No. of installments | Yes | Yes
Purpose of credit | No | Yes
Sector of credit [b] | Yes |
Working experience (squared) | Yes (yes) | No (no)

Notes: [a] Branch availability differs; [b] variable is unavailable for non-agricultural clients

Estimation results of the logistic regression for 15 days in arrears

Variable | Model 1 | Model 2 | Model 3
Age | −0.01 (0.01) | 0.02 (0.01) | −0.01 (0.01)
Age squared | | 4.84e-04*** (1.67e-04) |
Applied loan amount | 6.19e-7*** (1.05e-7) | 2.87e-07*** (2.14e-08) | 6.02e-07*** (1.06e-07)
Applied loan amount squared | −3.86e-14*** (9.80e-15) | −8.84e-15*** (1.07e-15) | −3.80e-14*** (9.82e-15)
Assets | −1.53e-07*** (3.87e-08) | −1.93e-08*** (3.37e-09) | −1.50e-07*** (3.87e-08)
Assets squared | | 4.39e-17*** (1.07e-17) |
Branch | | |
  1 | | −0.73*** (0.21) |
  2 | | −0.33* (0.20) |
  3 | | −1.08*** (0.29) |
  4 | | −0.21 (0.20) |
  5 | | −0.98*** (0.21) |
  6 | | −0.65*** (0.21) |
  7 | 0.71* (0.43) | −1.03*** (0.21) | 0.64 (0.43)
  8 | −0.41 (0.28) | −0.17 (0.20) | −0.47* (0.28)
  9 | −0.19 (0.28) | −0.55*** (0.21) | −0.24 (0.29)
  10 | −0.43 (0.28) | | −0.50* (0.29)
Collateral | | −3.40e-08*** (7.24e-09) |
Collateral squared | | 2.87e-16*** (7.58e-17) |
Debt | 8.08e-07** (3.17e-07) | 2.20e-07*** (4.45e-08) | 8.11e-07 (3.19e-07)
Debt squared | | −1.14e-14*** (3.16e-15) |
Deposit | −1.20e-05 (7.70e-06) | −4.57e-06*** (6.52e-07) | −1.22e-05** (7.69e-06)
Deposit squared | | 7.33e-14*** (1.09e-14) |
Gender | 0.35** (0.14) | | 0.35** (0.14)
Marital status | | |
  Single | −1.00** (0.45) | −0.17 (0.12) | −1.01** (0.45)
  Married | −1.01*** (0.34) | −0.34*** (0.10) | −1.00*** (0.34)
  Divorced | −0.91 (0.61) | 0.24* (0.14) | −0.91 (0.62)
  Other | | |
No. installments | 0.15*** (0.03) | 0.07*** (0.01) | 0.15*** (0.03)
Purpose of credit | | |
  Liquidity | | 0.20 (0.14) |
  Investment | | 0.07 (0.14) |
  Liquidity and investment | | 0.26* (0.14) |
  Others | | |
Sector of credit | | |
  Animal | −0.68** (0.34) | | −0.69** (0.34)
  Cultivators | | |
  Others | −0.17 (0.59) | | −0.09 (0.60)
Weather variable [a] | | | −0.01* (0.01)
Working experience | 0.05* (0.03) | | 0.05* (0.03)
Working experience squared | −1.32e-3* (7.51e-4) | | 1.30e-3* (7.45e-4)
Constant | −1.99 (−1.99) | −0.64* (0.35) | −1.88*** (0.68)
Number of observations | 2,388 | 21,831 | 2,388

Notes: For all coefficients, standard errors given in parentheses. [a] Precipitation in the third quarter. *, **, *** Significant at 10, 5 and 1 percent levels, respectively
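
To aid interpretation of the logit coefficients in the estimation results, a coefficient b can be converted into an odds ratio exp(b). The minimal sketch below does this for two Model 1 coefficients taken from the table. It assumes that the dependent variable equals 1 for a loan that is at least 15 days in arrears; the helper name and the dictionary layout are chosen for this example only.

```python
import math


def odds_ratio(coef):
    """Multiplicative change in the odds of the outcome for a one-unit
    increase in the variable, holding the other variables constant."""
    return math.exp(coef)


# Model 1 coefficients as reported in the estimation results.
model_1 = {"Married": -1.01, "No. installments": 0.15}

ratios = {name: odds_ratio(coef) for name, coef in model_1.items()}
for name, ratio in ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
```

Under these assumptions, an odds ratio below one (Married, about 0.36) indicates lower odds of arrears relative to the omitted reference category, while a ratio above one (No. installments, about 1.16 per additional installment) indicates higher odds.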

Notes

1. Therefore, scoring is applicable to individual but not group lending.

2. One might think of weather as an endogenous variable and therefore consider an instrumental variable approach. However, when applying an instrumental variable Probit model the results indicate a low prediction accuracy in our case.

3. The main results are similar when instead choosing either 30 or 90 days of arrears to define a loan as bad.

Appendix

Table AI

References

Abdou, H. and Pointon, J. (2011), “Credit scoring, statistical techniques and evaluation criteria: a review of the literature”, Intelligent Systems in Accounting, Finance & Management, Vol. 18 Nos 2/3, pp. 59-88.

Armendáriz de Aghion, B. and Morduch, J. (2000), “Microfinance beyond group lending”, Economics of Transition, Vol. 8 No. 2, pp. 401-420.

Baklouti, I. (2014), “A psychological approach to microfinance credit scoring via a classification and regression tree”, Intelligent Systems in Accounting, Finance & Management, Vol. 21 No. 4, pp. 193-208.

Banerjee, A., Duflo, E., Glennerster, R. and Kinnan, C. (2015), “The miracle of microfinance? Evidence from a randomized evaluation”, American Economic Journal: Applied Economics, Vol. 7 No. 1, pp. 22-53.

Barnett, B.J. and Mahul, O. (2007), “Weather index insurance for agriculture and rural areas in lower-income countries”, American Journal of Agricultural Economics, Vol. 89 No. 5, pp. 1241-1247.

Behrens, A. and Pederson, G.D. (2007), “An analysis of credit risk migration patterns of agricultural loans”, Agricultural Finance Review, Vol. 67 No. 1, pp. 87-98.

Berg, E. and Schmitz, B. (2008), “Weather-based instruments in the context of whole-farm risk management”, Agricultural Finance Review, Vol. 68 No. 1, pp. 119-133.

Björkegren, D. and Grissen, D. (2015), “Behavior revealed in mobile phone usage predicts loan repayment”, working paper, Brown University, Providence, RI, Entrepreneurial Finance Lab, Lima, July 13.

Blanco, A., Pino-Mejías, R., Lara, J. and Rayo, S. (2013), “Credit scoring models for the microfinance industry using neural networks: evidence from Peru”, Expert Systems with Applications, Vol. 40 No. 1, pp. 356-364.

Bumacov, V., Ashta, A. and Singh, P. (2014), “The use of credit scoring in microfinance institutions and their outreach”, Strategic Change, Vol. 23 Nos 7/8, pp. 401-413.

Castellani, D. and Cincinelli, P. (2015), “Dealing with drought-related credit and liquidity risks in MFIs: evidence from Africa”, Strategic Change, Vol. 24 No. 1, pp. 67-84.

Castro, C. and Garcia, K. (2014), “Default risk in agricultural lending, the effects of commodity price volatility and climate”, Agricultural Finance Review, Vol. 74 No. 4, pp. 501-521.

Caudill, S.B., Gropper, D.M. and Hartarska, V. (2009), “Which microfinance institutions are becoming more cost effective with time? Evidence from a mixture model”, Journal of Money, Credit and Banking, Vol. 41 No. 4, pp. 651-672.

Collier, B., Katchova, A.L. and Skees, J.R. (2011), “Loan portfolio performance and El Niño, an intervention analysis”, Agricultural Finance Review, Vol. 71 No. 1, pp. 98-119.

Copestake, J. (2007), “Mainstreaming microfinance: social performance management or mission drift?”, World Development, Vol. 35 No. 10, pp. 1721-1738.

Coumou, D. and Rahmstorf, S. (2012), “A decade of weather extremes”, Nature Climate Change, Vol. 2 No. 7, pp. 491-496.

Cubiles de la Vega, M.D., Blanco Oliver, A., Pino Mejías, R. and Lara Rubio, J. (2013), “Improving the management of microfinance institutions by using credit scoring models based on Statistical Learning techniques”, Expert Systems with Applications, Vol. 40 No. 17, pp. 6910-6917.

De Cnudde, S., Moeyersoms, J., Stankova, M., Tobback, E., Javaly, V. and Martens, D. (2015), “Who cares about your Facebook friends? Credit scoring for microfinance”, Working Paper No. D/2015/1169/018, University of Antwerp, Antwerp.

de Nicola, F. (2015), “Handling the weather: insurance, savings, and credit in West Africa”, Working Paper No. 7187, World Bank Group, Washington, DC.

Dercon, S. (2004), “Growth and shocks: evidence from rural Ethiopia”, Journal of Development Economics, Vol. 74 No. 2, pp. 309-329.

Dinh, T.H.T. and Kleimeier, S. (2007), “A credit scoring model for Vietnam’s retail banking market”, International Review of Financial Analysis, Vol. 16 No. 5, pp. 471-495.

Dixon, B.L., Ahrendsen, B.L., McFadden, B.R., Danforth, D.M., Foianini, M. and Hamm, S.J. (2011), “Competing risks models of Farm Service Agency seven‐year direct operating loans”, Agricultural Finance Review, Vol. 71 No. 1, pp. 5-24.

Fernando, N.A. (2007), “Managing microfinance risks: some observations and suggestions”, Asian Journal of Agriculture and Development, Vol. 4 No. 2, pp. 1-22.

Field, E. and Pande, R. (2008), “Repayment frequency and default in microfinance: evidence from India”, Journal of the European Economic Association, Vol. 6 Nos 2-3, pp. 501-509.

Finger, R. and Schmid, S. (2008), “Modeling agricultural production risk and the adaptation to climate change”, Agricultural Finance Review, Vol. 68 No. 1, pp. 25-41.

Gallagher, R.L. (2001), “Characteristics of unsuccessful versus successful agribusiness loans”, Agricultural Finance Review, Vol. 61 No. 1, pp. 20-35.

Gustafson, C.R., Pederson, G.D. and Gloy, B.A. (2005), “Credit risk assessment”, Agricultural Finance Review, Vol. 65 No. 2, pp. 201-217.

Hand, D.J. and Henley, W.E. (1997), “Statistical classification methods in consumer credit scoring: a review”, Journal of the Royal Statistical Society: Series A, Vol. 160 No. 3, pp. 523-541.

Ince, H. and Aktan, B. (2009), “A comparison of data mining techniques for credit scoring in banking: a managerial perspective”, Journal of Business Economics and Management, Vol. 10 No. 3, pp. 233-240.

Kammoun, A. and Triki, I. (2016), “Credit scoring models for a Tunisian microfinance institution: comparison between artificial neural network and logistic regression”, Review of Economics & Finance, Vol. 6 No. 1, pp. 61-78.

Khandker, S.R. and Koolwal, G.B. (2016), “How has microcredit supported agriculture? Evidence using panel data from Bangladesh”, Agricultural Economics, Vol. 47 No. 2, pp. 157-168.

Kinda, O. and Achonu, A. (2012), “Building a credit scoring model for the savings and credit mutual of the Potou Zone (MECZOP)/Senegal”, The Journal of Sustainable Development, Vol. 7 No. 1, pp. 17-32.

Lobell, D.B., Schlenker, W. and Costa-Roberts, J. (2011), “Climate trends and global crop production since 1980”, Science, Vol. 333 No. 6042, pp. 616-620.

Madhavi, A.V. and Radhamani, G. (2014), “Improving the credit scoring model of microfinance institutions by support vector machine”, International Journal of Research in Engineering and Technology, Vol. 3 No. 7, pp. 29-33.

Mileris, R. and Boguslauskas, V. (2010), “Data reduction influence on the accuracy of credit risk estimation models”, Economics of Engineering Decisions, Vol. 21 No. 1, pp. 5-11.

Minten, B., Randrianarison, L. and Swinnen, J.F.M. (2009), “Global retail chains and poor farmers: evidence from Madagascar”, World Development, Vol. 37 No. 11, pp. 1728-1741.

Morvant-Roux, S. (2011), “Is microfinance the adequate tool to finance agriculture?”, in Armendáriz, B. and Labie, M. (Eds), The Handbook of Microfinance, World Scientific Publishing Co. Pte. Ltd, Singapore, pp. 421-436.

Olagunju, F.I. and Ajiboye, A. (2010), “Agricultural lending decision: a Tobit regression analysis”, African Journal of Food, Agriculture, Nutrition and Development, Vol. 10 No. 5, pp. 2515-2541.

Pelka, N., Musshoff, O. and Weber, R. (2015), “Does weather matter? How rainfall affects credit risk in agricultural microfinance”, Agricultural Finance Review, Vol. 75 No. 2, pp. 194-212.

Prior, F. and Argandoña, A. (2009), “Credit accessibility and corporate social responsibility in financial institutions: the case of microfinance”, Business Ethics: A European Review, Vol. 18 No. 4, pp. 349-363.

Savitha, B. and Kumar, K.N. (2016), “Non-performance of financial contracts in agricultural lending: a case study from Karnataka, India”, Agricultural Finance Review, Vol. 76 No. 3, pp. 362-377.

Schlenker, W. and Lobell, D.B. (2010), “Robust negative impacts of climate change on African agriculture”, Environmental Research Letters, Vol. 5 No. 1, pp. 1-8.

Schreiner, M. (2004), “Scoring arrears at a microlender in Bolivia”, Journal of Microfinance, Vol. 6 No. 2, pp. 65-88.

Swinnen, J.F.M. and Gow, H.G. (1999), “Agricultural credit problems and policies during the transition to a market economy in Central and Eastern Europe”, Food Policy, Vol. 24 No. 1, pp. 21-47.

Tasche, D. (2005), “Rating and probability of default validation”, in Liebig, T. (Ed.), Studies on the Validation of Internal Rating Systems, Basel Committee on Banking Supervision, Basel, pp. 28-59.

Turvey, C.G. and Brown, R. (1990), “Credit scoring for a federal lending institution: the case of Canada’s Farm Credit Corporation”, Agricultural Finance Review, Vol. 50 No. 1, pp. 47-57.

Van Gool, J., Verbeke, W., Sercu, P. and Baesens, B. (2012), “Credit scoring for microfinance: is it worth it?”, International Journal of Finance and Economics, Vol. 17 No. 2, pp. 103-123.

Weber, R. and Musshoff, O. (2012), “Is agricultural microcredit really more risky? Evidence from Tanzania”, Agricultural Finance Review, Vol. 72 No. 3, pp. 416-435.

Weber, R., Musshoff, O. and Petrick, M. (2014), “How flexible repayment schedules affect credit risk in agricultural microfinance?”, Working Paper No. 1404, Department of Agricultural Economics and Rural Development, University of Göttingen, Göttingen, April.

Wenner, M., Navajas, S., Trivelli, C. and Tarazona, A. (2007), “Managing credit risk in rural financial institutions in Latin America”, Working Paper (Ref. No. MSM-139), Inter-American Development Bank, Washington, DC.

West, D. (2000), “Neural network credit scoring models”, Computers and Operations Research, Vol. 27 No. 11, pp. 1113-1152.

Zech, L. and Pederson, G. (2003), “Predictors of farm performance and repayment ability as factors for use in risk‐rating models”, Agricultural Finance Review, Vol. 63 No. 1, pp. 41-54.

Acknowledgements

The authors would like to thank Dr Calum Turvey and two anonymous referees for helpful comments and suggestions as well as Dr Ron Weber from the KfW for providing the data and his support. The authors further gratefully acknowledge financial support from Deutsche Forschungsgemeinschaft (DFG).

Corresponding author

Ulf Römer can be contacted at: uroemer@uni-goettingen.de