Corruption, quality of institutions and growth

Arielle Beyaert (Department of Quantitative Methods for Economics and Business, University of Murcia, Murcia, Spain)
José García-Solanes (Fundamentos del Análisis Económico, Universidad de Murcia, Murcia, Spain)
Laura Lopez-Gomez (Department of Quantitative Methods for Economics and Business, University of Murcia, Murcia, Spain)

Applied Economic Analysis

ISSN: 2632-7627

Article publication date: 22 December 2022

Issue publication date: 28 February 2023




This paper aims to apply regression-tree analysis to capture the nonlinear effects of corruption on economic growth. Using data of 103 countries for the period 1996–2017, the authors endogenously detect two distinct areas in corruption quality in which the members share the same model of economic growth.


The authors apply regression tree analysis to capture the nonlinearity of the influences. This methodology allows us to split endogenously the whole sample of countries and characterize the different ways through which corruption impacts economic growth in each group of countries.


The traditional determinants of economic growth have different impacts on countries depending on their level of corruption, which, in turn, confirms the parameter heterogeneity of the Solow model found in other strands of the literature.


The authors apply a new approach to a worldwide sample obtaining novel results.



Beyaert, A., García-Solanes, J. and Lopez-Gomez, L. (2023), "Corruption, quality of institutions and growth", Applied Economic Analysis, Vol. 31 No. 91, pp. 55-72.



Emerald Publishing Limited

Copyright © 2022, Arielle Beyaert, José García-Solanes and Laura Lopez-Gomez.


Published in Applied Economic Analysis. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at

1. Introduction

The effect of corruption on economic growth is a recurring topic that is regularly re-examined with new methodologies. In this paper, we test the possible impact of two alternative measures of corruption on economic growth by applying regression-tree analysis, complemented with instrumental variable estimations, to capture the nonlinear transmission of effects, using a sample of 103 countries for the period 1996–2017.

As documented by Gründler and Potrafke (2019), empirical studies tend to support the view that corruption reduces economic growth, especially in countries with low investment rates and poor governance. However, the variability of results is very large, and most of them are inconclusive, largely due to the econometric techniques applied, which are often marred by important shortcomings (Campos et al., 2010). Two main econometric concerns have been highlighted in the literature regarding the estimation of the relationship between corruption and economic growth.

The first worry refers to the endogeneity of the institutional variable, in this case, corruption. Much of the early literature reports direct correlations between corruption and growth using cross-sectional and panel data, assuming that causality goes exclusively from corruption to growth. However, it is also true that an increase in living standards and incomes helps to increase the quality of political institutions and to reduce corruption. The use of instrumental variables in the first decade of the 2000s to deal with reverse causality – from economic growth to corruption – did not give satisfactory results, as Aidt (2009) points out. More recently, Gründler and Potrafke (2019) have applied dynamic panel data models with country and period fixed effects to address endogeneity problems. However, since this approach includes corruption as an additional explanatory variable within a parameter-invariant linear regression specification, it does not take into account the possible indirect effects that corruption can exert by modifying the production function itself.

The assumption that economic growth is linearly linked to corruption, as seen in a large body of the literature, is the second source of econometric concern. Several recent contributions find that the aforementioned relationship is nonlinear. For example, Swaleheen (2011) shows that the corruption indicator is negatively correlated with economic growth, but the squared indicator is positively correlated with it, which reveals a nonlinear relation between the two variables. Therefore, assuming that the relationship between the two variables is linear leads to biased and unreliable results.

In this paper, we investigate the whole effect of corruption addressing the two econometric worries indicated above. On the one hand, we assume that, in addition to a potential direct impact on the economy, corruption may also influence the relationship between growth and its other determinants in the production function, thus introducing nonlinearity through parameter heterogeneity into the empirical model. On the other hand, we deal with the endogeneity problem by using two instrumental variables: presample values of corruption and jack-knifed averages of the corruption indicator in the line of Gründler and Potrafke (2019). They are used in instrumental variable estimations of the production function of all the countries that have been identified as sharing the same function.

To address these two questions, we start by applying regression tree analysis, a relatively new econometric method still infrequently used in economics, though it has been previously applied in the computer science area of machine learning. As far as we know, only two contributions have applied the regression-tree method to quantify the role of institutions in economic growth: Minier (2007) and Tan (2010). We complete the approach of those authors by analyzing two channels (direct and indirect) through which institutional variables may affect economic growth, and by applying instrumental variable estimations to address the endogeneity issue. The regression tree analysis departs from the strict linearity framework by allowing the modeling process to take into account the possible indirect effects of institutional quality in the model. Under this procedure, the sample is endogenously split into different subsamples, according to different thresholds of an institutional quality index, to obtain different subgroups of countries with a homogeneous level of institutional quality in each of them. Then, the same growth model is estimated in each of these subsamples separately, dealing at this stage with the endogeneity problem.

We use the Solow equation enlarged with an indicator of corruption and find that although corruption is not a statistically significant variable in the performed regressions, it has an indirect effect on growth by determining the splitting of the sample into two groups of countries, each of which shares a common growth model. It turns out that the group of countries with more intense corruption is the one with higher coefficients for the explanatory variables of the growth model. Interestingly enough, we obtain very similar results using two different indicators of corruption from two different databases: one measures the level of corruption, whereas the other one measures the control of corruption. We also use indicators of the rule of law as alternative potential splitting variables, and we find that they do not provide significant splitting results. We derive from this that the legal and judicial framework does not seem to affect economic growth indirectly [1].

Our results are consistent with the findings of Durlauf et al. (2001) who argue that there is parameter heterogeneity in the Solow model, and that the empirical literature has not been able to incorporate these differences in parameters. The procedure we adopt in this paper solves the mentioned limitation, allowing us to derive robust evidence that differences in the levels of corruption – which are not taken into account in traditional empirical growth models – can be major causes of the parameter heterogeneity in the Solow model of economic growth.

The rest of the paper is structured as follows: Section 2 provides a brief review of the literature that analyzes the theoretical relationships between institutional quality and economic growth, with a particular emphasis on the effects of corruption. Section 3 describes the methodology used in our study. Section 4 estimates the model and discusses the main results. Finally, Section 5 concludes and derives some policy prescriptions.

2. Literature review

Various contributions to economic growth since the early 2000s highlight the relevance of institutional factors and combine the empirics of economic growth with the institutional approach of North (1990). Dollar and Kraay (2002, 2003) find that trade openness and good institutions positively affect economic growth but without significantly influencing income levels in poorer countries. Rigobon and Rodrik (2005) investigate the effects of trade, institutions and geography on income levels and derive positive effects from the quality of institutions in the sense that when it is included in the estimations, the rest of determinants become less relevant. Alcala and Ciccone (2004) find that institutional quality affects growth by improving both the capital output ratio and the average level of human capital. Rodrik et al. (2004) and Glaeser et al. (2004) suggest that societies can thrive with weak institutions if they accumulate physical capital; this might be the case, for instance, of Russia and China. These authors support the Lipset–Przeworski–Barro view, according to which poor countries grow by accumulating human and physical capital, even under dictatorships, and that a certain level of development is necessary for these countries to improve their institutions.

As far as the influence on growth from a particular institutional indicator, such as corruption, is concerned, contributions can be grouped into two opposing hypotheses. On the one hand, the “grease-in-the-wheels” hypothesis argues that corruption can have a positive impact on economic growth. The reason afforded by Leff (1964) is that in underdeveloped and over-bureaucratized nations, corruption guarantees the viability and success of many investment projects that could not otherwise be carried out. On the other hand, the “sand-in-the-wheels” hypothesis states that corruption unambiguously harms economic growth regardless of the time and degree of development of countries. This last view has received larger support in the empirical literature. Méon and Weill (2010), Campos et al. (2010) and Ugur (2014) provide excellent surveys of the empirical work on the corruption-growth nexus published up to 2010. The last two are based on meta-analysis methods. Gründler and Potrafke (2019) instructively report the main contributions made on this topic after 2011.

As indicated in the introduction, most of the empirical work published over the past two decades is affected by a deficient treatment of the endogeneity of the institutional variable and/or the nonlinearity of the model. In the lines that follow, we review the way in which those problems have been addressed in the recent literature.

As regards endogeneity, Gründler and Potrafke (2019) use jack-knifed regional averages of corruption as a strong instrumental variable for corruption to expunge the endogenous components of the data; then they estimate a dynamic panel data model in the vein of Acemoglu et al. (2019) using four lags of gross domestic product (GDP) as instrumental variables and conditioning the effects on three indicators of governance quality. They find that corruption reduces cumulative economic growth significantly, and that this effect is especially pronounced in autocracies and countries with low governance effectiveness and rule of law. This approach includes corruption as an additional explanatory variable within a parameter-invariant linear regression specification, thus imposing a strong homogeneity assumption for the production functions of all countries in the sample. Applying the same method, Sharma and Mitra (2019) find that the joint effects of regulation and corruption do not seem to be empirically significant for countries from any of the income groups considered in their sample. Aidt et al. (2008) treat corruption as an endogenous variable in a threshold model and find two regimes determined by political institutions (mainly quality of governance regimes).

As far as nonlinearity is concerned, Méon and Sekkat (2005) and Méon and Weill (2010) address this problem by including an interaction term – multiplicative variable – between corruption and the quality of governance in the relevant equation. They find that corruption is consistently detrimental to growth in countries where institutions are effective. Méndez and Sepúlveda (2006) find cross-country evidence of a nonmonotonic relationship between corruption and income, and Swaleheen (2011) discovers that while the corruption indicator negatively influences growth, the square of this index affects growth positively. De Vaal and Ebben (2011) show theoretically that the impact of corruption on growth depends on the level of institutional quality of countries, and Cerqueti et al. (2012) also demonstrate, with the help of a simple theoretical model, that corruption has a nonlinear influence on economic performance. Ahmad et al. (2012) detect a quadratic relationship between corruption and growth. Using a GMM estimation, these authors show that the relationship between corruption and economic growth is inverted U-shaped. Saha and Gounder (2013) also find nonlinearities in the relationship between corrupt behaviors and income by applying hierarchical polynomial regression. Finally, Saha et al. (2017) use panel fixed effects and the generalized method of moments to show that corruption fosters economic growth up to a certain limit, and thereafter it impacts growth negatively [2].

A common feature of the empirical studies mentioned above is that they try to find out the direct effect of institutional indicators on economic growth, carrying out cross-section and panel data regressions in which the institutional variable is included as an additional determinant. Some contributions have shown that the effects of corruption on economic growth occur indirectly and apply the regression tree semiparametric method to endogenously unravel these effects. The procedure consists of detecting an unknown number of sample splits based on multiple control variables [3].

Under the regression tree procedure, researchers try to elucidate whether corruption affects growth by modifying the coefficients of the remaining variables of the growth equation; i.e. whether corruption introduces parameter heterogeneity into the empirical model. Minier (2007) applies this splitting methodology in a sample of 57 countries for the period 1960–2000 and does not find strong evidence that institutions, proxied by executive constraints, affect growth indirectly by altering the relationship between growth and its other determinants. However, using a more traditional dummy variable approach (and without addressing the endogeneity problem), she does find that institutional quality influences how policy variables, especially trade openness, affect growth. Tan (2010) applies regression tree analysis to show that high-quality institutions help to mitigate the negative impact of ethnic fractionalization on growth. However, this author does not take into account the direct role of specific institutional indicators, such as corruption and the rule of law, on economic growth, which is one of the two main focuses of analysis of this paper.

We apply the regression tree method by enlarging the basic growth model of Solow with an indicator of corruption to assess the potentially direct and/or indirect effects of that variable on economic growth. We also use indicators of the rule of law as alternative potential splitting variables. In the next section, we explain the econometric approach that we apply in the empirical estimations, which also addresses the endogeneity issue.

3. Econometric model and methodology

Our main objective is to analyze both the direct and the indirect impact of corruption on growth. Using as a starting point the traditional growth model used in seminal studies (such as Barro and Sala-i-Martin, 1992 and Mankiw et al., 1992), we assess whether corruption leads to different models across countries. In addition, we augment the traditional model with corruption variables to test their possible direct influence on economic growth.

As far as estimation is concerned, the main novelty with respect to the classical growth model is that we do not estimate one single model for all countries. Instead, we use the regression tree approach, which consists of fitting the same growth model specification to different subgroups of countries of the sample, obtaining different coefficient estimations for each subgroup. This methodology splits the sample into different groups of countries according to endogenously determined threshold values of specific variables called split variables. So, after applying this technique, we obtain different estimations of the same growth model for different groups of countries.

We propose a classical growth equation for each country i belonging to a given subgroup. These subgroups are mutually exclusive and are defined according to the combinations of values of the split variables. To simplify, let us imagine there are two split variables, X1 and X2, and two thresholds, t1 and t2, respectively. Figure 1 illustrates a possible splitting: the whole sample is split into two groups depending on whether the value of X1 is below or above the threshold t1. In addition, the group of countries with X1 ≤ t1 is split again, but in this case the split variable is X2 and the threshold value is t2. Note that this second splitting could instead have taken place according to a new threshold value, t3 < t1, of X1. In the example, we use X2 to illustrate that the methodology allows for the existence of interactions between split variables. This example ends up with three subgroups called, in Figure 1, Model 1, Model 2 and Model 3.

It is also important to note that the groups of countries with X1 > t1 could, in turn, be further split into new groups according to new thresholds for X1 and/or X2. The criterion to stop splitting is to achieve a preset minimum sample size for the final subgroups.
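The illustrative tree of Figure 1 can be read as a simple decision rule. The sketch below is only an illustration: the function name, the threshold values, the country values and the mapping of leaves to the labels Model 1, Model 2 and Model 3 are assumptions, since the text does not state which leaf carries which label.

```python
def assign_model(x1, x2, t1, t2):
    """Return the leaf of the illustrative two-variable tree for one country.

    The leaf labels are hypothetical; only the split logic follows the text.
    """
    if x1 > t1:
        # This branch is not split further in the example, although it could be.
        return "Model 3"
    # Countries with x1 <= t1 are split again, this time on x2.
    return "Model 1" if x2 <= t2 else "Model 2"


# Hypothetical thresholds and country values, for illustration only.
t1, t2 = 3.0, 0.0
countries = {"A": (2.0, -1.0), "B": (2.5, 0.8), "C": (4.5, 1.2)}
groups = {name: assign_model(x1, x2, t1, t2) for name, (x1, x2) in countries.items()}
print(groups)
```

Because the second split uses X2 rather than X1, the groups depend on an interaction between the two split variables, which is the point the example is meant to convey.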

Using the notation of Tan (2010), our growth equation is as follows:

(1) $g_i = \alpha_j + \beta_{j0}\ln(y_{i0}) + \beta_{jk}\ln(I_{ik}) + \beta_{jn}\ln(n_i + \delta + \zeta) + \beta_{jC}\ln(C_i) + \varepsilon_i$ for $j = 1, \ldots, m$
where $g_i$ is the average annual growth rate of country i measured as the difference in log per capita real GDP between the initial and final year of the sample; $y_{i0}$ is the initial real GDP value of country i; $I_{ik}$ corresponds to the average ratio of investment to GDP over the period of the study in country i; $n_i + \delta + \zeta$ is the sum of the population growth rate of country i over the period, the depreciation rate for physical and human capital (set at the traditional value of 0.05) and the rate of exogenous technological growth; finally, $C_i$ is the corruption indicator of country i.

In this equation, i is the individual country index (with a total of N countries in the full sample) and j is the country group index (with a total of m groups of countries detected in the analysis). This notation reflects that m different models will be estimated: as many as the number of subgroups of countries the method detects; it also indicates that all the countries of a given group share the same model.
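To make the notation concrete, the following minimal sketch estimates equation (1) by ordinary least squares separately within each group j. It runs on synthetic, noise-free data rather than the data set used in the paper. The function names, the coefficient values, the use of 0.05 for the whole term added to population growth and the assignment of groups by the median of the corruption indicator are all assumptions made for illustration. It also assumes a strictly positive corruption indicator so that the logarithm is defined.

```python
import numpy as np


def design_matrix(y0, inv, n, corr, delta_plus_zeta=0.05):
    """Build the regressors of equation (1): a constant and four logged variables."""
    return np.column_stack([
        np.ones_like(y0),
        np.log(y0),                   # initial GDP
        np.log(inv),                  # investment ratio
        np.log(n + delta_plus_zeta),  # population growth plus assumed constant
        np.log(corr),                 # corruption indicator (must be positive)
    ])


def fit_by_group(g, X, groups):
    """Estimate the same specification by OLS separately within each group j."""
    coefs = {}
    for j in np.unique(groups):
        mask = groups == j
        beta, *_ = np.linalg.lstsq(X[mask], g[mask], rcond=None)
        coefs[int(j)] = beta
    return coefs


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
N = 80
y0 = rng.uniform(500.0, 20000.0, N)
inv = rng.uniform(0.10, 0.40, N)
n = rng.uniform(0.00, 0.03, N)
corr = rng.uniform(1.0, 6.0, N)
groups = (corr > np.median(corr)).astype(int)  # hypothetical two-group split

X = design_matrix(y0, inv, n, corr)
true_beta = {  # arbitrary group-specific coefficients
    0: np.array([0.30, -0.02, 0.04, -0.03, 0.01]),
    1: np.array([0.10, -0.01, 0.02, -0.01, 0.00]),
}
g = np.where(groups == 0, X @ true_beta[0], X @ true_beta[1])

estimates = fit_by_group(g, X, groups)
for j, beta in estimates.items():
    print(j, np.round(beta, 4))
```

Each group returns its own coefficient vector, which is what the index j on the parameters of equation (1) expresses.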

In our particular case, we have selected four variables as appropriate candidates to distinguish between different growth models for distinct groups of countries (i.e. to identify and build groups of countries which might share a common model): two indicators related to corruption and two indicators that take into account the legal and the judicial framework. Our estimation procedure determines endogenously whether these variables define different growth models for different groups of countries and classifies each country i in one of the m groups, depending on whether the values taken by these variables in each country are above or below an endogenously determined threshold value.

We estimate a cross-section regression for an averaged sample period from 1996 to 2017. The data used to estimate the equation are derived from two different data sets. Corruption (referring to the level of corruption) and Rule of Law are obtained from the International Country Risk Guide (ICRG) developed by the PRS Group, while Control of Corruption and another index of Rule of Law are extracted from the Worldwide Governance Indicators developed by the World Bank. The ICRG Corruption variable ($Cor_i$) assesses the level of corruption within the political system, in the form of bribes and special payments as well as nepotism, job reservations or secret party funding. It varies between 0 and 6, where 0 means the greatest degree of corruption. We also use the variable Control of Corruption ($CC_i$) provided by the World Bank. This variable captures the perceptions of citizens about how public agents use public resources to obtain private gains. It varies between −2.5 and 2.5, where −2.5 is the worst case, in which there is no control of corruption. So, in both indicators related to corruption, the lower the value of the indicator, the worse the institutional quality of the country as far as corruption is concerned.

The Rule of Law variable derived from ICRG evaluates the quality of the legal and judicial systems and crime rates in each country. It varies from 0 to 6 where 0 means the worst performance. Rule of Law from the World Bank assesses the quality of contract enforcement, property rights, the quality of judicial system and crime and violence. It varies from −2.5 to 2.5, where 2.5 is the best possible performance. Again, both indicators can be interpreted similarly: the lower the value of the indicator, the worse the institutional quality in terms of Rule of Law.

4. Regression tree analysis

As mentioned, the regression tree analysis splits the sample into various subsamples. As depicted in Figure 1, this splitting process generates a structure similar to that of a tree. The starting point consists of estimating the model on the whole sample; the sample is then progressively split into subsamples according to the values of one or more threshold variables, until no additional splitting is possible or required. Finally, a different model is estimated for each group of countries detected. This methodology is appropriate for our purpose as it allows us to detect different growth models throughout the entire sample. In addition, it does so by detecting thresholds endogenously, thus avoiding the ad hoc inclusion of countries in subsamples to study their growth model and the consequent misspecification biases within each group. Furthermore, it allows us to deal with outliers and, in some cases, with heteroskedasticity, since it splits the sample into homogeneous subgroups of countries. Finally, Breiman et al. (1984) demonstrate that this method is consistent, i.e. regression trees replicate the true splits when the number of observations gets large.

The first implementation of the regression tree methodology made use of the AID algorithm [Morgan and Sonquist (1963); Fielding and O’Muircheartaigh (1977)]. The CART algorithm was later developed by Breiman et al. (1984) to address some of the weaknesses of the original algorithm. However, an important drawback of CART derives from the selection bias caused by the use of a greedy algorithm. In our case, to minimize this bias, we instead make use of the GUIDE methodology (Generalized Unbiased Interaction Detection and Estimation; Loh, 2002). GUIDE addresses this problem by replacing the greedy algorithm used in CART for selecting the split variable with an LM test of linear fit. The algorithm applies this LM test to each possible threshold variable, selecting the candidate with the lowest p-value.

To obtain the splitting groups, GUIDE consists of several steps:

  • First, it runs an ordinary least squares regression on the whole sample and obtains the residuals.

  • Second, for each candidate split variable, it creates a contingency table that cross-tabulates the sign of the residuals (positive or negative) against the quartiles of the candidate variable.

  • Finally, for each split variable candidate, it applies a chi-square test for linear fit (goodness-of-fit test). The split candidate with the lowest p-value (i.e. for which the rejection of linearity is strongest) is selected to split the data into two new groups of countries.

The threshold value of the split variable is the one which minimizes the joint residual sum of squared errors. This procedure is applied iteratively and the splitting process stops when the number of observations in one of the two newly created subgroups reaches a preset reasonable minimum value. The objective, when this value is selected, is to eliminate the risk of creating subgroups within which the number of observations is too low to generate reliable estimations.
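The steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the GUIDE software itself: all function names are hypothetical, the data are synthetic, and the candidate is chosen by the largest Pearson chi-square statistic instead of the lowest p-value. The two rules coincide here because every candidate yields a 2×4 table with the same degrees of freedom, which in turn assumes continuous candidates with no empty quartile.

```python
import numpy as np


def ols_residuals(X, y):
    """Residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def sign_quartile_chi2(resid, candidate):
    """Chi-square statistic of the 2x4 table: residual sign by candidate quartile."""
    cuts = np.quantile(candidate, [0.25, 0.50, 0.75])
    col = np.searchsorted(cuts, candidate, side="left")  # quartile index 0..3
    row = (resid > 0).astype(int)                        # 1 if residual is positive
    table = np.zeros((2, 4))
    np.add.at(table, (row, col), 1)
    expected = (table.sum(axis=1, keepdims=True)
                * table.sum(axis=0, keepdims=True) / table.sum())
    return float(((table - expected) ** 2 / expected).sum())


def best_threshold(X, y, split, min_size):
    """Threshold minimizing the joint residual sum of squares of the two subgroups."""
    best_ssr, best_t = np.inf, None
    for t in np.unique(split)[:-1]:
        left = split <= t
        if left.sum() < min_size or (~left).sum() < min_size:
            continue  # respect the preset minimum subgroup size
        r_left = ols_residuals(X[left], y[left])
        r_right = ols_residuals(X[~left], y[~left])
        ssr = float(r_left @ r_left + r_right @ r_right)
        if ssr < best_ssr:
            best_ssr, best_t = ssr, t
    return best_t


def choose_split(X, y, candidates, min_size=10):
    """Select the split variable, then its threshold."""
    resid = ols_residuals(X, y)
    stats = {name: sign_quartile_chi2(resid, z) for name, z in candidates.items()}
    name = max(stats, key=stats.get)
    return name, best_threshold(X, y, candidates[name], min_size)


# Synthetic example: the model changes when z crosses 0.5, while w is irrelevant.
rng = np.random.default_rng(1)
N = 200
x, z, w = rng.uniform(0, 1, N), rng.uniform(0, 1, N), rng.uniform(0, 1, N)
y = np.where(z <= 0.5, x, 5.0 + 3.0 * x) + rng.normal(0, 0.1, N)
X = np.column_stack([np.ones(N), x])

split_name, threshold = choose_split(X, y, {"z": z, "w": w})
print(split_name, round(float(threshold), 3))
```

Applying `choose_split` again within each resulting subgroup, until the minimum size is reached, would reproduce the iterative splitting described in the text.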

To sum up, we estimate equation (1) in disjoint subsets of countries which share similar levels of one or more split variables and obtain estimations for each differing subset. This allows us to analyze how the impact of the determinants of economic growth differs from one group of countries to another and to what extent the intensity of corruption (or the quality of the Rule of Law) can affect the growth process.

Once the subgroups have been obtained, we complete the analysis by addressing the potential endogeneity problem of the institutional indicators by estimating the model in each subsample by instrumental variables (the instruments will be described in detail below). As Tan (2010) argues, the splitting process to determine homogeneous groups of countries looks for general patterns and not for causal relationships. Addressing the endogeneity problem in a second step, i.e. once the subsamples have been identified and we are much more interested in the coefficients themselves within each group, is therefore a defensible strategy in a literature that very often fails to address the endogeneity problem adequately.
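The second-stage estimation within a subgroup can be illustrated with a minimal two-stage least squares sketch. It is not the estimation code of the paper: the function names are hypothetical, the data are synthetic, and the two generic instruments z1 and z2 merely stand in for the instruments used in the study. The sample is deliberately large so that the contrast with OLS is visible, and only point estimates are computed.

```python
import numpy as np


def ols(X, y):
    """OLS coefficient vector."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def two_stage_least_squares(y, X_exog, x_endog, Z):
    """2SLS point estimates; the coefficient on the endogenous regressor comes last."""
    W = np.column_stack([X_exog, Z])  # exogenous regressors plus instruments
    x_hat = W @ ols(W, x_endog)       # first stage: fitted endogenous regressor
    return ols(np.column_stack([X_exog, x_hat]), y)  # second stage


# Synthetic data: u is an unobserved factor that affects both x and y.
rng = np.random.default_rng(2)
N = 5000
w = rng.normal(size=N)   # exogenous control
z1 = rng.normal(size=N)  # stand-in instrument 1
z2 = rng.normal(size=N)  # stand-in instrument 2
u = rng.normal(size=N)
x = z1 + z2 + u          # endogenous regressor
y = 1.0 + 0.5 * w + 2.0 * x + u + 0.5 * rng.normal(size=N)

X_exog = np.column_stack([np.ones(N), w])
beta_ols = ols(np.column_stack([X_exog, x]), y)
beta_iv = two_stage_least_squares(y, X_exog, x, np.column_stack([z1, z2]))
print("OLS:", np.round(beta_ols, 3))
print("IV: ", np.round(beta_iv, 3))
```

In this construction the OLS coefficient on x is biased upward because x and the error share the factor u, whereas the instrumented estimate stays close to the true value of 2. Valid standard errors would require the usual 2SLS correction, which this sketch omits.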

5. Empirical results

Before we present and discuss the results, the way in which we have selected and used the data requires some comments. As already explained, we use four institutional indicators: Rule of Law and Corruption from ICRG, and Rule of Law and Control of Corruption from the World Bank. We use these indicators as follows: Corruption and Control of Corruption are alternatively used as both threshold and explanatory variables, while the Rule of Law indicators are used only as threshold variables. We therefore have four possible split variables, and the algorithm determines endogenously which of them is or are the most suitable to split the sample. Using several indicators that come from different sources and are measured somewhat differently avoids any bias in the results derived from the choice of one of them and makes the results more reliable. In addition, this way of using institutional indicators allows us to understand the impact, both direct and indirect, of corruption on economic growth without ignoring the indirect relationship between the legal and judicial framework and growth that some authors such as Berkowitz et al. (2003) and Neyapti (2013) have highlighted.

As far as model specifications are concerned, we consider, first of all, two general models that exhibit the greatest flexibility in the sense that, for each of them, the algorithm can select any of our institutional variables as a split variable. Since the simultaneous presence of both corruption indicators as explanatory variables must be excluded, we will use two general models as benchmarks to evaluate their results compared with those of simpler – though reasonable – alternatives, in search of robustness for our conclusions [4]. These models are: Benchmark model with Control of Corruption, which is the model where the explanatory variable is the Control of Corruption derived from the World Bank, and Benchmark model with Corruption Level, which is the model that includes Corruption itself extracted from ICRG as an explanatory variable. In each model, all four institutional indicators (the two Corruption indicators and the two Rule of Law indicators) are candidates to split the sample. For robustness, we estimate four additional models that are all nested in their “benchmark” counterpart. The underlying idea is as follows: if we find a repetitive pattern in the whole set of different models with different combinations of split variables, we can conclude that our results are robust.

In Table 1, we offer a summary of the different model specifications that we are going to use to determine the subgroups if they exist.

At this stage, the fact that the explanatory variables are potentially endogenous is not a matter of serious concern since this method aims more at detecting patterns than accurately measuring causal relationships, as explained by Tan (2010). Endogeneity will be tackled at the second stage of our analysis.

Table 2 presents the regression trees for growth models. We can observe that whenever the ICRG corruption indicator is allowed to split the sample, this indicator is systematically endogenously chosen as the relevant split variable and two groups of countries are detected; the only cases where the ICRG corruption indicator is not selected correspond to Models 3 and 4, in which this indicator is exogenously excluded from the beginning. Even when we allow the algorithm to choose between the ICRG and World Bank corruption indicators (both Benchmark models), the first one is always preferred. It is also worth noting that the same threshold value of the ICRG corruption indicator is estimated in all models where it is offered as a possible splitting variable. In other words, the same subgroups of countries are generated in these models. All this confers a strong robustness to our splitting results.

So, according to Table 2, corruption does affect economic growth in an indirect way by splitting the sample into different groups of countries which share the same growth model. As far as its direct influence is concerned, Tables 3 to 8 show the results of the regressions for the six different models. Each table includes the estimated coefficients using the whole sample in the second column, i.e. the estimation without subgroups and, in the following columns, the estimations for each detected subgroup. In parentheses below the coefficients, we include the p-values of significance of the estimated coefficient. It is important to note that Model 1 and Model 2 give rise to exactly the same estimated model because the difference between them is the capacity of selecting between one or the other corruption indicator as split variables. Since in both cases, ICRG corruption is the selected split variable, the same model is fitted and, moreover, it coincides with Benchmark model with Corruption Level.

The first thing that should be highlighted in our results is that the significance of the coefficients of the traditional determinants of growth, and their signs, in all models and all groups, are in line with the literature. In particular, their signs are the expected ones: capital accumulation fosters growth, the growth of population and the depreciation of capital erode it and, finally, the negative sign of the initial GDP per capita confirms that countries of the sample or subsample have experienced a catching-up process in terms of per capita income.

Turning now to the coefficients of the corruption indicators, let us first recall that the World Bank indicator refers to the intensity of the control of corruption and that the ICRG corruption indicator is defined in such a way that a higher score reflects less corruption in the country. So, the higher these indicators, the better the corruption situation of the country. With this in mind, Table 9 shows a summary of the sign and significance results for the coefficients of the corruption indicators in our estimations.

First of all, the corruption indicator has no significant effect for countries with low or middle levels of corruption. For the high-corruption countries, we detect a significant effect in four models out of six. However, three of these four models are exactly the same, as explained above. So, in fact, we have two different models where the corruption indicator is significant and two other models where its effect is nonsignificant. According to these results, we cannot establish whether or not corruption is significant for economic growth in these countries. However, these results could be affected by endogeneity problems; for this reason, this is the appropriate point to move to the second stage of our analysis and estimate both Benchmark models by Instrumental Variables. This will allow us to draw valid conclusions on the direct effect of corruption on growth in the two subgroups of countries identified, as well as to better analyze the indirect impact of the corruption situation on how the traditional determinants of growth affect it.

Our strategy to unveil causal effects in the Benchmark models consists of using instrumental variables for the corruption indicators and is twofold: on the one hand, we want to exploit the spatial autocorrelation in the data; on the other hand, we want to take advantage of the temporal dimension for both models. As a result, we use two instruments at the same time: the first one is the instrument developed by Gründler and Potrafke (2019), which is obtained from jack-knifed regional averages of corruption for each country. Following these authors, each continent R is divided into four disjoint regions [5] r ∈ R. The instrumental variable is calculated as follows:

where Nr is the number of countries that belong to each region r and Cj is one of two possible variables: the corruption level of country j, i.e. Corj from ICRG, or the control of corruption of country j, CCj, from the World Bank. Therefore, we have the same type of instrument calculated with two different indicators, one based on the World Bank variable used in the Benchmark model with Control of Corruption and another one based on the ICRG variable used in the Benchmark model with Corruption Level.
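The displayed formula is not reproduced in this version of the text. Under the usual leave-one-out reading of "jack-knifed", the instrument for country i in region r would be the average of Cj over the other Nr − 1 countries of the same region, i.e. (1/(Nr − 1)) Σ Cj for j in r, j ≠ i. This normalization is an assumption and should be checked against Gründler and Potrafke (2019). A minimal sketch of that computation, with hypothetical country and region labels:

```python
from collections import defaultdict

def jackknife_regional_mean(corruption, region):
    """Leave-one-out regional average.

    corruption: dict mapping country -> corruption score C_j
    region:     dict mapping country -> region label r
    Returns a dict mapping each country to the mean score of the *other*
    countries in its region (None if the region has a single country).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for country, score in corruption.items():
        totals[region[country]] += score
        counts[region[country]] += 1

    instrument = {}
    for country, score in corruption.items():
        r = region[country]
        if counts[r] < 2:
            instrument[country] = None        # undefined without neighbours
        else:
            # Remove the country's own score before averaging (assumed formula).
            instrument[country] = (totals[r] - score) / (counts[r] - 1)
    return instrument

# Hypothetical toy data, not taken from the paper.
scores = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
regions = {"A": "r1", "B": "r1", "C": "r1", "D": "r2"}
instrument = jackknife_regional_mean(scores, regions)
print(instrument)
```

Excluding the country's own score is what makes the regional average usable as an instrument: it reflects the regional environment without mechanically containing the endogenous variable itself.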

The second instrument for the corruption indicator of country i takes advantage of the availability of presample data, specifically for 1995, for the corruption indicator in the ICRG database. Presample data are not available in the World Bank database. Therefore, for each country i, the value of its ICRG corruption indicator in 1995 is used as a second instrument both for CCi in the Benchmark model with Control of Corruption and for Cori in the Benchmark model with Corruption Level.

The IV estimation of both models using both instruments is reported in Tables 10 and 11.
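As an illustration of estimating a model with one endogenous regressor and two instruments, here is a minimal two-stage least squares sketch on synthetic data. It is not the authors' code: the function name `two_stage_least_squares`, the simulated variables and the true coefficient of 0.3 are assumptions made for the example, and only point estimates are computed (valid 2SLS standard errors require a separate formula).

```python
import numpy as np

def two_stage_least_squares(y, X_exog, x_endog, Z):
    """2SLS point estimates with one endogenous regressor.

    X_exog:  exogenous regressors, including the constant
    x_endog: the endogenous regressor (e.g. a corruption indicator)
    Z:       excluded instruments (here two columns)
    The coefficient on the endogenous regressor is the last element returned.
    """
    W = np.column_stack([X_exog, Z])                  # full instrument set
    gamma, *_ = np.linalg.lstsq(W, x_endog, rcond=None)
    x_hat = W @ gamma                                  # first-stage fitted values
    X_second = np.column_stack([X_exog, x_hat])
    beta, *_ = np.linalg.lstsq(X_second, y, rcond=None)
    return beta

# Synthetic data: u is an unobserved factor that makes x endogenous.
rng = np.random.default_rng(1)
n = 20000
u = rng.normal(size=n)
z1 = rng.normal(size=n)                                # first instrument
z2 = rng.normal(size=n)                                # second instrument
w = rng.normal(size=n)                                 # exogenous control
x = 0.8 * z1 + 0.5 * z2 + u + rng.normal(scale=0.5, size=n)
y = 1.0 + 0.5 * w + 0.3 * x + 2.0 * u + rng.normal(size=n)

X_exog = np.column_stack([np.ones(n), w])
beta_iv = two_stage_least_squares(y, X_exog, x, np.column_stack([z1, z2]))
beta_ols, *_ = np.linalg.lstsq(np.column_stack([X_exog, x]), y, rcond=None)
print("OLS:", beta_ols[-1], "2SLS:", beta_iv[-1])
```

In the setting of this paper, such an estimator would be applied separately to each subgroup identified by the regression tree, with the regional jack-knifed average and the 1995 ICRG value playing the role of the two instruments.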

Once endogeneity is taken into account, the results are unambiguous: there is no significant direct effect of corruption on economic growth in any of the subgroups. However, the coefficients of the traditional determinants of growth differ from one subgroup to the other. So, since the splitting of the sample is generated by the corruption level of countries, the incidence of the corruption variable is not direct, but reflected in the value of the estimated coefficients. In other words, although the growth model has a standard specification for all countries, the influence of the traditional determinants of growth differs according to the corruption level of the countries. An obvious implication is that using a single model for all countries, as done in several well-known studies, carries a high risk of serious specification problems, since that approach does not capture the different behaviors associated with corruption and, consequently, ignores the indirect effects of this variable on economic growth.

Comparing the results of both benchmark models, we can also see that the coefficients of both groups are very similar to each other, which is to be expected considering that the only difference between the two is an explanatory variable that is not significant. These results demonstrate the robustness of the coefficients obtained using instrumental variables estimation with the two instruments used. However, within each benchmark model, we can see that there is a substantial difference between the Solow model for highly corrupt countries and for less corrupt countries.

For instance, the coefficients associated with investment in the model with control of corruption (Table 10) are 0.811 and 0.653, respectively, indicating that a 1% increase in the investment ratio generates a greater impact in corrupt countries than in those with better control of corruption. This has to be interpreted in the light of the private capital stock as a percentage of GDP in the two groups of countries: as an illustration, in the group of higher corruption it amounts, for example, to 109.9 in Nigeria, 86.40 in Cameroon or 53.31 in Burkina Faso in 2010, whereas in the group of lower corruption it amounts to 206.8 in Canada, 220.7 in France or 250.2 in Japan for the same year (IMF data, Investment and Capital Stock Data set, 06/15/2022 update). Given these differences, it should be no surprise that a given increase in the investment ratio has a higher impact on growth in more corrupt countries than in less corrupt ones.
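The comparison of the two investment coefficients can be made explicit with a small calculation. Because investment enters in logs, a 1% increase in the investment ratio raises Ln(Inv) by ln(1.01), and the implied change in the dependent variable of the growth equation (whose exact definition is not restated in this section) is the coefficient times that amount. The variable names below are illustrative.

```python
import math

# IV coefficients on Ln(Inv) reported in Table 10.
coef_high_corruption = 0.811
coef_low_corruption = 0.653

# A 1% rise in the investment ratio changes Ln(Inv) by ln(1.01).
delta_log_inv = math.log(1.01)

effect_high = coef_high_corruption * delta_log_inv
effect_low = coef_low_corruption * delta_log_inv
ratio = effect_high / effect_low   # relative size of the two responses

print(f"high-corruption group: {effect_high:.5f}")
print(f"low-corruption group:  {effect_low:.5f}")
print(f"ratio: {ratio:.3f}")
```

The ratio depends only on the two coefficients, so the response in the high-corruption group is roughly a quarter larger than in the low-corruption group for the same proportional change in investment.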

As another example of the difference in coefficients, the negative effect of population growth plus capital depreciation is larger in more corrupt countries (−0.968 vs −0.429, see Table 10). The same occurs in the Benchmark model with Corruption Level (Table 11): while a 1% increase in the investment ratio increases growth by 0.75 in countries with higher levels of corruption, this effect is 0.65 in low-corruption ones. The catching-up process is also stronger in high-corruption countries, and the term that includes the depreciation of capital and population growth also affects them more negatively.

Our results are consistent with the findings of Durlauf et al. (2001) in the sense that there is parameter heterogeneity in the Solow model. The method we adopt allows us to derive robust evidence that differences in the levels of corruption – which are not taken into account in traditional empirical growth models – can be major causes of the parameter heterogeneity in the Solow model. This heterogeneity suggests that the more corrupt a country, the farther away it stands from its steady state and, therefore, the greater the impact of the factors determining its economic growth.

Table 12 shows the country composition of each group of institutional quality for Benchmark Model with Control of Corruption and with Corruption level.

In general, we can detect three big geographical areas: Europe and North America with low corruption levels, Africa and Latin America with higher levels of corruption, linked to their colonial past [Acemoglu et al. (2001); LaPorta et al. (2008)], and Asia with a great heterogeneity of corruption.

It is important to highlight that China and Russia belong to the group with higher corruption. For the case of Russia, Levin and Satarov (2000) argue that corruption has been a burden reducing growth and slowing its transition to a market economy. The case of China is very different. This country grows quickly and seems to take advantage of its level of corruption. According to Larsson (2006), this difference is explained by the fact that these countries exhibit very different comparative advantages, and because corruption is more “organized” in China than in Russia. Since our analysis does not detect a direct impact of corruption on growth, these differences seem not to be significant in terms of economic growth, although a detailed analysis at the country level might shed some additional light on this aspect. We leave this issue for future research.

In summary, we find robust evidence that the level of corruption indirectly influences economic growth by altering the impact of growth determinants but does not have a direct impact on growth. We show that the pattern of growth is not unique and that in countries with high corruption, the traditional determinants have a greater effect on growth than in countries with less corruption, which tend to be more advanced countries. According to the results obtained, the most corrupt countries are farther away from their steady state than those that control their corruption more. All this evidence shows that corruption has an important indirect effect on the growth process of countries.

6. Concluding remarks

The main goal of this paper is to analyze the potential direct and/or indirect effect of corruption on the growth process of countries. To achieve our objective, we apply a machine learning technique, not frequently used in economics, known as regression tree analysis. We apply the algorithm to a Solow model equation augmented with corruption. The existing empirical literature on the effects of corruption on growth shows different and even contradictory results because it relies on methodologies with a major weakness: authors assume that all countries in the sample fit the same growth model. The methodology that we apply in this paper addresses and solves this drawback by allowing for indirect effects of corruption on economic growth. The algorithm used here splits the sample into different groups of countries according to their level of corruption and generates different estimations of the Solow model for each group. Moreover, we use instrumental variable estimations that address the potential endogeneity problem of institutional quality variables in growth models; the endogeneity issue had not been addressed in previous papers that used the regression tree method in this field.

We obtain two key empirical findings: first, corruption has no direct impact on economic growth, but it affects growth indirectly: the final impact of corruption is, indeed, reflected in the value of the estimated coefficients of the traditional Solow’s growth model. Second, we show that countries with high corruption are farther away from their steady state than those that control corruption more. Our findings indicate that the traditional determinants of the Solow model have a greater effect on these countries.

The composition of the subgroups determined endogenously in our empirical analysis reinforces the well-established pattern of more developed versus less developed countries. While more developed countries exhibit, in general, lower levels of corruption, the less developed ones show the opposite. According to our results, the higher-corruption countries should take advantage of the fact that the impact of the determinants of growth is relatively higher. More concretely, efforts toward less corruption combined with higher investment ratios and a better control of population growth might contribute to fostering their development.


Figure 1. Tree schematic

Table 1. Model specifications

ICRG rule of law ICRG corruption level World bank rule of law World bank control of corruption
Model Split variable Explanatory variable Split variable Explanatory variable Split variable Explanatory variable Split variable Explanatory variable
Benchmark with control of corruption x x x x x
Benchmark with corruption level x x x x x
Model 1 x x x
Model 2 x x x
Model 3 x x x
Model 4 x x x

Table 2. Regression trees for growth models

| Model | Regression tree | Final groups |
| --- | --- | --- |
| Benchmark with Control of Corruption | ICRG Corruption level ≤ 0.916 | |
| | ICRG Corruption level > 0.916 | |
| Benchmark with Corruption level | ICRG Corruption level ≤ 0.916 | |
| | ICRG Corruption level > 0.916 | |
| Model 1 | ICRG Corruption level ≤ 0.916 | |
| | ICRG Corruption level > 0.916 | |
| Model 2 | ICRG Corruption level ≤ 0.916 | |
| | ICRG Corruption level > 0.916 | |
| Model 3 | ICRG Rule of Law ≤ 1.150 | |
| | ICRG Rule of Law > 1.150 | World Bank Control of Corruption ≤ 0.378 |
| | ICRG Rule of Law > 1.150 | World Bank Control of Corruption > 0.378 |
| Model 4 | World Bank Control of Corruption ≤ 0.269 | World Bank Rule of Law ≤ 0.484 |
| | World Bank Control of Corruption ≤ 0.269 | World Bank Rule of Law > 0.484 |
| | World Bank Control of Corruption > 0.269 | |

Table 3. Regression tree estimations for the benchmark model with control of corruption

| Determinants | Estimation without subgroups | High corruption group (Indicator ≤ 0.916) | Low corruption group (Indicator > 0.916) |
| --- | --- | --- | --- |
| Constant | −0.344 (0.360) | 0.351 (0.509) | −0.662 (0.269) |
| Ln(Invi) | 0.751*** (0.000) | 0.743*** (0.000) | 0.649*** (0.000) |
| Ln(Yi0) | −0.154*** (0.000) | −0.209*** (0.000) | −0.092** (0.021) |
| Ln(ni + δ + ζ) | −0.583*** (0.000) | −0.873*** (0.000) | −0.422*** (0.002) |
| CCi | 0.044 (0.136) | 0.162 (0.127) | 0.021 (0.588) |
| R² | 0.603 | 0.727 | 0.486 |
| N | 103 | 45 | 58 |

*** significance at 1%, ** at 5% and * at 10%

Table 4. Regression tree estimations for the benchmark model with corruption level

| Determinants | Estimation without subgroups | High corruption level group (Indicator ≤ 0.916) | Low corruption level group (Indicator > 0.916) |
| --- | --- | --- | --- |
| Constant | −0.539 (0.116) | 0.683 (0.884) | −0.837 (0.119) |
| Ln(Invi) | 0.748*** (0.000) | 0.736*** (0.000) | 0.651*** (0.000) |
| Ln(Yi0) | −0.139*** (0.000) | −0.200*** (0.000) | −0.064** (0.021) |
| Ln(ni + δ + ζ) | −0.587*** (0.000) | −0.900*** (0.000) | −0.432*** (0.000) |
| Ln(Cori) | −0.075 (0.324) | 0.037** (0.035) | −0.067 (0.560) |
| R² | 0.598 | 0.741 | 0.487 |
| N | 103 | 45 | 58 |

*** significance at 1%, ** at 5% and * at 10%

Table 5. Regression tree estimations for Model 1

| Determinants | Estimation without subgroups | High corruption level group (Indicator ≤ 0.916) | Low corruption level group (Indicator > 0.916) |
| --- | --- | --- | --- |
| Constant | −0.539 (0.116) | 0.683 (0.884) | −0.837 (0.119) |
| Ln(Invi) | 0.748*** (0.000) | 0.736*** (0.000) | 0.651*** (0.000) |
| Ln(Yi0) | −0.139*** (0.000) | −0.200*** (0.000) | −0.064** (0.021) |
| Ln(ni + δ + ζ) | −0.587*** (0.000) | −0.900*** (0.000) | −0.432*** (0.000) |
| Ln(Cori) | −0.075 (0.324) | 0.037** (0.035) | −0.067 (0.560) |
| R² | 0.598 | 0.741 | 0.487 |
| N | 103 | 45 | 58 |

*** significance at 1%, ** at 5% and * at 10%

Table 6. Regression tree estimations for Model 2

| Determinants | Estimation without subgroups | High corruption level group (Indicator ≤ 0.916) | Low corruption level group (Indicator > 0.916) |
| --- | --- | --- | --- |
| Constant | −0.539 (0.116) | 0.683 (0.884) | −0.837 (0.119) |
| Ln(Invi) | 0.748*** (0.000) | 0.736*** (0.000) | 0.651*** (0.000) |
| Ln(Yi0) | −0.139*** (0.000) | −0.200*** (0.000) | −0.064** (0.021) |
| Ln(ni + δ + ζ) | −0.587*** (0.000) | −0.900*** (0.000) | −0.432*** (0.000) |
| Ln(Cori) | −0.075 (0.324) | 0.037** (0.035) | −0.067 (0.560) |
| R² | 0.598 | 0.741 | 0.487 |
| N | 103 | 45 | 58 |

*** significance at 1%, ** at 5% and * at 10%

Table 7. Regression tree estimations for Model 3

| Determinants | Estimation without subgroups | Low institutional quality group | Medium institutional quality group | High institutional quality group |
| --- | --- | --- | --- | --- |
| Constant | −0.344 (0.360) | 0.378 (0.504) | −1.292* (0.069) | 1.447 (0.209) |
| Ln(Invi) | 0.751*** (0.000) | 0.718*** (0.000) | 0.911*** (0.000) | 0.399* (0.077) |
| Ln(Yi0) | −0.155*** (0.000) | −0.210*** (0.000) | −0.108** (0.020) | −0.228** (0.013) |
| Ln(ni + δ + ζ) | −0.583*** (0.000) | −1.081*** (0.000) | −0.470*** (0.000) | −0.113 (0.611) |
| Ln(Cori) | 0.044 (0.137) | 0.066 (0.137) | −0.114 (0.369) | 0.006 (0.928) |
| R² | 0.603 | 0.655 | 0.760 | 0.414 |
| N | 103 | 40 | 30 | 33 |

Corruption level indicator a priori excluded as a split variable; *** significance at 1%, ** at 5% and * at 10%

Table 8. Regression tree estimations for Model 4

| Determinants | Estimation without subgroups | Low institutional quality group | Medium institutional quality group | High institutional quality group |
| --- | --- | --- | --- | --- |
| Constant | −0.344 (0.360) | 0.366 (0.493) | −0.891 (0.299) | −1.785* (0.069) |
| Ln(Invi) | 0.751*** (0.000) | 0.619*** (0.000) | 0.968*** (0.000) | 0.471** (0.017) |
| Ln(Yi0) | −0.155*** (0.000) | −0.177*** (0.000) | −0.166** (0.016) | −0.295*** (0.000) |
| Ln(ni + δ + ζ) | −0.583*** (0.000) | −0.896*** (0.000) | −0.783*** (0.000) | −0.016 (0.906) |
| CCi | 0.044 (0.137) | 0.084 (0.493) | −0.168 (0.407) | 0.009** (0.049) |
| R² | 0.603 | 0.680 | 0.722 | 0.551 |
| N | 103 | 35 | 30 | 38 |

Corruption level indicator a priori excluded as a split variable; *** significance at 1%, ** at 5% and * at 10%

Table 9. Significance and signs of estimated coefficients for corruption indicators by groups of countries

| Model | Groups with high corruption | Groups with middle corruption | Groups with low corruption |
| --- | --- | --- | --- |
| Benchmark Control of Corruption | Non-significant | | Non-significant |
| Benchmark Corruption Level | Significant | | Non-significant |
| Model 1 | Significant | | Non-significant |
| Model 2 | Significant | | Non-significant |
| Model 3 | Non-significant | Non-significant | Non-significant |
| Model 4 | Significant | Non-significant | Non-significant |

*** significance at 1%, ** at 5% and * at 10%

Table 10. Instrumental variables estimation of benchmark model with control of corruption

| Determinants | High corruption level group (Indicator ≤ 0.916) | Low corruption level group (Indicator > 0.916) |
| --- | --- | --- |
| Constant | 0.061 (0.947) | −0.938 (0.198) |
| Ln(Invi) | 0.811*** (0.000) | 0.653*** (0.000) |
| Ln(Yi0) | −0.216*** (0.000) | −0.061 (0.293) |
| Ln(ni + δ + e) | −0.968*** (0.000) | −0.429*** (0.000) |
| Ln(CCi) | −0.066 (0.870) | −0.019 (0.759) |
| R² | 0.682 | 0.474 |
| N | 45 | 58 |

*** significance at 1%, ** at 5% and * at 10%

Table 11. Instrumental variables estimation of benchmark model with corruption level

| Determinants | High corruption level group (Indicator ≤ 0.916) | Low corruption level group (Indicator > 0.916) |
| --- | --- | --- |
| Constant | 0.122 (0.829) | −0.815 (0.141) |
| Ln(Invi) | 0.750*** (0.000) | 0.651*** (0.000) |
| Ln(Yi0) | −0.217*** (0.000) | −0.068* (0.071) |
| Ln(ni + δ + ε) | −0.950*** (0.000) | −0.427*** (0.000) |
| Ln(Cori) | 0.246 (0.804) | −0.057 (0.729) |
| R² | 0.719 | 0.485 |
| N | 45 | 58 |

*** significance at 1%, ** at 5% and * at 10%

Table 12. Groups of countries for both benchmark models

High corruption group: Algeria, Angola, Argentina, Albania, Armenia, Belarus, Bangladesh, Bolivia, Burkina Faso, Cameroon, China, Rep. Congo, Cote d’Ivoire, Dominican Republic, Egypt, Gabon, Ghana, Guatemala, Guinea-Bissau, Honduras, Indonesia, India, Jamaica, Kenya, Latvia, Mali, Malawi, Mexico, Mozambique, Niger, Nigeria, Pakistan, Papua New Guinea, Panama, Paraguay, Philippines, Russia, Sierra Leone, Togo, Thailand, Tunisia, Turkey, Uganda, Ukraine, Zimbabwe

Low corruption group: Australia, Austria, Bahrain, Belgium, Botswana, Brazil, Bulgaria, Canada, Chile, Colombia, Costa Rica, Croatia, Czech Republic, Denmark, Estonia, Ecuador, El Salvador, Finland, France, Gambia, Germany, Greece, Hong Kong, Hungary, Iceland, Ireland, Israel, Italy, Japan, Jordan, Korea, Rep., Kuwait, Lithuania, Luxembourg, Malaysia, Madagascar, Morocco, Namibia, Netherlands, Nicaragua, New Zealand, Norway, Peru, Poland, Portugal, Romania, Singapore, Senegal, Slovak Republic, Slovenia, Spain, Sri Lanka, Sweden, Switzerland, UK, USA, Uruguay, Zambia



1. Following Berkowitz et al. (2003) and Neyapti (2013), we do not consider the rule of law indicator as a potential explanatory variable. They, indeed, highlight that its effect is only indirect.


2. Previous papers also addressed nonlinearity through regime-dependent models, with different determinants of the regime changes. Méon and Sekkat (2005) find that the negative impact of corruption decreases with the quality of governments. Aidt et al. (2008) and Méon and Weill (2010) conclude that corruption has regime-specific effects on growth, with a weaker impact in countries with poorer institutional quality.


3. This indirect effect is also documented in Gwartney et al. (2006), Minier (2007) and Dort et al. (2014), who find that the quality of institutions has an impact on the marginal effect of investment on growth.


4. In this respect, we follow the strategy recommended by Tan (2010).


5. The classifications of regions come from Gründler and Krieger (2016).


Acemoglu, D., Naidu, S., Restrepo, P. and Robinson, J.A. (2019), “Democracy does cause growth”, Journal of Political Economy, Vol. 127 No. 1, pp. 47-100.

Acemoglu, D., Johnson, S. and Robinson, J.A. (2001), “The colonial origins of comparative development: an empirical investigation”, American Economic Review, Vol. 91 No. 5, pp. 1369-1401.

Ahmad, E., Ullah, M.A. and Arfeen, M.I. (2012), “Does corruption affect economic growth?”, Latin American Journal of Economics, Vol. 49 No. 2, pp. 277-305, doi: 10.7764/LAJE.49.2.277.

Aidt, T.S. (2009), “Corruption, institutions, and economic development”, Oxford Review of Economic Policy, Vol. 25 No. 2, pp. 271-291.

Aidt, T., Dutta, J. and Sena, V. (2008), “Governance regimes, corruption and growth: theory and evidence”, Journal of Comparative Economics, Vol. 36 No. 2, pp. 195-220.

Alcala, F. and Ciccone, A. (2004), “Trade and productivity”, The Quarterly Journal of Economics, Vol. 119 No. 2, pp. 613-646.

Barro, R.J. and Sala-I-Martin, X. (1992), “Convergence”, Journal of Political Economy, Vol. 100 No. 2, pp. 223-251.

Berkowitz, D., Pistor, K. and Richard, J.F. (2003), “The transplant effect”, The American Journal of Comparative Law, Vol. 51 No. 1, p. 163.

Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984), Classification and Regression Trees, Wadsworth and Brooks, Monterey, CA.

Campos, N.F., Dimova, R.D. and Saleh, A. (2010), “Whither corruption? A quantitative survey of the literature on corruption and growth”, CEPR Discussion Paper No. DP8140, available at SSRN:

Cerqueti, R., Coppier, R. and Piga, G. (2012), “Corruption, growth and ethnic fractionalization: a theoretical model”, Journal of Economics, Vol. 106 No. 2, pp. 153-181.

de Vaal, A. and Ebben, W. (2011), “Institutions and the relation between corruption and economic growth”, Review of Development Economics, Vol. 15 No. 1, pp. 108-123, doi: 10.1111/j.1467-9361.2010.00596.x.

Dollar, D. and Kraay, A. (2002), “Growth is good for the poor”, Journal of Economic Growth, Vol. 7 No. 3, pp. 195-225.

Dollar, D. and Kraay, A. (2003), “Institutions, trade, and growth”, Journal of Monetary Economics, Vol. 50 No. 1, pp. 133-162.

Dort, T., Méon, P.G. and Sekkat, K. (2014), “Does investment spur growth everywhere? Not where institutions are weak”, Kyklos, Vol. 67 No. 4, pp. 482-505.

Durlauf, S.N., Kourtellos, A. and Minkin, A. (2001), “The local Solow growth model”, European Economic Review, Vol. 45 No. 4-6, pp. 928-940.

Fielding, A. and O'Muircheartaigh, C.A. (1977), “Binary segmentation in survey analysis with particular reference to AID”, Journal of the Royal Statistical Society: Series D (the Statistician), Vol. 26 No. 1, pp. 17-28.

Glaeser, E.L., La Porta, R., Lopez-de-Silanes, F. and Shleifer, A. (2004), “Do institutions cause growth?”, Journal of Economic Growth, Vol. 9 No. 3, pp. 271-303.

Gründler, K. and Krieger, T. (2016), “Democracy and growth: Evidence from a machine learning indicator”, European Journal of Political Economy, Vol. 45, pp. 85-107.

Gründler, K. and Potrafke, N. (2019), “Corruption and economic growth: New empirical evidence”, European Journal of Political Economy, Vol. 60, p. 101810.

Gwartney, J.D., Holcombe, R.G. and Lawson, R.A. (2006), “Institutions and the impact of investment on growth”, Kyklos, Vol. 59 No. 2, pp. 255-273.

LaPorta, R., Lopez-de-Silanes, F. and Shleifer, A. (2008), “The economic consequences of legal origins”, Journal of Economic Literature, Vol. 46 No. 2, pp. 285-332.

Larsson, T. (2006), “Reform, corruption, and growth: Why corruption is more devastating in Russia than in China”, Communist and Post-Communist Studies, Vol. 39 No. 2, pp. 265-281.

Leff, N.H. (1964), “Economic development through bureaucratic corruption”, American Behavioral Scientist, Vol. 8, pp. 8-14. Reprinted in Heidenheimer, A.J., Johnston, M. and LeVine, V.T. (Eds) (1989), Political Corruption: A Handbook, Transaction Books, Oxford, pp. 389-403.

Levin, M. and Satarov, G. (2000), “Corruption and institutions in Russia”, European Journal of Political Economy, Vol. 16 No. 1, pp. 113-132.

Loh, W.Y. (2002), “Regression trees with unbiased variable selection and interaction detection”, Statistica Sinica, pp. 361-386.

Mankiw, N.G., Romer, D. and Weil, D.N. (1992), “A contribution to the empirics of economic growth”, The Quarterly Journal of Economics, Vol. 107 No. 2, pp. 407-437.

Méndez, F. and Sepúlveda, F. (2006), “Corruption, growth and political regimes: cross country evidence”, European Journal of Political Economy, Vol. 22 No. 1, pp. 82-98.

Méon, P.G. and Sekkat, K. (2005), “Does corruption grease or sand the wheels of growth?”, Public Choice, Vol. 122 No. 1-2, pp. 69-97.

Méon, P.G. and Weill, L. (2010), “Is corruption an efficient grease?”, World Development, Vol. 38 No. 3, pp. 244-259.

Minier, J. (2007), “Nonlinearities and robustness in growth regressions”, American Economic Review, Vol. 97 No. 2, pp. 388-392.

Morgan, J.N. and Sonquist, J.A. (1963), “Problems in the analysis of survey data, and a proposal”, Journal of the American Statistical Association, Vol. 58 No. 302, pp. 415-434.

Neyapti, B. (2013), “Modeling institutional evolution”, Economic Systems, Elsevier, Vol. 37 No. 1, pp. 1-16, doi: 10.1016/j.ecosys.2012.05.004.

North, D.C. (1990), Institutions, Institutional Change and Economic Performance, Cambridge University Press, Cambridge.

Rigobon, R. and Rodrik, D. (2005), “Rule of law, democracy, openness, and income”, The Economics of Transition, Vol. 13 No. 3, pp. 533-564.

Rodrik, D., Subramanian, A. and Trebbi, F. (2004), “Institutions rule: the primacy of institutions over geography and integration in economic development”, Journal of Economic Growth, Vol. 9 No. 2, pp. 131-165.

Saha, S. and Gounder, R. (2013), “Corruption and economic development nexus: variations across income levels in a non-linear framework”, Economic Modelling, Vol. 31, pp. 82-89, doi: 10.1016/j.econmod.2012.11.012.

Saha, S., Mallik, G. and Vortelinos, D. (2017), “Does corruption facilitate growth? A cross-national study in a non-linear framework”, South Asian Journal of Macroeconomics and Public Finance, Vol. 6 No. 2, pp. 178-193.

Sharma, C. and Mitra, A. (2019), “Corruption and economic growth: Some new empirical evidence from a global sample”, Journal of International Development, Vol. 31 No. 8, pp. 691-719.

Swaleheen, M. (2011), “Economic growth with endogenous corruption: an empirical study”, Public Choice, Vol. 146 Nos 1/2, pp. 23-41.

Tan, C.M. (2010), “No one true path: uncovering the interplay between geography, institutions, and fractionalization in economic development”, Journal of Applied Econometrics, Vol. 25 No. 7, pp. 1100-1127.

Ugur, M. (2014), “Corruption's direct effects on per‐capita income growth: a meta‐analysis”, Journal of Economic Surveys, Vol. 28 No. 3, pp. 472-490.

Corresponding author

Laura Lopez-Gomez can be contacted at:
