Inequality in tax evasion: the case of the Spanish income tax

Sara Torregrosa-Hetland (Department of Economic History, School of Economics and Management, Lund University, Lund, Sweden)

Applied Economic Analysis

ISSN: 2632-7627

Article publication date: 11 March 2020

Issue publication date: 30 July 2020



The purpose of this paper is to estimate tax evasion and its impact on progressivity, redistribution and the measurement of inequality, using microdata from the Spanish income tax for 2001-2004.


The approach follows Feldman and Slemrod (2007) by exploiting the relation of charitable donations with the composition of income but introduces two methodological innovations, which could be useful for further studies: correction for sample selection with a Heckman two-step setting and the calculation of different evasion rates for top incomes with an interaction term.


Evasion in capital incomes was significant throughout these years. Financial incomes were reported at around 50-70 per cent of their real value, with the lowest estimates corresponding to the top decile. Revenues from fixed capital display similarly low compliance rates for the top 10 per cent. Tax evasion in self-employment incomes (direct assessment) is estimated at 20 per cent for 2001. Mostly because of a composition effect, this means that fraud was higher at the top of the income distribution, thus having a regressive impact. Inequality statistics and top income concentration estimates should, therefore, be revised upwards.


This is the first paper to estimate the distributive impacts of tax evasion in Spain, and one of very few internationally.



Torregrosa-Hetland, S. (2020), "Inequality in tax evasion: the case of the Spanish income tax", Applied Economic Analysis, Vol. 28 No. 83, pp. 89-109.



Emerald Publishing Limited

Copyright © 2020, Sara Torregrosa Hetland.


Published in Applied Economic Analysis. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at

1. Introduction

Income inequality is at the centre of academic and societal debate nowadays, even more so since the Great Recession, and today we know much more about its levels and dynamics than a few decades ago. However, measurement is still complicated, particularly because of the difficulties in correctly including the extremes of the distribution. One challenge that remains unresolved arises from using tax data in contexts where evasion is or was prevalent.

Income tax statistics are indeed one of the main sources for the study of inequality; particularly for the period before the availability of representative household surveys (Atkinson and Piketty, 2007). To depict the income distribution correctly, these taxes should be general contributions (affecting everyone over a relatively low monetary threshold) and rest on a comprehensive income definition (considering all incomes from different economic activities). Real-world personal income taxes became general and comprehensive over time, and in that process, they also became one of the main funding sources of the state in developed countries: since 1965, they have represented an average of 9 per cent of GDP in the Organisation for Economic Co-operation and Development (OECD) countries, and around a third of public revenue[1].

However, in the presence of fraud and base erosion, many incomes might not be accurately assessed and taxed. They may be hidden from the authorities, exempted because of specific regulations or assessed well under their real value. If this affects a significant portion of income, and with a non-uniform distribution across the population, inequality might be quite different from what it appears to be in official statistics.

Tax evasion itself has been the object of an emerging strand of literature in recent years, much of which relates to inequality. Classic microeconomic models generally point towards lower compliance among high-income taxpayers because of their higher potential tax savings in the face of increasing marginal rates (Andreoni et al., 1998). Recent discussion has moved the emphasis to the capacity to evade, which, in turn, depends on the composition of income: because of withholding at source and third-party reporting, some incomes are better controlled by the tax administration, and taxpayers receiving those are more constrained in their behaviour (Kleven et al., 2011). In most historical and present contexts, this means that wage earners are less able to evade than the self-employed or recipients of capital incomes.

Both approaches point towards wealthy taxpayers potentially under-reporting their incomes to a greater extent – ultimately meaning that tax evasion reduces progressivity. However, the question is still largely unanswered by the empirical literature. A notable exception is Alstadsæter et al. (2019), who study Scandinavian countries in recent years using, among other data, leaks from tax havens. This paper makes a novel contribution by introducing another case study and method, which might be applied to further contexts. It estimates fraud in the Spanish personal income tax in the early 2000s, calculating evasion rates for each revenue source, and also for different income levels. Then, it approximates the effects on tax progressivity and the measurement of inequality.

The method proposes several innovations on Feldman and Slemrod (2007)’s strategy to estimate evasion by income source. Their model was based on the relationship between reported incomes and deducted donations, and was applied to tax return data from the USA. However, in countries where making donations is not very widespread (such as Spain, among others), this could result in a sample selection bias. A two-step Heckman procedure is, therefore, used to correct it. This novel approach could facilitate the analysis of fraud in other historical or present contexts with limited data – specifically where there is no access to tax inspection results. The paper also introduces an interaction term with the total income level, to establish whether under-reporting behaviour varies across the income scale. As will be seen, the findings are consistent with those obtained by Alstadsæter et al. (2019) for Scandinavia. However, as the kind of detailed data they use are not frequently available, it is useful to corroborate the results from different angles, and to check what the differences between countries might be. Spain usually scores high in studies about tax evasion, particularly in the personal income tax (Esteller, 2011; Domínguez-Barrero et al., 2015). However, it does not stand out in an international comparison when the level of development or the economic structure is taken into account. The conclusions are, therefore, of wider interest.

The paper speaks to different strands of economic literature. On the one hand, it addresses the above-mentioned discussion about tax evasion: do wealthy taxpayers, or capitalists, hide their incomes to a greater extent? How progressive are real-life income taxes? On the other hand, it contributes to the debate about the levels and dynamics of inequality. How much will our account of them be affected by considering hidden incomes? Top income estimates based on tax data might have to be adjusted upwards, potentially nuancing the great equalization of the twentieth century and the more recent surge in inequality or international comparisons. In Spain, as in other countries, tax data have been used to assess inequality, e.g. by Onrubia Fernández and Picos Sánchez (2013) for the general distribution and by Alvaredo and Saez (2009) for top income concentration[2].

The results show considerable evasion in capital incomes, particularly those from movable sources. In several instances, higher evasion rates are found at the top of the income distribution. Because of this, and especially because of a composition effect, fraud was higher at the top. The calculations reveal and quantify a negative impact of fraud on progressivity, redistribution, and inequality. Top income shares would need to be adjusted upwards by 2-3.5 percentage points, casting doubt on the accuracy of trends in inequality based on tax data for periods when evasion is known or expected to be high.

2. Theoretical framework and previous studies

The first tax evasion models considered the individual decision of a taxpayer in the face of risk. How much of her income would it be optimal to report to the tax authorities? In Allingham and Sandmo (1972), the individual seeks to minimize her tax bill, taking into account the possibility of being caught if evading and the severity of potential sanctions. Incentives to evade would be stronger at top income levels, and for taxpayers facing higher marginal tax rates. Extensions of this literature have elaborated on the relation between marginal rates, the income level and evasion, with conclusions about the impact of tax rates depending on the specific assumptions about risk aversion and the design of sanctions. These early “deterrence” models were shown to predict much higher levels of fraud than found in reality. Therefore, further work paid attention to other possible determinants of the reporting behaviour, such as tax morale (Luttmer and Singhal, 2014) or the role of withholding at source and third-party reporting (Kleven et al., 2011). Regarding the latter, when some of the taxpayer’s income is already known by the tax administration it is not a completely free decision whether to report it or not. Then, as is well-known, some kinds of revenue are, or have historically been, subject to better control than others; namely, income from dependent labour versus that from self-employment or capital. These all have very distinct distributions over the income scale. Based on this, the hypothesis of the paper is that income concealment was higher at the top of the taxpayer distribution[3].

Empirical studies on taxpayer behaviour started to point in the 1980s towards a positive association of income levels and fraud, even if with considerable econometric uncertainty (Clotfelter, 1983; Feinstein, 1991; Valdés, 1982; Raymond-Barà, 1987). Work addressed directly to study the distribution of evasion has also suggested that it would increase with income. Using randomly audited tax returns from the USA, Johns and Slemrod (2010) found fraud to reach maximum levels in the top percentiles; this was partially a result of the composition of incomes, but not exclusively. Bishop et al. (2000) pointed in the same direction for the 1980s, although their focus was on horizontal equity. Similar conclusions were obtained by Feldman and Slemrod (2007), who estimated under-reporting with un-audited data and found it to increase with income levels for capital and self-employment non-farm revenues. Other analyses, with emphasis on the distributional effects of fraud, are Alm et al. (1991), Matsaganis and Flevotomou (2010) or Benedek and Lelkes (2011)[4].

Recent studies undertaken for Scandinavian countries have shown income sheltering of different kinds to be concentrated at the top. Alstadsæter et al. (2016) deal with retained business earnings, which do not make it to individual tax returns: in a similar spirit as this paper, their calculations for Norway show that top income shares are underestimated when this is not considered. Alstadsæter and Jacob (2016, 2017) evidence how different tax minimization strategies are used mostly by high-income individuals (in Sweden). Finally, Alstadsæter et al. (2019) have demonstrated tax evasion to be strongly increasing with wealth levels in Norway, Sweden and Denmark. These studies are performed using very detailed microdata: tax registers on firms and individuals, random audits and leaked files from two large offshore financial institutions in Switzerland and Panama. Such high quality information is not always available for other countries or for historical periods, which is why an alternative is explored here.

Consistent with the previous literature, the hypothesis in this paper is that evasion and base erosion were more prevalent at the top of the taxpayer distribution because of a higher capacity to hide non-labour incomes (given withholding of wages and salaries). Therefore, the tax would have been less progressive in its operation than according to the regulations – as has been suggested by many, e.g. in Freire-Serén and Panadés (2008). The corollary is that real incomes would be more unequally distributed than reported incomes, and inequality indices would need to be corrected upwards – particularly, the estimates of income concentration at top percentiles.

3. Data and methodology

3.1 Data and concepts

We use the tax microdata from the Spanish Instituto de Estudios Fiscales (IEF), which is representative for the whole territory except for the Basque Country and Navarre (because of their specific fiscal arrangements). These files contain rich and detailed information. They provide us with the main monetary variables in the tax returns filed each year: incomes from different sources, deductions made for costs, other allowances and credits, total tax base, resulting tax due, etc. They also include the municipality of residence, gender and age. Some further information, like the number of children or dependent parents, can be inferred from deductions claimed.

The paper uses four yearly samples from the period 2001-2004, which contain nearly 400,000 observations each[5]. The years 1999-2000 are not included because age is missing in many observations. An attempt to use the files from the 1990s was precluded by the unavailability or unreliability of donations data, while earlier work has already covered more recent years (Domínguez-Barrero et al., 2015, 2017).

The tax data distinguishes five main types of incomes, arising from labour, movable capital, fixed capital, self-employment and other sources. In the original Spanish fiscal legislation, these were all integrated with the tax base (meaning that negative incomes of any kind would compensate revenues from any other source). Labour incomes include not only wages and salaries but also pensions. Included in movable capital are interest and dividend income, while fixed capital is composed of imputed incomes from home-ownership and revenues from renting real estate. Within self-employment income, we can distinguish those assessed with the direct method (i.e. accountancy-based) and those assessed using presumptive standards (under a certain revenue threshold). “Other” incomes were mostly realized capital gains. By 2001, those originated in long-term investment were named “special taxable base” and granted privileged fiscal treatment, with a single rate instead of the progressive schedule applied to the rest of income.

Some conceptual clarifications may be needed. The paper estimates “fraud and base erosion” as two ways for income to escape taxation. Fraud is, of course, illegal behaviour, while base erosion would not be (however vague the distinction often is). For the purposes here, what is important is that both are equivalent in reducing the tax burden on their recipients and hiding certain incomes from tax statistics. Both fraud and base erosion limit the capacity of the tax to fulfil the principles of generality and a comprehensive income definition, and therefore, our ability to gauge real economic inequality with tax data.

The concept of fraud is made equivalent here to income under-reporting and, therefore, includes neither non-filing nor abuse of tax credits[6]. Base erosion is used as equivalent to legal income under-assessment: the result of regulations that ascribe to certain flows a tax value below their economic value (which can be the result of political pressure)[7]. Both under-reporting and under-assessment are estimated together, and cannot be disentangled with the available data.

3.2 Methodology

3.2.1 The Feldman–Slemrod strategy: too generous to be true?

Feldman and Slemrod (2007) devised a method to estimate tax evasion by income source using taxpayers’ returns (instead of results from tax inspection). The calculation is based on Pissarides and Weber (1989)’s insight about relative under-reporting in household surveys: the self-employed were shown to be untruthful reporters of their income because of their seemingly higher expenditure in food relative to wage-earners (which were taken as reliable). In Feldman and Slemrod’s elaboration, the truthful category is no longer a type of individual, but an income source – labour; in turn, the “consumption” item is the amount of charitable donations, also assumed to be truthfully reported. We may think of many determinants of the income share that an individual would wish to donate, but this decision should likely not be influenced by whether her income was obtained as wages, self-employment revenues or interest. If we accept Feldman and Slemrod’s assumptions, we can estimate an equation of the following form:

(1) $\ln don_i = \alpha + \beta \ln\left(L_i + \sum_{j=1}^{5} k_j Y_{j,i} + k_6 N_i\right) + \gamma X_i + u_i$

where don_i are the donations made in the year by taxpayer i, and income appears decomposed by source: L from labour and Y all other types of positive income (the five categories are: movable capital, fixed capital, self-employment under direct assessment, self-employment under standard assessment and other incomes – mostly irregular flows, like capital gains). N represents negative incomes of all kinds. All revenues are defined as broadly as possible from the data (i.e. they are meant to represent the total yield, net of costs of obtainment but not of other tax allowances)[8].

X_i is a vector of control variables, which in our case includes age, gender, marital status, number of dependents, type of tax return and differential tax due before the deduction for donations, investment in housing, inequality, city size and regional dummies[9]. Notice that, in contrast to Feldman and Slemrod (2007) and much of the related literature, we do not include a variable representing the “price” of donating: this is because in Spain charitable contributions are treated as a tax credit (a given percentage of the donation is deducted from the tax bill), and not as a reduction from the taxable base, which implies that they are not affected by different marginal tax rates[10]. u_i is the error term.

The coefficients of interest for tax evasion are the ks: if they are significantly higher than one, 1/k will indicate the compliance ratio of each component (reported income/real income). Labour income is taken as the reference with better reporting, and therefore, has no corresponding k: all results should thus be interpreted as relative to the compliance in labour. Wages and salaries are expected to be better reported because of withholding at source, but hiding these incomes is certainly possible, with the collusion of the employer by paying part of salaries under the counter or as dividend income. It should be noted that our results in terms of differences between income sources do not crucially depend on the level of compliance in labour: if we assume wages to be reported at 100 per cent of their real amount, a k = 2 for movable capital, for example, would mean a compliance ratio of 50 per cent in these incomes, but if compliance in wages was set at 80 per cent we can adjust estimated compliance in capital to 0.5 × 0.8 = 40 per cent[11].
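The arithmetic in the preceding paragraph can be made concrete with a minimal sketch. The function name `compliance_ratio` and its `labour_compliance` parameter are illustrative assumptions, not part of the original estimation; the sketch only restates that the compliance ratio is 1/k, scaled by whatever compliance is assumed for labour.

```python
def compliance_ratio(k: float, labour_compliance: float = 1.0) -> float:
    """Reported/real income ratio implied by a coefficient k.

    k is the estimated under-reporting coefficient of an income source;
    labour_compliance is the (assumed) share of labour income reported,
    since all estimates are relative to labour.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return labour_compliance / k


# The two cases mentioned in the text, for k = 2 in movable capital.
print(compliance_ratio(2.0))        # labour assumed fully reported
print(compliance_ratio(2.0, 0.8))   # labour assumed reported at 80 per cent
```

The second call reproduces the 0.5 × 0.8 adjustment: the relative ranking of income sources is unchanged, only the level shifts.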

Some potential issues in the model are discussed next. First, taxpayers might over-report their donations to obtain an excessive tax credit; for example, Slemrod (1989) found an average overestimation of 7.2 per cent in audited returns from the USA. In Spain, third-party reporting in this deduction was introduced in 1999, which would support the reliability of the donated quantities reported, even if we cannot check the aggregates with independent statistics because they do not exist[12]. Secondly, non-realized capital gains might affect donations (as they are part of a Haig–Simons income concept) but are not reported in the tax data. If they are correlated with certain income types, the corresponding coefficients might be affected (e.g. capital). Similar biases might arise if non-deductible costs associated with labour are important (e.g. commuting expenses), which reduce the disposable incomes on the basis of which donating decisions are made. Finally, if donating decisions are made considering “permanent income” as well, the under-reporting coefficients obtained with current income might be biased upwards (Engström and Hagen, 2017)[13].

3.2.2 Controlling for sample selection.

In Spain, charitable donations are less widespread than in the USA, where the Feldman–Slemrod method was initially applied. Table I shows that returns with donations (s = 1) were around 13-16 per cent in the early 2000s and that the mean income of taxpayers claiming this deduction was significantly higher than that of the rest. This points towards a likely problem of sample selection bias if we ran equation (1) directly, using only the observations with donations (s = 1).

The problem can be tackled using a two-stage Heckman estimation[14]. In this case, a probit equation is run first, to explain the “donating or not” behaviour:

(2) $\mathrm{Prob}\left(s_i = 1 \mid \ln Income_i, Z_i\right) = \Phi\left(\alpha + \beta \ln Income_i + \gamma Z_i\right)$
s = 1 meaning that the taxpayer made a deductible donation during the year. Φ is the normal cumulative distribution function. Income is defined as the sum of net revenues from the different sources in equation (1). Z_i is a vector of taxpayer characteristics, which includes all those in X_i but also some extra variable(s) expected to affect the yes/no decision, but not the amount donated (“exclusion restriction”). For this, a wealth dummy is used, which indicates whether there are capital gains in the return. This follows from the rationale that status considerations associated with wealth influence the “donating or not” decision but should not affect the quantities donated, once controlling for income[15].

After estimating the probit equation, the inverse Mills ratio (λ) can be calculated, which accounts for the probability of selection of each observation:

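(3) $\lambda_i = \dfrac{\phi\left(\hat{\alpha} + \hat{\beta} \ln Income_i + \hat{\gamma} Z_i\right)}{\Phi\left(\hat{\alpha} + \hat{\beta} \ln Income_i + \hat{\gamma} Z_i\right)}$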
where φ and Φ are the normal density function and normal cumulative function of the predicted values in the probit estimation. This new variable λ is included in the final regression, to correct the selection bias (here, we only use the observations where s = 1):
(4) $\ln don_i = \alpha + \beta \ln\left(L_i + \sum_{j=1}^{5} k_j Y_{j,i} + k_6 N_i\right) + \gamma X_i + \lambda_i + u_i$
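As an illustration of the intermediate step, the inverse Mills ratio of equation (3) can be computed from a fitted probit index using only the Python standard library. This is a sketch under stated assumptions: the helper names and the numerical coefficients below are hypothetical, and the probit itself would have to be estimated beforehand with a statistical package.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def probit_index(alpha, beta, gamma, ln_income, z):
    """Linear index alpha + beta * ln(Income_i) + gamma * Z_i of the probit."""
    return alpha + beta * ln_income + sum(g * v for g, v in zip(gamma, z))


def inverse_mills_ratio(index):
    """Normal density over normal cumulative function, evaluated at the index."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)


# Hypothetical fitted coefficients and one taxpayer's characteristics.
idx = probit_index(alpha=-6.0, beta=0.5, gamma=[0.3], ln_income=10.0, z=[1.0])
print(round(inverse_mills_ratio(idx), 4))
```

The ratio falls as the predicted probability of donating rises, so observations that were unlikely to be selected receive the largest correction in the second stage.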

3.2.3 Differences across income levels.

Using equation (4), we obtain a baseline result for each income type. However, there can be reasonable doubt that under-reporting is constant across income levels, varying only for different sources of revenue. This is tested by introducing an interaction term with a dummy, which takes value 1 for observations in the top decile of total income. The equation now looks like:

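(5) $\ln don_i = \alpha + \beta \ln\left(L_i + \sum_{j=1}^{5}\left(k_j Y_{j,i} + t_j Y_{j,i} \times top10_i\right) + k_6 N_i\right) + \gamma X_i + \lambda_i + u_i$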
with X incorporating the dummy top10 among the other controls. As before, k indicates fraud if it is significantly higher than 1, and similarly, for the top decile, the relevant test is k + t > 1. Because the estimation of different rates of evasion could be affected if there were significant differences in the income elasticity of donations, an interaction of income with the top10 dummy is also added[16].
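To show how the interaction enters, the following sketch evaluates the income term of equation (5) for a single taxpayer. It assumes that the sum over income sources sits inside the logarithm, as the reading of k as an inverse compliance ratio suggests; the function name and all numbers are hypothetical.

```python
import math


def log_adjusted_income(labour, other, k, t, negative, k6, top10):
    """ln(L + sum_j (k_j + t_j * top10) * Y_j + k6 * N) for one taxpayer.

    other, k and t are equal-length sequences over the non-labour sources;
    top10 is 1 for taxpayers in the top decile of total income, else 0.
    negative is entered as it appears in the data (sign convention assumed).
    """
    total = labour + k6 * negative
    for k_j, t_j, y_j in zip(k, t, other):
        total += (k_j + t_j * top10) * y_j
    return math.log(total)


# Hypothetical case: 20,000 of labour income and 1,000 of movable capital.
other = [1000.0, 0.0, 0.0, 0.0, 0.0]
k = [2.0, 1.0, 1.0, 1.0, 1.0]
t = [0.5, 0.0, 0.0, 0.0, 0.0]
for top10 in (0, 1):
    value = log_adjusted_income(20000.0, other, k, t, 0.0, 1.0, top10)
    print(top10, round(math.exp(value), 2))
```

For a top-decile taxpayer each unit of reported movable capital is weighted by k + t rather than k, which is why the relevant test for that group is k + t > 1.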

4. Results

4.1 Income concealment by source

Table II presents the estimates. The first column in each year contains the baseline results [Equation (4)], while the second shows the preferred specification, which allows for differences by income level in both the income elasticity of donations and under-reporting [Equation (5)]. We report the coefficients for income, lambda and the variables of interest k and t. As can be seen, lambda is generally found to be significant, which points towards a risk of obtaining biased estimates if we had not accounted for selection[17]. The income elasticity of donations is in the 0.4-0.6 range, and estimates for the top decile show it might decline for higher incomes, but without attaining statistical significance[18].

The k coefficients indicate the presence of under-reporting when they are significantly higher than one. The main interest here lies in movable capital, fixed capital and self-employment, as other incomes are quantitatively much less important. The models point towards significant under-reporting in both types of capital incomes, but not for self-employment except in 2001. Within these, a distinction is made according to the assessment procedure: “direct” corresponds to accountancy-based estimation, while “standard” (presumptive) assessments are used under a certain threshold. This, however, does not lead to significant results for the presumptive categories either (again, except for 2001 in the top decile).

Capital incomes are generally more under-reported at the top decile than within the bottom 90 per cent (which can be seen from the positive coefficients on the interaction terms with income types). For fixed capital, once controlling for income levels, no under-reporting at the bottom can be discerned, except in 2003. There is some evidence of top decile taxpayers under-reporting more within self-employment, but only in 2001. On the other hand, “other incomes” appear more accurately reported at the top in 2001-2002 (where evasion is only detected for the bottom 90 per cent). This is an unexpected result, which could indicate that capital gains, as an irregular income generated over several years, do not have the same kind of relation with donations as other incomes. For this reason, a Feldman–Slemrod-like procedure to estimate tax evasion might not be accurate for capital gains.

Table III displays the compliance ratios estimated and applied for each source of income, following the results from Table II. The incomes of the bottom 90 per cent are adjusted using the k coefficients, and those of the top decile using k + t (in both cases, where this result is significantly higher than 1). Compliance levels, when found to be significantly different from that of labour, lie generally between 50 and 80 per cent. They are often lower at the top for each income source, except, as mentioned earlier, in “other incomes”. In any case, the differences found between income sources, consistent with the initial hypothesis, motivate further calculations about the distributive implications of fraud in Section 4.2. For several income source-level-year combinations, no statistically significant results were obtained, and thus, no compliance ratio is reported in Table III. It should be noted, however, that some of these coefficients in Table II are clearly over 1 (e.g. movable capital incomes and “other incomes” for the bottom 90 per cent in 2004), but do not attain statistical significance because of the high standard errors in the estimation. The correction of incomes performed in the next subsection should, therefore, be considered a conservative one.
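The adjustment rule just described can be sketched as follows. The compliance ratios and incomes in the example are purely illustrative and are not the values of Table III; the dictionary layout and the function name `real_income` are assumptions made for the illustration.

```python
def real_income(reported, ratios):
    """Scale each reported income source up by 1 / compliance ratio.

    Sources without an entry in ratios (no significant under-reporting
    found) are left at their reported value, which keeps the correction
    conservative.
    """
    return {source: amount / ratios.get(source, 1.0)
            for source, amount in reported.items()}


# Hypothetical compliance ratios by income group (not taken from Table III).
ratios_by_group = {
    "bottom90": {"movable_capital": 0.5},
    "top10": {"movable_capital": 0.4, "fixed_capital": 0.8},
}

reported = {"labour": 30000.0, "movable_capital": 2000.0, "fixed_capital": 1000.0}
for group, ratios in ratios_by_group.items():
    adjusted = real_income(reported, ratios)
    print(group, round(sum(adjusted.values()), 2))
```

Dividing by the compliance ratio is the same operation as multiplying by k (or k + t for the top decile), given that labour is taken as the fully reported reference.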

If we compare the obtained compliance ratios from Table III with available estimates for the early 1980s (Torregrosa-Hetland, 2015), we can conclude that income reporting became significantly more accurate, even if it still remained low for movable capital in the 2000s. On the other hand, the fact that the behaviour of the self-employed in many instances cannot be statistically distinguished from total compliance might be surprising for readers familiar with the Spanish context. Possible explanations are the relatively low number of observations in standard assessment, and that the compliance level in these categories could be similar to that of labour, which is always the reference in our calculations[19].

For 2008, Domínguez-Barrero et al. (2015) calculated a compliance ratio of 60 per cent in movable capital, 70 per cent in fixed capital and 65 per cent in self-employment under direct estimation (78 per cent in presumptive assessment). This would indicate significant stability in capital incomes during the 2000s, and maybe some deterioration in self-employment – but this last observation could be due to methodological differences.

Turning to international comparison, several studies provide estimates of evasion in self-employment incomes fluctuating around 30 per cent. Similarly high rates of under-reporting have been found, for example, among small informal business suppliers in the USA (Black et al., 2012), in self-reported incomes in Denmark (Kleven et al., 2011) or in self-employment incomes in Italy (Albarea et al., 2015). The 88 per cent compliance rate obtained here for 2001 (baseline equation) is situated in the upper range of available estimates, signalling again that it is an upper bound (the results of these other studies are not relative to labour’s compliance)[20].

Concerning capital incomes, the Swedish National Tax Agency (2008) provides a figure of 39 per cent of evasion in tax due from capital incomes (the majority of which would be international fraud). A related estimate is Zucman (2014)’s calculation that around 10 per cent of European financial wealth was held offshore in 2013 – of which 80 per cent would go unreported. This paper’s result of fraud increasing with income is consistent with the recent insights of Alstadsæter et al. (2019), who find that evasion of income and wealth taxes, in terms of tax due, increases with wealth. Some improved information and control over these incomes have potentially been counteracted by capital mobility, avoidance schemes and the development of tax privileges, which can be seen as a white-collar substitute for outright evasion.

4.2 Impact on inequality

Evasion was especially high in capital incomes, which tend to be concentrated at the top of the distribution, and in several cases it was also higher at the top decile within an income type. As a consequence, estimates of global compliance are decreasing with respect to total income. For example, in 2001 those in deciles 1-5 reported 97 per cent of their revenues, while the ratio decreased to 89 per cent in the top decile, and 80 per cent for the top 1 per cent.

This distributional pattern naturally has an impact on progressivity and inequality, which Table IV illustrates. Scenario 1, “original”, is the combination of reported incomes and actual tax payments. This is how the tax seems to be working, according to the official statistics – but in the presence of fraud, these indicators are a miscalculation of real progressivity. In Scenarios 2 and 3, “real” incomes are used, obtained factoring up the reported revenues of each type with the results of Table II. The “corrected” Column 2 represents the effective behaviour of the tax, with “real” incomes in combination with actual tax payments from the original data (which derive from reporting decisions). Finally, in a “no fraud” scenario, the distribution of the tax burden, and thus, the reduction in inequality would have been different (Column 3). This is a simulation of the tax due that would correspond to each observation’s estimated real income, applying the tax schedule in force[21].
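The comparison between the “original” and “corrected” scenarios can be illustrated with a small sketch. It assumes, for illustration only, that redistribution is measured as the difference between the pre-tax and post-tax Gini indices (the text does not specify the index at this point), and it uses hypothetical toy data for four taxpayers rather than the microdata. The “no fraud” scenario is not reproduced, as it would require simulating the tax schedule in force.

```python
def gini(values):
    """Gini index of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def redistribution(pre_tax, taxes):
    """Pre-tax Gini minus post-tax Gini (assumed redistribution measure)."""
    post_tax = [y - tax for y, tax in zip(pre_tax, taxes)]
    return gini(pre_tax) - gini(post_tax)


# Hypothetical toy data, not taken from the paper.
reported = [10_000, 20_000, 30_000, 60_000]
real = [10_000, 20_500, 32_000, 75_000]   # reported incomes factored up
tax_paid = [500, 2_000, 4_500, 15_000]    # actual payments in both scenarios

original = redistribution(reported, tax_paid)   # Scenario 1
corrected = redistribution(real, tax_paid)      # Scenario 2
print(f"original:  {original:.4f}")
print(f"corrected: {corrected:.4f}")
```

In this toy example the corrected figure comes out below the original one, because the same payments are set against real incomes that are more concentrated at the top.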

The initial expectations are confirmed by this exercise. Inequality was higher than it appeared in the official data, both before and after tax: for example, 4 per cent higher for post-tax inequality in 2003-2004 according to the Gini index. Moreover, fraud reduced the revenue capacity of the tax: the average effective tax rate was between 2 and 7 per cent lower (the difference between the columns “original” and “corrected”). Redistribution estimates are the most affected: the corresponding index would be 16 per cent lower than apparent in 2001, between 6 and 7 per cent in the following years. This shows a clearly regressive impact of evasion, and, therefore, a biased depiction of the tax, and of income inequality, in the official statistics.

Without evasion, the personal income tax would have behaved in a notably different way. The “no fraud” column shows that, as expected from the progressive rate schedule, the taxation of high incomes would have been much more intense and the redistributive effect much stronger. In 2001 redistribution would have been 33 per cent higher (i.e. 1.5 Gini points) if income sheltering were eradicated.[22] Similarly, in 2001 the original data depicted the top 1 per cent of taxpayers as contributing slightly over a third of their income (34 per cent). Taking into account hidden incomes, their payments would actually correspond to a considerably lower 27 per cent […] but the schedule was designed to impose on their real incomes a much higher burden of 37 per cent! Results vary in magnitude across the years but show the same kind of differences between scenarios.
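The percentages quoted in this discussion are the “Diff. (%)” columns of Table IV. As a quick check, the sketch below recomputes them from the 2001 figures in that table; the helper name `pct_diff` is only illustrative.

```python
# 2001 values from Table IV: (original, corrected, no fraud).
TABLE_IV_2001 = {
    "redistribution": (5.20, 4.39, 5.86),
    "tax_rate_top1": (33.97, 27.17, 36.70),
}


def pct_diff(new, base):
    """Relative difference of `new` with respect to `base`, in per cent."""
    return 100 * (new - base) / base


for name, (original, corrected, no_fraud) in TABLE_IV_2001.items():
    bias = pct_diff(corrected, original)   # column (2 - 1)/(1)
    gain = pct_diff(no_fraud, corrected)   # column (3 - 2)/(2)
    print(f"{name}: corrected vs original {bias:+.0f}%, no fraud vs corrected {gain:+.0f}%")

# Gap in Gini points between the "no fraud" and "corrected" redistribution indices.
gini_points = TABLE_IV_2001["redistribution"][2] - TABLE_IV_2001["redistribution"][1]
print(f"extra redistribution without fraud: {gini_points:.2f} Gini points")
```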

What is the impact of correcting for tax evasion on the top income shares? Table V shows estimates for the top 10 and top 1 per cent based on our upward adjustment of incomes. They are not precise representations of the top income shares but illustrate the probable order of magnitude of the issue. We compare with the existing long-run series of income concentration in Spain provided by Alvaredo and Saez (2009). Fraud-adjusted data show higher inequality at the top: the share of the top 1 per cent increases by an average of 2 percentage points, a lot in relative terms, and appears clearly above 10 per cent of total income. For the top decile, shares increase by around 3.6 percentage points and lie between 36 and 40 per cent.
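The average adjustments quoted above follow directly from the yearly figures in Table V, as this short sketch shows (values copied from that table; variable names are illustrative).

```python
# Top income shares from Table V, in per cent of total income: (top 10, top 1).
reported = {2001: (34.9, 9.8), 2002: (34.2, 9.5), 2003: (34.5, 10.0), 2004: (34.4, 10.2)}
adjusted = {2001: (40.4, 12.9), 2002: (36.9, 10.9), 2003: (36.2, 10.8), 2004: (38.9, 12.9)}

years = sorted(reported)
# Average increase in percentage points across the four years.
avg_top10 = sum(adjusted[y][0] - reported[y][0] for y in years) / len(years)
avg_top1 = sum(adjusted[y][1] - reported[y][1] for y in years) / len(years)

print(f"top 10%: +{avg_top10:.1f} points on average")
print(f"top 1%:  +{avg_top1:.1f} points on average")
```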

If tax evasion behaves regressively in other countries as well (as suggested by Alstadsæter et al. (2019) for Scandinavia), it remains to be seen how these adjustments would affect international comparisons of inequality. Appendix 3 shows graphs with both the original and adjusted series of Spanish top income concentration, together with those of other Western countries, which are based on reported income tax data. As made evident by the case of Spain, adjustments for fraud could affect the ranking and grouping of countries, as well as our account of the evolution of top income concentration through time. We have an interesting agenda ahead.

5. Final discussion

Tax evasion is a popular topic in public debate today. Folk wisdom has it that it is pervasive and unequally distributed, concentrated among the rich and the self-employed. Its existence would render tax systems unfair and less able to reduce inequality, and there are many calls to fight it, not least in the aftermath of several recent offshore scandals. The economic literature is increasingly responding to this interest.

This paper analyses the Spanish personal income tax in the early 2000s, with a focus on differences across income sources and levels. The estimate exploits the relation between reported charitable donations and the composition of income in the tax microdata. The underlying assumption is that donations should not be affected by the origin of incomes once income levels are controlled for, an idea developed by Feldman and Slemrod (2007) and applied to Spain for 2008 by Domínguez-Barrero et al. (2015), as well as to several other countries in various studies. Nonetheless, two methodological innovations are introduced here. The first is a correction for sample selection using Heckman’s two-step estimation, because returns with charitable donations in Spain are a small and distinct part of the total. The second is an interaction term to test whether under-reporting varies across income levels. Both suggestions may be useful for further work in other contexts.

The equations yield different results for incomes from different sources. Taking labour as fully compliant, movable capital incomes would be reported at 50-70 per cent of their real value throughout these years, with the lowest estimates corresponding to the top decile. Revenues from fixed capital display similarly low reporting rates only for the top 10 per cent (except in 2003, when this also holds for the rest of the distribution). Tax evasion in self-employment incomes is disentangled only for 2001, where it stands at 20-50 per cent.

Because of the differences between revenue sources and the varying composition of total income across the social ladder, negative impacts of evasion on progressivity, redistribution and inequality were expected. This is confirmed by the calculations, which show a bias of 6-16 per cent for the redistribution index in the official statistics. A tax with no fraud would have reduced inequality further, with about 0.5-1.5 more points of decrease in the Gini index (Table IV), and higher effective tax rates on affluent taxpayers (for example, 31 instead of 26 per cent for the top 1 per cent in 2004). Top income concentration estimates are also affected, especially (in relative terms) for the top 1 per cent. The results thus suggest that it is necessary to take into account the impacts of fraud if we want to gauge real inequality using tax data.


Figure A1. Top income concentration trends across countries, 1960-2014

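Table I. Composition of the sample regarding donations

| Year | S (1 = return with donations) | Freq | (%) | Mean income |
|---|---|---|---|---|
| 2001 | 0 | 313,059 | 85.8 | 21,034 |
| 2001 | 1 | 51,908 | 14.2 | 36,757 |
| 2001 | Total | 364,967 | 100.0 | 22,562 |
| 2002 | 0 | 330,370 | 87.5 | 21,477 |
| 2002 | 1 | 47,189 | 12.5 | 36,493 |
| 2002 | Total | 377,559 | 100.0 | 22,711 |
| 2003 | 0 | 291,146 | 85.8 | 19,586 |
| 2003 | 1 | 48,259 | 14.2 | 36,836 |
| 2003 | Total | 339,405 | 100.0 | 21,137 |
| 2004 | 0 | 308,704 | 84.0 | 20,693 |
| 2004 | 1 | 58,790 | 16.0 | 39,501 |
| 2004 | Total | 367,494 | 100.0 | 22,722 |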

Only observations with positive total income. Income is in nominal euros and refers to the sum of net revenues from all sources

Source: Author’s calculations on IEF tax return microdata

Table II. Regression results

| Variable | 2001 baseline | 2001 top10 | 2002 baseline | 2002 top10 | 2003 baseline | 2003 top10 | 2004 baseline | 2004 top10 |
|---|---|---|---|---|---|---|---|---|
| Dependent variable | ln(don) | ln(don) | ln(don) | ln(don) | ln(don) | ln(don) | ln(don) | ln(don) |
| Income | 0.592*** (0.0377) | 0.643*** (0.0464) | 0.595*** (0.0318) | 0.609*** (0.0427) | 0.509*** (0.0562) | 0.571*** (0.0732) | 0.420*** (0.0523) | 0.431*** (0.0671) |
| Income × top10 | | −0.0259 (0.0252) | | −0.00261 (0.0266) | | −0.0444 (0.0288) | | −0.0135 (0.0257) |
| Movable capital | 1.901*** (0.144) | 1.638*** (0.151) | 1.474*** (0.109) | 1.347*** (0.142) | 1.463*** (0.146) | 0.992 (0.184) | 1.748*** (0.201) | 1.407 (0.256) |
| Movable capital × top10 | | 0.469*** (0.234) | | 0.205*** (0.198) | | 0.604*** (0.254) | | 0.474*** (0.336) |
| Fixed capital | 1.351*** (0.100) | 1.039 (0.0887) | 1.264*** (0.0944) | 1.113 (0.0991) | 1.577*** (0.166) | 1.326* (0.167) | 1.315** (0.148) | 1.051 (0.154) |
| Fixed capital × top10 | | 0.937*** (0.230) | | 0.417*** (0.197) | | 0.381*** (0.263) | | 0.517** (0.284) |
| Self-empl. direct | 1.142*** (0.0436) | 1.042 (0.0519) | 1.008 (0.0386) | 1.027 (0.0545) | 0.921 (0.0487) | 1.092 (0.0892) | 0.912 (0.0502) | 1.146 (0.0997) |
| Self-empl. direct × top10 | | 0.184*** (0.0809) | | −0.0197 (0.0753) | | −0.190 (0.107) | | −0.298 (0.122) |
| Self-empl. standard | 1.098 (0.0859) | 0.928 (0.0753) | 0.926 (0.0742) | 0.829 (0.0760) | 0.654 (0.0721) | 0.666 (0.0743) | 0.648 (0.0765) | 0.632 (0.0848) |
| Self-empl. standard × top10 | | 0.960*** (0.310) | | 0.409 (0.208) | | 0.122 (0.172) | | 0.168 (0.210) |
| Other incomes | 1.158 (0.0992) | 1.726*** (0.207) | 0.943 (0.0881) | 1.375* (0.220) | 0.986 (0.0804) | 1.125 (0.153) | 0.995 (0.0932) | 1.228 (0.173) |
| Other incomes × top10 | | −0.652 (0.232) | | −0.501 (0.236) | | −0.157 (0.177) | | −0.324 (0.205) |
| Negative incomes | −0.327*** (0.502) | −0.0140** (0.417) | −0.794*** (0.551) | −0.614*** (0.532) | 0.120* (0.503) | 0.225 (0.432) | 0.624 (0.453) | 0.762 (0.397) |
| Lambda | 0.210* (0.122) | 0.329** (0.128) | 0.265** (0.117) | 0.308** (0.124) | −0.0565 (0.165) | 0.0286 (0.182) | −0.335** (0.151) | −0.339** (0.168) |
| Observations | 51,907 | 51,907 | 47,189 | 47,189 | 48,259 | 48,259 | 58,787 | 58,787 |
| R2 | 0.160 | 0.161 | 0.171 | 0.171 | 0.176 | 0.176 | 0.164 | 0.164 |

Standard errors in parentheses. ***p < 0.01; **p < 0.05; *p < 0.1. The null hypotheses are k <= 1, k + t <= 1 for t, β = 0 for the rest of variables (k >= 1 for negative incomes). Results correspond to running equations (4) and (5). The estimation has been run using Stata (nl). For the list of control variables, see the text

Source: Author’s calculations with IEF tax microdata
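The “Lambda” row is the coefficient on the selection-correction term of the Heckman two-step procedure. A minimal sketch of how such a term is usually built, assuming the standard textbook form (the inverse Mills ratio evaluated at the first-stage probit index): the index values below are hypothetical, and the paper's own first-stage specification is not reproduced here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def inverse_mills(index):
    """Inverse Mills ratio phi(z) / Phi(z) for an observation selected into the sample."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)


# HYPOTHETICAL first-stage probit indices for three taxpayers who report donations.
for z in (-1.0, 0.0, 1.0):
    print(f"probit index {z:+.1f} -> correction term {inverse_mills(z):.3f}")
```

Taxpayers with a low predicted probability of donating receive a larger correction term. The term then enters the donations equation as an extra regressor, and its coefficient is what the table reports as Lambda.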

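Table III. Estimated compliance ratios for different sources and levels of income

| Income type | 2001 bottom90 | 2001 top10 | 2002 bottom90 | 2002 top10 | 2003 bottom90 | 2003 top10 | 2004 bottom90 | 2004 top10 |
|---|---|---|---|---|---|---|---|---|
| Labour | (100%) | (100%) | (100%) | (100%) | (100%) | (100%) | (100%) | (100%) |
| Movable capital | 61% | 47% | 74% | 64% | | 63% | | 53% |
| Fixed capital | | 51% | | 65% | 75% | 59% | | 64% |
| Self-employment dir | | 82% | | | | | | |
| Self-employment std | | 53% | | | | | | |
| Other incomes | 58% | | 73% | | | | | |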

The compliance ratio is 1/k for the bottom 90 per cent when k > 1, 1/(k + t) for the top decile when (k + t) > 1. For example, compliance of movable capital incomes in 2001 is estimated as 1/1.638 = 0.61 for the bottom90, and 1/(1.638 + 0.469) = 0.47 for the top10. In this case, both conditions k > 1 and (k + t) > 1 hold. Self-employment activities are separated according to the assessment procedure: accountancy-based (“direct”) or presumptive (“standard”). The impact of non-filing is not included here

Source: Author’s calculations with coefficients from Table II, columns “top10”
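The rule in the table note can be written out as a small function. This is a sketch under one assumption: a ratio is computed only where the stars in Table II indicate that the relevant null (k <= 1, or k + t <= 1) is rejected, which is my reading of the two table notes. The coefficients are the 2001 “top10” estimates from Table II.

```python
def compliance_ratio(multiplier, null_rejected):
    """Return 1 / multiplier when under-reporting is detected, otherwise None."""
    if null_rejected and multiplier > 1:
        return 1 / multiplier
    return None


# 2001 "top10" model: (k, k significant?, t, k + t significant?), read from Table II.
COEFFS_2001 = {
    "movable capital": (1.638, True, 0.469, True),
    "fixed capital": (1.039, False, 0.937, True),
}

for source, (k, k_sig, t, kt_sig) in COEFFS_2001.items():
    bottom90 = compliance_ratio(k, k_sig)
    top10 = compliance_ratio(k + t, kt_sig)
    fmt = lambda r: "not estimated" if r is None else f"{r:.0%}"
    print(f"{source}: bottom90 {fmt(bottom90)}, top10 {fmt(top10)}")
```

For movable capital this reproduces the 61 and 47 per cent of the worked example in the note, and for fixed capital it yields a ratio only for the top decile.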

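Table IV. Impact of fraud on tax progressivity and inequality

| Year | Indicator | Original (1) | Corrected (2) | Diff. (%) (2 − 1)/(1) | No fraud (3) | Diff. (%) (3 − 2)/(2) |
|---|---|---|---|---|---|---|
| 2001 | Pre-tax Gini | 37.50 | 39.60 | 6 | 39.60 | 0 |
| 2001 | Post-tax Gini | 32.30 | 35.21 | 9 | 33.74 | −4 |
| 2001 | Average tax rate | 15.79 | 14.76 | −7 | 17.25 | 17 |
| 2001 | Redistribution | 5.20 | 4.39 | −16 | 5.86 | 33 |
| 2001 | Progressivity | 28.25 | 25.90 | −8 | 28.53 | 10 |
| 2001 | Tax rate top 10% | 23.84 | 21.18 | −11 | 25.76 | 22 |
| 2001 | Tax rate top 1% | 33.97 | 27.17 | −20 | 36.70 | 35 |
| 2002 | Pre-tax Gini | 36.88 | 37.60 | 2 | 37.60 | 0 |
| 2002 | Post-tax Gini | 31.72 | 32.75 | 3 | 32.20 | −2 |
| 2002 | Average tax rate | 15.86 | 15.43 | −3 | 16.43 | 6 |
| 2002 | Redistribution | 5.16 | 4.85 | −6 | 5.40 | 11 |
| 2002 | Progressivity | 27.88 | 27.09 | −3 | 27.93 | 3 |
| 2002 | Tax rate top 10% | 24.10 | 23.02 | −4 | 24.85 | 8 |
| 2002 | Tax rate top 1% | 34.12 | 31.34 | −8 | 35.18 | 12 |
| 2003 | Pre-tax Gini | 38.80 | 39.72 | 2 | 39.72 | 0 |
| 2003 | Post-tax Gini | 33.88 | 35.07 | 4 | 34.55 | −1 |
| 2003 | Average tax rate | 13.74 | 13.43 | −2 | 14.36 | 7 |
| 2003 | Redistribution | 4.93 | 4.64 | −6 | 5.17 | 11 |
| 2003 | Progressivity | 31.45 | 30.48 | −3 | 31.35 | 3 |
| 2003 | Tax rate top 10% | 21.96 | 21.02 | −4 | 22.75 | 8 |
| 2003 | Tax rate top 1% | 30.86 | 28.15 | −9 | 32.09 | 14 |
| 2004 | Pre-tax Gini | 40.82 | 42.04 | 3 | 42.04 | 0 |
| 2004 | Post-tax Gini | 35.91 | 37.46 | 4 | 36.82 | −2 |
| 2004 | Average tax rate | 14.33 | 14.01 | −2 | 15.01 | 7 |
| 2004 | Redistribution | 4.91 | 4.58 | −7 | 5.23 | 14 |
| 2004 | Progressivity | 29.87 | 28.61 | −4 | 30.05 | 5 |
| 2004 | Tax rate top 10% | 22.45 | 21.39 | −5 | 23.28 | 9 |
| 2004 | Tax rate top 1% | 29.70 | 26.42 | −11 | 31.05 | 18 |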

In all cases, the sum of net revenues from all sources is the income reference. The “original” scenario is the estimate readily obtained from the data, affected by under-reporting. “Corrected” shows the real behaviour of the tax, if evasion was distributed as obtained, while the “no fraud” scenario shows how the tax would have been distributed under full compliance. The redistribution indicator is the Reynolds–Smolensky index, corresponding to the difference between the Gini of pre-tax and post-tax incomes. The progressivity indicator is the Kakwani index: the difference between the pre-tax Gini and the concentration of tax payments. The tax rates for the top 10 and 1% refer to the distribution of corrected incomes, and they are calculated as tax due over total (real) income (then averaged for the relevant group). To improve readability, all indices have been multiplied by 100. The existence of special treatment for long-term capital gains (“base especial”) has been included in the calculations. This “base especial” had a uniform tax rate of 18% in 2001-2002 and of 15% in 2003-2004

Source: Author’s calculations, using Stata “progress” module
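The two indices defined in the note can be illustrated on toy data. The sketch below is not the Stata “progress” module used for the table: it is a plain Python rendering of the same definitions, with five hypothetical taxpayers, and the function names are my own.

```python
def concentration(values, rank_by):
    """Concentration coefficient of `values` with units ordered by `rank_by`.

    When `rank_by` is `values` itself, this is the Gini coefficient.
    """
    order = sorted(range(len(values)), key=lambda i: rank_by[i])
    n = len(values)
    weighted = sum((position + 1) * values[i] for position, i in enumerate(order))
    return 2 * weighted / (n * sum(values)) - (n + 1) / n


def gini(values):
    return concentration(values, values)


# HYPOTHETICAL pre-tax incomes and tax payments for five taxpayers.
pre_tax = [10_000, 20_000, 30_000, 60_000, 120_000]
tax = [500, 2_000, 4_500, 15_000, 42_000]
post_tax = [income - paid for income, paid in zip(pre_tax, tax)]

# Redistribution (Reynolds-Smolensky): pre-tax Gini minus post-tax Gini.
reynolds_smolensky = 100 * (gini(pre_tax) - gini(post_tax))
# Progressivity (Kakwani): concentration of tax payments minus pre-tax Gini.
kakwani = 100 * (concentration(tax, pre_tax) - gini(pre_tax))

print(f"Reynolds-Smolensky: {reynolds_smolensky:.2f}")
print(f"Kakwani:            {kakwani:.2f}")
```

As in the table, both indices are multiplied by 100. Tax payments are ranked by pre-tax income when computing their concentration, the usual convention for the Kakwani index.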

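Table V. The impact of evasion on top income shares

| Year | Reported data: Top 10 (%) | Reported data: Top 1 (%) | Fraud-adjusted data: Top 10 (%) | Fraud-adjusted data: Top 1 (%) |
|---|---|---|---|---|
| 2001 | 34.9 | 9.8 | 40.4 | 12.9 |
| 2002 | 34.2 | 9.5 | 36.9 | 10.9 |
| 2003 | 34.5 | 10.0 | 36.2 | 10.8 |
| 2004 | 34.4 | 10.2 | 38.9 | 12.9 |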

The top 10 and 1% here refer to population over 20, following Alvaredo and Saez (2009)

Source: Reported data from Alvaredo and Saez (2009), fraud-adjusted data from the author’s calculations

Table AI. Testing the exclusion restriction

| Test | 2001 | 2002 | 2003 | 2004 |
|---|---|---|---|---|
| P > z in probit regression (s = 1) | 0.000 | 0.000 | 0.000 | 0.000 |
| P > t if included in donations equation | 0.062 | 0.701 | 0.367 | 0.885 |
| P > t in regression of residuals from donations equation | 0.400 | 0.870 | 0.556 | 0.921 |

Source: Author’s calculations with IEF microdata

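Table AII. Impact of fraud on tax progressivity and inequality

| Year | Indicator | Original (1) | Corrected (2) | Diff (%) (2 − 1)/(1) | No fraud (3) | Diff (%) (3 − 2)/(2) |
|---|---|---|---|---|---|---|
| 2001 | Pre-tax Gini | 37.50 | 39.60 | 6 | 39.60 | 0 |
| 2001 | Post-tax Gini | 32.30 | 35.70 | 11 | 33.65 | −6 |
| 2001 | Average tax rate | 15.79 | 13.28 | −16 | 18.65 | 40 |
| 2001 | Redistribution | 5.20 | 3.89 | −25 | 5.94 | 53 |
| 2001 | Progressivity | 28.25 | 25.90 | −8 | 26.30 | 2 |
| 2001 | Tax rate top 10% | 23.84 | 19.06 | −20 | 27.37 | 44 |
| 2001 | Tax rate top 1% | 33.97 | 24.44 | −28 | 37.51 | 53 |
| 2002 | Pre-tax Gini | 36.88 | 37.59 | 2 | 37.59 | 0 |
| 2002 | Post-tax Gini | 31.72 | 33.30 | 5 | 32.09 | −4 |
| 2002 | Average tax rate | 15.86 | 13.88 | −12 | 17.88 | 29 |
| 2002 | Redistribution | 5.16 | 4.29 | −17 | 5.50 | 28 |
| 2002 | Progressivity | 27.88 | 27.09 | −3 | 25.67 | −5 |
| 2002 | Tax rate top 10% | 24.10 | 20.71 | −14 | 26.51 | 28 |
| 2002 | Tax rate top 1% | 34.10 | 28.18 | −17 | 36.12 | 28 |
| 2003 | Pre-tax Gini | 38.80 | 39.71 | 2 | 39.71 | 0 |
| 2003 | Post-tax Gini | 33.88 | 35.59 | 5 | 34.44 | −3 |
| 2003 | Average tax rate | 13.74 | 12.08 | −12 | 15.79 | 31 |
| 2003 | Redistribution | 4.93 | 4.12 | −16 | 5.27 | 28 |
| 2003 | Progressivity | 31.45 | 30.48 | −3 | 28.53 | −6 |
| 2003 | Tax rate top 10% | 21.96 | 18.92 | −14 | 24.39 | 29 |
| 2003 | Tax rate top 1% | 30.85 | 25.33 | −18 | 32.98 | 30 |
| 2004 | Pre-tax Gini | 40.82 | 42.04 | 3 | 42.04 | 0 |
| 2004 | Post-tax Gini | 35.91 | 37.97 | 6 | 36.75 | −3 |
| 2004 | Average tax rate | 14.33 | 12.61 | −12 | 16.40 | 30 |
| 2004 | Redistribution | 4.91 | 4.07 | −17 | 5.29 | 30 |
| 2004 | Progressivity | 29.87 | 28.61 | −4 | 27.35 | −4 |
| 2004 | Tax rate top 10% | 22.43 | 19.25 | −14 | 24.86 | 29 |
| 2004 | Tax rate top 1% | 29.69 | 23.77 | −20 | 31.78 | 34 |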

In all cases, the sum of net revenues from all sources is the income reference. The “original” scenario is the estimate readily obtained from the data, affected by under-reporting. “Corrected” shows the real behaviour of the tax, if evasion was distributed as obtained, plus including a 10% evasion in labour incomes (and thus, adjusting all other evasion rates upwards), while the “no fraud” scenario shows how the tax would have been distributed under full compliance. The redistribution indicator is the Reynolds–Smolensky index, corresponding to the difference between the Gini of pre-tax and post-tax incomes. The progressivity indicator is the Kakwani index: the difference between the pre-tax Gini and the concentration of tax payments. The tax rates for the top 10 and 1% refer to the distribution of corrected incomes, and they are calculated as tax due over total (real) income (then averaged for the relevant group). To improve readability, all indices have been multiplied by 100. The existence of special treatment for long-term capital gains (“base especial”) has been included in the calculations. This “base especial” had a uniform tax rate of 18% in 2001-2002 and of 15% in 2003-2004

Source: Author’s calculations, using Stata “progress” module



OECD tax statistics, 1965-2014.


Alvaredo and Saez (2009, p. 4) discuss fraud in some detail, and conclude that “income tax evasion in Spain before 1980 was much less prevalent than previously thought at the top of the distribution”. This remark refers to the period of the old income tax (Contribución General sobre la Renta), which only affected very wealthy taxpayers. This paper looks at the modern income tax introduced in 1979, a much more widespread contribution.


With respect to tax morale, how it may vary by income level is uncertain. It is not used as an argument for the distribution of evasion in the paper. See a discussion e.g. in Lago-Peñas and Lago-Peñas (2010). Other explanations have been put forth in the literature as to why high-income individuals would evade more; see e.g. Alstadsæter et al. (2019).


Regarding Spain, see Torregrosa-Hetland (2015) for an overview of previous estimates of income tax evasion.


The data comes from the Panel de Declarantes de IRPF 1999-2008. These observations correspond to income tax filers. The IEF also provides data on non-filers (withheld incomes), but these are not used here since, by definition, they do not have reported donations, and thus, cannot be included in the estimation. All results are thus confined to the population of tax filers. Throughout the paper, the income year is used as a reference: i.e. data from 2001 are incomes earned that year, and reported in 2002.


Individuals who did not make a return at all are missing from our data. The behaviour of these non-filers is certainly a part of fraud in an aggregate sense, and was a problem of considerable magnitude in Spain at the beginning of the period, as in other economies at similar levels of development. See Torregrosa-Hetland (2015) for an approximation to its impact, obtained by a discrepancy analysis between tax data and national accounts.


Of course, legal under-assessment affects equity between taxpayers because different income sources represent disparate shares in each citizen’s total income. Examples of legal under-assessment arise in the imputation of incomes from housing (which actually excludes the owner-occupied home in our period) or in the privileged treatment of capital incomes and capital gains. Aggressive tax planning by high-income individuals, by delaying taxation, would have related effects (see the considerations in e.g. Rubolino and Waldenström, 2017).


For the specific derivation of the model, see Feldman and Slemrod (2007, p. 333 ff.).


The type of tax return variable reflects the fact that optional separate filing for couples was introduced in 1989. Gini indices (disposable income) are taken from Ayala et al. (2006), and correspond to the 17 autonomous communities in Spain. I also use dummies for five geographical macro-regions, defined in the following way: South (Andalucía, Murcia, Canary Islands, Ceuta and Melilla), East (Baleares, Cataluña and Valencia), Centre-North (Aragón, Castilla y León and Madrid), Centre-South (Castilla-La Mancha and Extremadura) and North (Asturias, Cantabria, Galicia and La Rioja). Education level is not available in the tax microdata.


The percentage of deduction was set at 20 per cent in 1994 and, for the years 2002-2004, it stood at 25 per cent. The tax credit was limited to 10-15 per cent of the taxable base.


Following Domínguez-Barrero et al. (2015), who applied this method to the Spanish income tax in 2008, an alternative estimation has been run for 2001 where pensions are taken as the only compliant income source. There is, however, no significant change in the coefficients, and the behaviour of wages cannot be statistically distinguished from that of pensions (similarly to the results in Domínguez-Barrero et al., 2017 for 2005-2007).


Over-reporting of donations would be a problem if it were related to the composition of an individual’s income. However, Feldman and Slemrod (2007) already argued that over-reporting donations would not be rational in combination with under-reporting income, as it could attract the attention of the tax administration. Consistent with this view, Fack and Landais (2016) found that, in France, wage earners and low-income taxpayers tended to over-report their donations to a greater extent, as they had less capacity to under-report incomes or abuse other deductions.


Regressions have been run using the average incomes of 2001-2004 to approximate this. However, the sample we can use for this estimate is a reduced one, as we need taxpayers who were present in the panel throughout the four years. Taxpayers might drop out of the panel because their incomes fall below the filing requirement, or because they move to another country or die. In practice, this means that for an estimate of average income using the four years 32 per cent of the initial observations are excluded. Inspection of this selected sample shows that it is significantly different from the original one in that it has higher incomes, higher donations and a lower percentage of self-employed individuals. The exercise has, therefore, not been considered a valid approach.


In principle, a Tobit model is another option to deal with the fact that many taxpayers did not make any donations. The condition for this strategy, however, is that the two decisions (to give or not to give, and what amount to donate) are essentially affected in the same direction by the same factors, which does not seem to hold here, as we obtain different signs for some variables in the two stages of our estimation. In any case, a Tobit model has also been run, as well as simple regressions with the whole sample à la Feldman and Slemrod (2007) (adding 1 to donations before logging). The application of a Heckman estimation to account for selection in charitable contributions is very novel in the literature; to the best of our knowledge it has only been explored previously in Borgloh (2008) with German data (but not for estimating tax evasion).


The wealth dummy has been tested for its relation to s, and lack of relation to donations and the error in equation (4). See Appendix 1.


Recent papers in the charitable donations literature have addressed such variation in income elasticities; e.g. Bakija and Heim (2011) and Adena (2014).


Additional estimations for models without control for sample selection, using both ln(don) and ln(don+1) as dependent variables, are available upon request.


This lack of significant differences in income elasticity by income level is consistent with the findings of Bakija and Heim (2011) for the USA, although it is not the focus of their paper.


Labour incomes are taken as the most accurately reported, but as said before their compliance is not necessarily 100 per cent, and could change during the period. As rates in corporate taxation were well below the top personal rates, a degree of income shifting is definitely expected. Nominal rates for corporations stood at 33-35 per cent, with effective rates under 30 per cent (Del Blanco García et al., 2011), whereas top marginal rates for individuals were 48 per cent in 2001-2002 and 45 per cent in 2003-2004. Income shifting practices could have become gradually more widespread in a “learning” process. On shifting to capital gains income, which was favourably treated, see López-Laborda et al. (2018).


In any case, comparison is limited by the fact that several methodologies coexist in the available studies: aggregate discrepancy with national accounting, use of household surveys, tax audit data, etc.


The calculations in Table IV have also been performed with an alternative assumption of 10 per cent of under-reporting in labour incomes. See Appendix 2.


Of course, a full compliance scenario is highly implausible. The exercise serves as an indication of the intensity of the distortion, and not as a credible policy objective. The estimated tax gap in terms of tax liability lies around 1 per cent of GDP, a lower bound, as it includes neither the Basque Country nor Navarre. This is near the difference in tax revenue from the income tax between Spain and other Western European countries.


The calculation for the 1980s is from the Comisión para el Estudio del Fraude en el IRPF (1988), which obtained for labour incomes a compliance rate of between 54 and 71 per cent between 1979 and 1986 (increasing throughout the period, which corresponds to the first eight years of existence of the modern income tax in Spain). The tax gap of 20 per cent was found for 1993-2001 by Esteller (2011, p. 293) – but note that this is the gap in terms of tax due, not of income under-reporting.

Appendix 1. Testing the exclusion restriction on wealth

Table AI shows the results of testing the exclusion restriction on the variable used as instrument, wealth. It can be seen that this variable is:

  • correlated with the probability of selection (i.e., of making a donation);

  • uncorrelated with the level of donations in the second equation; and

  • uncorrelated with the error in the donations equation.

Appendix 2. Inequality effects with under-reporting in labour incomes

The calculations in Table IV have also been performed with an alternative assumption of 10 per cent of under-reporting in labour incomes. This 10 per cent is only a working hypothesis, which does not derive directly from empirical work, so the exercise is only meant to show possible magnitudes and directions of the effect, but is not to be taken as a reliable estimate. It is, however, an informed hypothesis in the sense that it situates compliance in labour above what was estimated for the 1980s, and consistent with a total tax gap of 20 per cent in the 1990s-early 2000s.[23]

The results are shown in Table AII. It should be noted that, as in the corresponding table in the text, all estimated indices correspond to current taxpayers. In a no-fraud scenario, some of the non-filers would make a tax return, and that would especially be the case when considering the existence of evasion in labour incomes (which are more important at the bottom of the distribution). The possible impact of this is not shown in the table. We can only suppose that it would increase inequality indices further, and possibly also show a higher estimated progressivity.

Table AII and Table IV are similar in many respects but differ in some. Pre-tax inequality in the “corrected” and “no fraud” scenarios is the same as in Table IV (up to rounding), as all incomes have been adjusted by the same factor. For all other indices except progressivity, the differences between the “original” and “corrected” scenarios are larger in this version. For example, in 2004 the effective tax rate for the top 1 per cent was 11 per cent below the official figure in our baseline estimation, while here the gap is 20 per cent, a mechanical effect of the bigger tax base. The “no fraud” scenario also shows higher impacts in this alternative estimation for most indicators. Higher reported incomes would of course lead to higher tax payments, and thus, higher redistribution. On the other hand, the differences in progressivity between the “corrected” and “no fraud” scenarios are lower in this case, and even negative for the years 2002-04, meaning that the tax would have been less progressive with no fraud! This is an effect of increased taxation also on those taxpayers with mostly labour income. As mentioned in the previous paragraph, however, the non-consideration of non-filers in this estimate does not allow us to arrive at any strong conclusions in this respect.
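One way to read the Appendix 2 adjustment, consistent with the statement that all incomes are rescaled by the same factor, is that every compliance ratio (which the method estimates relative to labour) is multiplied by the assumed labour compliance of 90 per cent. This is my interpretation rather than a procedure spelled out in the text; the baseline ratios are the 2001 movable capital values from Table III.

```python
LABOUR_COMPLIANCE = 0.90  # Appendix 2 working hypothesis: 10% under-reporting in labour incomes

# Baseline compliance ratios relative to labour (2001, from Table III).
baseline = {"labour": 1.00, "movable_capital_bottom90": 0.61, "movable_capital_top10": 0.47}

# ASSUMPTION: each relative ratio is scaled by labour's own compliance.
adjusted = {source: ratio * LABOUR_COMPLIANCE for source, ratio in baseline.items()}

# Real income per reported euro rises by the same factor for every source,
# so relative incomes, and hence the pre-tax Gini, are unchanged.
scale_up = {source: baseline[source] / adjusted[source] for source in baseline}

for source in baseline:
    print(f"{source}: {baseline[source]:.0%} -> {adjusted[source]:.1%} (incomes x{scale_up[source]:.3f})")
```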

Appendix 3. Comparing top income concentration

Figure A1


Adena, M. (2014), “Tax-price elasticity of charitable donations: evidence from the German taxpayer panel”, WZB Discussion Paper No. SP II 2014-302, Wissenschaftszentrum Berlin für Sozialforschung (WZB), Berlin.

Albarea, A., Bernasconi, M., Novi, C.D., Marenzi, A., Rizzi, D. and Zantomio, F. (2015), “Accounting for tax evasion profiles and tax expenditures in microsimulation modelling. The BETAMOD model for personal income taxes in Italy”, Working Papers 2015:24, Department of Economics, University of Venice “Ca’ Foscari”.

Allingham, M.G. and Sandmo, A. (1972), “Income tax evasion: a theoretical analysis”, Journal of Public Economics, Vol. 1 Nos 3/4, pp. 323-338.

Alm, J., Bahl, R. and Murray, M.N. (1991), “Tax base erosion in developing countries”, Economic Development and Cultural Change, Vol. 39 No. 4, pp. 849-872.

Alstadsæter, A. and Jacob, M. (2016), “Dividend taxes and income shifting”, The Scandinavian Journal of Economics, Vol. 118 No. 4, pp. 693-717.

Alstadsæter, A. and Jacob, M. (2017), “Who participates in tax avoidance? Evidence from Swedish microdata”, Applied Economics, Vol. 49 No. 28, pp. 2779-2796.

Alstadsæter, A., Johannesen, N. and Zucman, G. (2019), “Tax evasion and inequality”, American Economic Review, Vol. 109 No. 6, pp. 2073-2103.

Alstadsæter, A., Jacob, M., Kopczuk, W. and Telle, K. (2016), “Accounting for business income in measuring top income shares: integrated accrual approach using individual and firm data from Norway”, NBER Working Papers 22888, National Bureau of Economic Research.

Alvaredo, F. and Saez, E. (2009), “Income and wealth concentration in Spain from a historical and fiscal perspective”, Journal of the European Economic Association, Vol. 7 No. 5, pp. 1140-1167.

Andreoni, J., Erard, B. and Feinstein, J. (1998), “Tax compliance”, Journal of Economic Literature, Vol. 36 No. 2, pp. 818-860.

Atkinson, A. and Piketty, T. (2007), Top Incomes over the Twentieth Century: A Contrast between Continental European and English-Speaking Countries, Oxford University Press, Oxford.

Ayala, L., Jurado, A. and Pedraja, F. (2006), “Desigualdad y bienestar en la distribución intraterritorial de la renta, 1973-2000”, Investigaciones Regionales: Journal of Regional Research, Vol. 8, pp. 5-30.

Bakija, J. and Heim, B.T. (2011), “How does charitable giving respond to incentives and income? New estimates from panel data”, National Tax Journal, Vol. 64 No. 2, Part 2, pp. 615-650.

Benedek, D. and Lelkes, O. (2011), “The distributional implications of income under-reporting in Hungary”, Fiscal Studies, Vol. 32 No. 4, pp. 539-560.

Bishop, J., Formby, J. and Lambert, P. (2000), “Redistribution through the income tax: the vertical and horizontal effects of non compliance and tax evasion”, Public Finance Review, Vol. 28 No. 4, pp. 335-350.

Black, T., Bloomquist, K., Emblom, E., Johns, A., Plumley, A. and Stuk, E. (2012), “Federal tax compliance research: tax year 2006 tax gap estimation”, Research, Analysis and Statistics Working Paper, Internal Revenue Service.

Borgloh, S. (2008), “What drives giving in extensive welfare states? The case of Germany”, ZEW Discussion Paper No. 08-123, Zentrum für Europäische Wirtschaftsforschung (ZEW), Mannheim.

Clotfelter, C.T. (1983), “Tax evasion and tax rates: an analysis of individual returns”, The Review of Economics and Statistics, Vol. 65 No. 3, pp. 363-373.

Comisión para el Estudio del Fraude en el IRPF (1988), Evaluación Final Del Fraude en el IRPF en Los Ejercicios 1979 a 1986, Instituto de Estudios Fiscales, Madrid.

Del Blanco García, A., Gutiérrez, M., Alonso, D., Fernández de Beaumont, I., Martín, J. and Rodríguez, A. (2011), “Evolución del sistema fiscal español: 1978-2010”, Papeles de Trabajo 13, Instituto de Estudios Fiscales, Madrid.

Domínguez-Barrero, F., López-Laborda, J. and Rodrigo-Sauco, F. (2015), “El hueco que deja el diablo: una estimación del fraude en el IRPF con microdatos tributarios”, Revista de Economía Aplicada, Vol. 23 No. 68, pp. 81-102.

Domínguez-Barrero, F., López-Laborda, J. and Rodrigo-Sauco, F. (2017), “Tax evasion in Spanish personal income tax by income sources, 2005-2008: from the synthetic to the dual tax”, European Journal of Law and Economics, Vol. 44 No. 1, pp. 47-65.

Engström, P. and Hagen, J. (2017), “Income underreporting among the self-employed: a permanent income approach”, European Economic Review, Vol. 92, pp. 92-109.

Esteller, A. (2011), “Is the tax administration just a money machine? Empirical evidence on redistributive politics”, Economics of Governance, Vol. 12 No. 3, pp. 275-299.

Fack, G. and Landais, C. (2016), “The effect of tax enforcement on tax elasticities: evidence from charitable contributions in France”, Journal of Public Economics, Vol. 133, pp. 23-40.

Feinstein, J. (1991), “An econometric analysis of income tax evasion and its detection”, The Rand Journal of Economics, Vol. 22 No. 1, pp. 14-35.

Feldman, N.E. and Slemrod, J. (2007), “Estimating tax noncompliance with evidence from unaudited tax returns”, The Economic Journal, Vol. 117 No. 518, pp. 327-352.

Freire-Serén, M.J. and Panadés, J. (2008), “Does tax evasion modify the redistributive effect of tax progressivity?”, Economic Record, Vol. 84 No. 267, pp. 486-495.

Johns, A. and Slemrod, J. (2010), “The distribution of income tax noncompliance”, National Tax Journal, Vol. 63 No. 3, pp. 397-418.

Kleven, H.J., Knudsen, M.B., Kreiner, C.T., Pedersen, S. and Saez, E. (2011), “Unwilling or unable to cheat? Evidence from a randomized tax audit experiment in Denmark”, Econometrica, Vol. 79 No. 3, pp. 651-692.

Lago-Peñas, I. and Lago-Peñas, S. (2010), “The determinants of tax morale in comparative perspective: evidence from European countries”, European Journal of Political Economy, Vol. 26 No. 4, pp. 441-453.

López-Laborda, J., Vallés-Giménez, J. and Zárate-Marco, A. (2018), “Income shifting in the Spanish dual income tax”, Fiscal Studies, Vol. 39 No. 1, pp. 95-120.

Luttmer, E.F. and Singhal, M. (2014), “Tax morale”, Journal of Economic Perspectives, Vol. 28 No. 4, pp. 149-168.

Matsaganis, M. and Flevotomou, M. (2010), “Distributional implications of income tax evasion in Greece”, Hellenic Observatory Papers on Greece and Southeast Europe.

Onrubia Fernández, J. and Picos Sánchez, F. (2013), “Desigualdad de la renta y redistribución a través del IRPF, 1999-2007”, Revista de Economía Aplicada, Vol. 21 No. 63, pp. 75-115.

Pissarides, C.A. and Weber, G. (1989), “An expenditure-based estimate of Britain’s black economy”, Journal of Public Economics, Vol. 39 No. 1, pp. 17-32.

Raymond-Barà, J. (1987), “Tipos impositivos y evasión fiscal en España: un análisis empírico”, Papeles de Economía Española, Vol. 30-31, pp. 154-169.

Rubolino, E. and Waldenström, D. (2017), “Tax progressivity and top incomes: evidence from tax reforms”, IZA Discussion Paper No. 10666, IZA Institute of Labor Economics.

Slemrod, J. (1989), “Are estimated tax elasticities really just tax evasion elasticities? The case of charitable contributions”, The Review of Economics and Statistics, Vol. 71 No. 3, pp. 19-36.

Swedish National Tax Agency (2008), “Tax gap map for Sweden: how was it created and how can it be used?”, Skatteverket Report 2008:1B.

Torregrosa-Hetland, S. (2015), “Bypassing progressive taxation: fraud and base erosion in the Spanish income tax, 1970-2001”, IEB Working Paper 2015/31, Institut d’Economia de Barcelona.

Valdés, T. (1982), Los Métodos de Análisis Discriminante Como Herramienta al Servicio de la Inspección Fiscal, Instituto de Estudios Fiscales, Madrid.

World Wealth and Income Database (2018), “World wealth and income database”, available at:

Zucman, G. (2014), “Taxing across borders: tracking personal wealth and corporate profits”, Journal of Economic Perspectives, Vol. 28 No. 4, pp. 121-148.


The author acknowledges financial support from the Spanish Ministry of Education’s scholarship program Formación del Profesorado Universitario and the Research Project ECO2012-39169-C03-03. I am most grateful for the discussions and constant encouragement of Alfonso Herranz Loncán and Alejandro Esteller Moré. The paper has also benefited from help and comments by Fernando Rodrigo and Emmanuel Saez, who hosted me at UC Berkeley while initiating this research, as well as from contributions of many seminar participants.

Corresponding author

Sara Torregrosa-Hetland can be contacted at: