Modelling and Evaluating Treatment Effects in Econometrics: Volume 21


Table of contents

(18 chapters)

The estimation of the effects of treatments – endogenous variables representing everything from child participation in a pre-kindergarten program to adult participation in a job-training program to national participation in a free trade agreement – has occupied much of the theoretical and applied econometric research literatures in recent years. This volume brings together a diverse collection of papers on this important topic by leaders in the field from around the world. This collection draws attention to several key facets of the recent evolution in this literature.

Regression analyses of compensatory educational programs have been criticized on the grounds that the pupils were not randomly selected. Specifically, it has been argued that a spurious deleterious effect of the treatment will be observed when the selection procedure systematically puts lower-ability students into the treatment group and higher-ability students into the control group.

We evaluate this argument via a simple test score model: pretest score and posttest score are fallible measures of underlying true ability and the true treatment effect is zero. Posttest is regressed on pretest and a treatment dummy. The spurious effect arises when selection of subjects for treatment is explicit on the basis of true ability, but not when it is explicit on the basis of pretest score.
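The argument can be illustrated with a small simulation. What follows is a minimal sketch, not the chapter's own code: the unit-variance normal distributions, the cutoff at zero, and the function name `simulate_treatment_coef` are all assumptions made for illustration. Pretest and posttest are each true ability plus independent noise, the true treatment effect is zero, and posttest is regressed on pretest and a treatment dummy.

```python
import numpy as np

def simulate_treatment_coef(select_on, n=200_000, seed=0):
    """Return the OLS coefficient on the treatment dummy.

    select_on: "ability" assigns treatment on true ability,
               "pretest" assigns it on the observed pretest score.
    All distributions and the cutoff are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)               # latent true ability
    pretest = ability + rng.normal(size=n)     # fallible measure 1
    posttest = ability + rng.normal(size=n)    # fallible measure 2, zero true effect
    basis = ability if select_on == "ability" else pretest
    treated = (basis < 0).astype(float)        # lower scorers enter the program
    X = np.column_stack([np.ones(n), pretest, treated])
    coef, *_ = np.linalg.lstsq(X, posttest, rcond=None)
    return coef[2]

if __name__ == "__main__":
    print("selection on pretest:", simulate_treatment_coef("pretest"))
    print("selection on ability:", simulate_treatment_coef("ability"))
```

Under these assumptions the treatment coefficient is close to zero when selection is on the pretest score and clearly negative when selection is on true ability, which matches the pattern described above.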

This paper studies the event-history approach to microeconometric program evaluation. We present a mixed semi-Markov event-history model, discuss its application to program evaluation, and analyze its empirical content. The results of this paper provide fundamental insights into what can be learned from longitudinal microdata about, for example, the effects of training programs for the unemployed on their unemployment durations and subsequent job stability. They can guide the choice of particular models and methods for the empirical analysis of such effects.

We describe a new Bayesian estimation algorithm for fitting a binary treatment, ordered outcome selection model in a potential outcomes framework. We show how recent advances in simulation methods, namely data augmentation, the Gibbs sampler, and the Metropolis-Hastings algorithm, can be used to fit this model efficiently, and we introduce a reparameterization that helps accelerate the convergence of our posterior simulator. Conventional “treatment effects” such as the Average Treatment Effect (ATE), the effect of treatment on the treated (TT), and the Local Average Treatment Effect (LATE) are adapted for this specific model, and Bayesian strategies for calculating these treatment effects are introduced. Finally, we review how one can potentially learn (or at least bound) the non-identified cross-regime correlation parameter and use this learning to calculate (or bound) parameters of interest beyond mean treatment effects.
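For reference, the conventional mean treatment effects named above are usually written as follows. The notation is an assumption, not taken from the chapter: Y_1 and Y_0 are the potential outcomes with and without treatment, D is the treatment indicator, and D(z) is the treatment that would be received when the instrument takes the value z. The chapter adapts these quantities to the ordered-outcome setting, so its exact expressions may differ.

```latex
\begin{align*}
\mathrm{ATE}  &= E\left[ Y_1 - Y_0 \right], \\
\mathrm{TT}   &= E\left[ Y_1 - Y_0 \mid D = 1 \right], \\
\mathrm{LATE} &= E\left[ Y_1 - Y_0 \mid D(z) = 0,\ D(z') = 1 \right].
\end{align*}
```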

I propose a general framework for instrumental variables estimation of the average treatment effect in the correlated random coefficient model, focusing on the case where the treatment variable has some discreteness. The approach involves adding a particular function of the exogenous variables to a linear model containing interactions in observables, and then using instrumental variables for the endogenous explanatory variable. I show how the general approach applies to binary and Tobit treatment variables, including the case of multiple treatments.

In an evaluation of a job training program, the causal effects of the program on wages are often of more interest to economists than the program's effects on employment or on income. The reason is that the effects on wages reflect the increase in human capital due to the training program, whereas the effects on total earnings or income may be simply reflecting the increased likelihood of employment without any effect on wage rates. Estimating the effects of training programs on wages is complicated by the fact that, even in a randomized experiment, wages are truncated by nonemployment, i.e., are only observed and well-defined for individuals who are employed. We present a principal stratification approach applied to a randomized social experiment that classifies participants into four latent groups according to whether they would be employed or not under treatment and control, and argue that the average treatment effect on wages is only clearly defined for those who would be employed whether they were trained or not. We summarize large sample bounds for this average treatment effect, and propose and derive a Bayesian analysis and the associated Bayesian Markov Chain Monte Carlo computational algorithm. Moreover, we illustrate the application of new code checking tools to our Bayesian analysis to detect possible coding errors. Finally, we demonstrate our Bayesian analysis using simulated data.
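The four latent groups can be written compactly. The notation here is an assumption for illustration: S(1) and S(0) indicate whether an individual would be employed under treatment and under control, and Y(1) and Y(0) are the corresponding potential wages.

```latex
\begin{align*}
&\text{Principal strata: } (S(1), S(0)) \in \{(1,1),\ (1,0),\ (0,1),\ (0,0)\}, \\
&\text{Estimand: } E\left[ Y(1) - Y(0) \mid S(1) = 1,\ S(0) = 1 \right].
\end{align*}
```

Only in the (1,1) stratum are both potential wages observed in principle, which is why the wage effect is clearly defined for that group alone.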

We show that, when cross-sectional data are sorted, the endogeneity of a variable may be successfully detected by graphically examining the cumulative sum of the recursive residuals. Moreover, the sign of the bias implied by the endogeneity may be deducible from such graphs. In general, instrumental variables are needed to implement the graphical test. However, when a continuous or ordered variable (e.g., years of schooling) is suspected to be endogenous, a graphical test for misspecification due to endogeneity (e.g., self-selection) can be obtained without instrumental variables.
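The cumulative sum of recursive residuals can be computed along the following lines. This is a generic sketch of the standard recursive-residual calculation, not the chapter's procedure: the function names are hypothetical, and the rows of `X` and `y` are assumed to be already sorted in whatever order the analyst has chosen.

```python
import numpy as np

def recursive_residuals(X, y):
    """Standardised one-step-ahead prediction errors from expanding-window OLS.

    X is (n, k) with rows already in the chosen order; the first k rows
    must have full column rank. Returns an array of length n - k.
    """
    n, k = X.shape
    w = np.empty(n - k)
    for t in range(k, n):
        Xp, yp = X[:t], y[:t]                        # observations before row t
        beta, *_ = np.linalg.lstsq(Xp, yp, rcond=None)
        xt = X[t]
        leverage = xt @ np.linalg.solve(Xp.T @ Xp, xt)
        w[t - k] = (y[t] - xt @ beta) / np.sqrt(1.0 + leverage)
    return w

def cusum(w):
    """Cumulative sum of recursive residuals, scaled by their root mean square."""
    sigma = np.sqrt(np.mean(w ** 2))                 # equals sqrt(RSS / (n - k))
    return np.cumsum(w) / sigma
```

Plotting `cusum(recursive_residuals(X, y))` against the observation index gives the kind of graph referred to above; a sustained drift away from zero is the usual visual sign of misspecification.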

Although the theoretical trade-off between the quantity and quality of children is well established, empirical evidence supporting such a causal relationship is limited. This chapter applies a recently developed nonparametric estimator of the conditional local average treatment effect to assess the sensitivity of the quantity–quality trade-off to functional form and parametric assumptions. Using data from the Indonesia Family Life Survey and controlling for the potential endogeneity of fertility, we find mixed evidence supporting the trade-off.

We apply a recently suggested econometric approach to measure the effects of active labor market programs on employment, unemployment, and wage histories among participants. We find that participation in most of these training programs produces an initial locking-in effect and for some even a lower transition rate from unemployment to employment upon completion. Most programs, therefore, increase the expected duration of unemployment spells. However, we find that the training undertaken while unemployed successfully increases the expected duration of subsequent spells of employment for many subpopulations. These longer spells of employment come at a cost of lower accepted hourly wage rates.

Programs are typically evaluated through the average treatment effect and its standard error: is the treatment effect positive, and is it statistically significant? In theory, programs should instead be evaluated in a decision framework, using social welfare functions and posterior predictive distributions for outcomes of interest. This chapter discusses the use of stochastic dominance of predictive distributions of outcomes to rank programs, and, under more restrictive parametric and functional form assumptions, develops intuitive mean-variance tests for program evaluation that are consistent with the underlying decision problem. These concepts are applied to the GAIN and JTPA datasets.
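As a rough illustration of ranking by stochastic dominance, the sketch below checks first-order dominance between two sets of draws, for example simulated draws from two programs' predictive distributions. The function name and the convention that larger outcomes are better are assumptions; the chapter's own tests may be constructed differently.

```python
import numpy as np

def first_order_dominates(a, b):
    """True if the empirical CDF of `a` lies at or below that of `b` everywhere,
    and strictly below somewhere (larger outcomes assumed to be better)."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.union1d(a, b)                                  # every jump point of either CDF
    cdf_a = np.searchsorted(a, grid, side="right") / a.size  # share of a <= grid value
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return bool(np.all(cdf_a <= cdf_b) and np.any(cdf_a < cdf_b))
```

When the two empirical distribution functions cross, neither program dominates the other in this sense, and a ranking requires further assumptions of the kind the mean-variance tests impose.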

Lechner and Miquel (2001) approached the causal analysis of sequences of interventions from a potential outcome perspective based on selection-on-observables-type assumptions (sequential conditional independence assumptions). Lechner (2004) proposed matching estimators for this framework. However, many practical issues that might have substantial consequences for the interpretation of the results have not been thoroughly investigated so far. This chapter discusses some of these practical issues. The discussion is related to estimates based on an artificial data set for which the true values of the parameters are known and that shares many features of data that could be used for an empirical dynamic matching analysis.

This chapter demonstrates that fixed-effects and first-differences models often understate the effect of interest because of the variation used to identify the model. In particular, the within-unit time-series variation often reflects transitory fluctuations that have little effect on behavioral outcomes. The data in effect suffer from measurement error, as a portion of the variation in the independent variable has no effect on the dependent variable. Two empirical examples are presented: one on the relationship between AFDC and fertility and the other on the relationship between local economic conditions and AFDC expenditures.
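The attenuation mechanism can be sketched in a simulation. Everything below is an illustrative assumption rather than the chapter's design: the regressor is the sum of a persistent component (a random walk) and a transitory component, and the outcome responds only to the persistent part.

```python
import numpy as np

def within_slope(x, y):
    """Fixed-effects (within) slope: regress unit-demeaned y on unit-demeaned x."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()

def simulate_fe_slope(transitory_sd, n_units=2000, n_periods=6, beta=1.0, seed=0):
    """Within estimate when only the persistent part of x affects y."""
    rng = np.random.default_rng(seed)
    unit_effect = rng.normal(size=(n_units, 1))
    # Persistent component: a random walk within each unit.
    persistent = np.cumsum(rng.normal(scale=0.5, size=(n_units, n_periods)), axis=1)
    # Transitory component: present in x but with no effect on y.
    transitory = rng.normal(scale=transitory_sd, size=(n_units, n_periods))
    x = unit_effect + persistent + transitory
    y = unit_effect + beta * persistent + rng.normal(size=(n_units, n_periods))
    return within_slope(x, y)

if __name__ == "__main__":
    print("no transitory variation:  ", simulate_fe_slope(0.0))
    print("with transitory variation:", simulate_fe_slope(1.0))
```

With no transitory variation the within estimate is close to the assumed true slope of 1; once transitory fluctuations make up most of the within-unit variation, the estimate shrinks toward zero, just as classical measurement error would shrink it.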

In this chapter, we characterise the selection into parenthood for men and women separately and estimate the effects of motherhood and fatherhood on wages. We apply propensity score matching exploiting an extensive high-quality register-based data set augmented with family background information. We estimate the net effects of parenthood and find that mothers receive 7.4% lower average wages compared to non-mothers, whereas fathers gain 6.0% in terms of average wages from fatherhood.
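The matching step can be sketched as follows. This is a minimal illustration that assumes the propensity scores have already been estimated and uses single nearest-neighbour matching with replacement; the function name is hypothetical and the chapter's actual matching algorithm may differ.

```python
import numpy as np

def att_nearest_neighbour(y, d, pscore):
    """Average effect on the treated from one-to-one nearest-neighbour
    matching (with replacement) on a given propensity score."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d)
    pscore = np.asarray(pscore, dtype=float)
    treated = d == 1
    controls = ~treated
    # Absolute score gap between every treated unit and every control unit.
    gaps = np.abs(pscore[treated][:, None] - pscore[controls][None, :])
    nearest = gaps.argmin(axis=1)            # index of the closest control
    matched_outcomes = y[controls][nearest]
    return float(np.mean(y[treated] - matched_outcomes))
```

For example, with outcomes `[5, 6, 1, 2, 3]`, treatment indicators `[1, 1, 0, 0, 0]`, and scores `[0.8, 0.4, 0.75, 0.45, 0.1]`, each treated unit is paired with the control whose score is closest, and the function returns the mean outcome gap across the two pairs.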

In this chapter, we evaluate the employment effects of job-creation schemes (JCS) on the participating individuals in Germany. JCS are a major element of active labour market policy in Germany and are targeted at long-term unemployed and other hard-to-place individuals. Access to very informative administrative data of the Federal Employment Agency justifies the application of a matching estimator and allows us to account for individual (group-specific) and regional effect heterogeneity. We extend previous studies for Germany in four directions. First, we are able to evaluate the effects on regular (unsubsidised) employment. Second, we observe the outcomes of participants and non-participants for nearly three years after the programme starts and can therefore analyse medium-term effects. Third, we test the sensitivity of the results with respect to various decisions that have to be made during implementation of the matching estimator. Finally, we check whether a specific form of ‘unobserved heterogeneity’ might distort our interpretation. The overall results are rather discouraging, since the employment effects are negative or insignificant for most of the analysed groups. One exception is the long-term unemployed, who benefit from participation at the end of our observation period. Hence, one policy implication is to target the programmes more closely at this problem group.

Publication date
Book series: Advances in Econometrics
Series copyright holder: Emerald Publishing Limited
Book series ISSN