Regression Discontinuity Designs: Volume 38

Theory and Applications

Table of contents

(14 chapters)

Prelims

Pages i-xxv
Abstract

We discuss the two most popular frameworks for identification, estimation and inference in regression discontinuity (RD) designs: the continuity-based framework, where the conditional expectations of the potential outcomes are assumed to be continuous functions of the score at the cutoff, and the local randomization framework, where the treatment assignment is assumed to be as good as randomized in a neighborhood around the cutoff. Using various examples, we show that (i) assuming random assignment of the RD running variable in a neighborhood of the cutoff implies neither that the potential outcomes and the treatment are statistically independent, nor that the potential outcomes are unrelated to the running variable in this neighborhood; and (ii) assuming local independence between the potential outcomes and the treatment does not imply the exclusion restriction that the score affects the outcomes only through the treatment indicator. Our discussion highlights key distinctions between “locally randomized” RD designs and real experiments, including that statistical independence and random assignment are conceptually different in RD contexts, and that the RD treatment assignment rule places no restrictions on how the score and potential outcomes are related. Our findings imply that the methods for RD estimation, inference, and falsification used in practice will necessarily be different (both in formal properties and in interpretation) according to which of the two frameworks is invoked.
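
As a rough illustration of how the two frameworks can lead to different estimators, the sketch below computes a continuity-based estimate (the jump between two local linear fits at the cutoff) and a local randomization estimate (a difference in means inside a small window) on simulated data. The function names, bandwidth, window width, and data-generating process are all assumptions chosen for illustration, not taken from the chapter.

```python
import numpy as np

def local_linear_rd(x, y, cutoff=0.0, h=0.5):
    """Continuity-based sketch: jump between two triangular-kernel
    local linear fits, each evaluated at the cutoff."""
    def intercept_at_cutoff(mask):
        xc = x[mask] - cutoff
        w = np.clip(1.0 - np.abs(xc) / h, 0.0, None)  # triangular weights
        keep = w > 0
        X = np.column_stack([np.ones(keep.sum()), xc[keep]])
        sw = np.sqrt(w[keep])
        # Weighted least squares via rescaled ordinary least squares.
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y[mask][keep] * sw, rcond=None)
        return beta[0]
    return intercept_at_cutoff(x >= cutoff) - intercept_at_cutoff(x < cutoff)

def local_randomization_rd(x, y, cutoff=0.0, window=0.1):
    """Local randomization sketch: difference in means inside a window."""
    inside = np.abs(x - cutoff) <= window
    treated = inside & (x >= cutoff)
    control = inside & (x < cutoff)
    return y[treated].mean() - y[control].mean()

# Simulated data (assumed): the outcome depends on the score with slope 2,
# and the true jump at the cutoff is 1.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 4000)
tau = 1.0
y = 0.5 + 2.0 * x + tau * (x >= 0) + rng.normal(0, 0.1, x.size)

est_continuity = local_linear_rd(x, y)
est_local_rand = local_randomization_rd(x, y)
print(est_continuity, est_local_rand)
```

In this simulated example the outcome still varies with the score inside the window, so the plain difference in means absorbs that slope and overshoots the jump, while the local linear fit adjusts for it. This is only meant to show that the two frameworks need not produce the same number on the same data.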

Abstract

This chapter reviews recent developments in the density discontinuity approach. It is well known that agents having perfect control of the forcing variable will invalidate the popular regression discontinuity designs (RDDs). To detect the manipulation of the forcing variable, McCrary (2008) developed a test based on the discontinuity in the density around the threshold. Recent papers have noted that the sorting patterns around the threshold are often either the researcher’s object of interest or may relate to structural parameters such as tax elasticities through known functions. This, in turn, implies that the behavior of the distribution around the threshold is not only informative of the validity of a standard RDD; it can also be used to recover policy-relevant parameters and perform counterfactual exercises.
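
For orientation, the quantity targeted by a density discontinuity test of this kind can be written as the log difference of the one-sided density limits at the threshold. The notation below ($f$ for the density of the forcing variable $R$, $c$ for the threshold) is assumed here for illustration and may differ from the chapter's.

```latex
\theta \;=\; \ln \lim_{r \downarrow c} f(r) \;-\; \ln \lim_{r \uparrow c} f(r),
\qquad H_0 : \theta = 0 .
```

Under this reading, a value of $\theta$ far from zero indicates bunching on one side of the threshold, which is the sorting pattern the abstract describes.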

Abstract

Using administrative, longitudinal data on felony arrests in Florida, we exploit the discontinuous increase in the punitiveness of criminal sanctions at 18 to estimate the deterrence effect of incarceration. Our analysis suggests a 2% decline in the log-odds of offending at 18, with standard errors ruling out declines of 11% or more. We interpret these magnitudes using a stochastic dynamic extension of Becker’s (1968) model of criminal behavior. Calibrating the model to match key empirical moments, we conclude that deterrence elasticities with respect to sentence lengths are no more negative than −0.13 for young offenders.

Abstract

We study research designs where a binary treatment changes discontinuously at the border between administrative units such as states, counties, or municipalities, creating a treated and a control area. This type of geographically discontinuous treatment assignment can be analyzed in a standard regression discontinuity (RD) framework if the exact geographic location of each unit in the dataset is known. Such data, however, is often unavailable due to privacy considerations or measurement limitations. In the absence of geo-referenced individual-level data, two scenarios can arise depending on what kind of geographic information is available. If researchers have information about each observation’s location within aggregate but small geographic units, a modified RD framework can be applied, where the running variable is treated as discrete instead of continuous. If researchers lack this type of information and instead only have access to the location of units within coarse aggregate geographic units that are too large to be considered in an RD framework, the available coarse geographic information can be used to create a band or buffer around the border, only including in the analysis observations that fall within this band. We characterize each scenario, and also discuss several methodological challenges that are common to all research designs based on geographically discontinuous treatment assignments. We illustrate these issues with an original geographic application that studies the effect of introducing copayments for the use of the Children’s Health Insurance Program in the United States, focusing on the border between Illinois and Wisconsin.

Abstract

This chapter analyzes a geographic quasi-experiment embedded in a cluster-randomized experiment in Honduras. In the experiment, average treatment effects of conditional cash transfers on school enrollment and child labor were large – especially in the poorest experimental blocks – and could be generalized to a policy-relevant population given the original sample selection criteria. In contrast, the geographic quasi-experiment yielded point estimates that, for two of three dependent variables, were attenuated. A judicious policy analyst without access to the experimental results might have provided misleading advice based on the magnitude of point estimates. We assessed two main explanations for the difference in point estimates, related to external and internal validity.

Abstract

Relative to the randomized controlled trial (RCT), the basic regression discontinuity (RD) design suffers from lower statistical power and lesser ability to generalize causal estimates away from the treatment eligibility cutoff. This chapter seeks to mitigate these limitations by adding an untreated outcome comparison function that is measured along all or most of the assignment variable. When added to the usual treated and untreated outcomes observed in the basic RD, a comparative RD (CRD) design results. One version of CRD adds a pretest measure of the study outcome (CRD-Pre); another adds posttest outcomes from a nonequivalent comparison group (CRD-CG). We describe how these designs can be used to identify unbiased causal effects away from the cutoff under the assumption that a common, stable functional form describes how untreated outcomes vary with the assignment variable, both in the basic RD and in the added outcomes data (pretests or a comparison group’s posttest). We then create the two CRD designs using data from the National Head Start Impact Study, a large-scale RCT. For both designs, we find that all untreated outcome functions are parallel, which lends support to CRD’s identifying assumptions. Our results also indicate that CRD-Pre and CRD-CG both yield impact estimates at the cutoff that have a similarly small bias as, but are more precise than, the basic RD’s impact estimates. In addition, both CRD designs produce estimates of impacts away from the cutoff that have relatively little bias compared to estimates of the same parameter from the RCT design. This common finding appears to be driven by two different mechanisms. In this instance of CRD-CG, potential untreated outcomes were likely independent of the assignment variable from the start. This was not the case with CRD-Pre. However, fitting a model using the observed pretests and untreated posttests to account for the initial dependence generated an accurate prediction of the missing counterfactual. The result was an unbiased causal estimate away from the cutoff, conditional on this successful prediction of the untreated outcomes of the treated.

Abstract

Conventional tests of the regression discontinuity design’s identifying restrictions can perform poorly when the running variable is discrete. This paper proposes a test for manipulation of the running variable that is consistent when the running variable is discrete. The test exploits the fact that if the discrete running variable’s probability mass function satisfies a certain smoothness condition, then the observed frequency at the threshold has a known conditional distribution. The proposed test is applied to vote tally distributions in union representation elections and reveals evidence of manipulation in close elections that is in favor of employers when Republicans control the NLRB and in favor of unions otherwise.
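
The idea of a known conditional distribution at the threshold can be sketched in a simplified special case. If the probability mass function is exactly linear across the threshold and its two neighbouring support points, the threshold point carries one third of their combined mass, so the count at the threshold is binomial conditional on the three-point total. The abstract does not spell out the chapter's smoothness condition, so this linearity assumption, the function names, and the example counts are all illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def threshold_mass_test(n_below, n_at, n_above, p0=1/3):
    """Exact two-sided binomial p-value for the count at the threshold,
    conditional on the total count at the three adjacent support points.
    p0 = 1/3 corresponds to the assumed locally linear mass function."""
    n = n_below + n_at + n_above
    observed = binom_pmf(n_at, n, p0)
    # Sum the probabilities of all outcomes no more likely than the observed one.
    p_value = sum(
        prob for prob in (binom_pmf(k, n, p0) for k in range(n + 1))
        if prob <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

# Hypothetical counts just below, at, and just above the threshold.
p_smooth = threshold_mass_test(100, 100, 100)  # no excess mass at the threshold
p_heaped = threshold_mass_test(100, 160, 100)  # excess mass at the threshold
print(p_smooth, p_heaped)
```

A small p-value in the second case flags more mass at the threshold than local linearity would allow. A practical test would presumably relax exact linearity to a bound on curvature, which this sketch does not attempt.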

Abstract

Regression discontinuity (RD) models are commonly used to nonparametrically identify and estimate a local average treatment effect. Dong and Lewbel (2015) show how a derivative of this effect, called the treatment effect derivative (TED), can be estimated. We argue here that TED should be employed in most RD applications, as a way to assess the stability and hence external validity of RD estimates. Closely related to TED, we define the complier probability derivative (CPD). Just as TED measures stability of the treatment effect, the CPD measures stability of the complier population in fuzzy designs. TED and CPD are numerically trivial to estimate. We provide relevant Stata code, and apply it to some real datasets.
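
To illustrate why TED is cheap to compute, the sketch below handles the sharp-design case, where the treatment effect is the difference in intercepts of two local linear fits at the cutoff and TED is the difference in their slopes. This is a Python illustration rather than the chapter's Stata code; the function name, the uniform kernel, the bandwidth, and the simulated data are assumptions.

```python
import numpy as np

def rd_effect_and_ted(x, y, cutoff=0.0, h=0.5):
    """Sharp RD sketch: separate linear fits within bandwidth h on each side.
    Returns (jump in intercepts, jump in slopes) at the cutoff."""
    xc = x - cutoff
    right = (xc >= 0) & (xc <= h)
    left = (xc < 0) & (xc >= -h)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    slope_r, intercept_r = np.polyfit(xc[right], y[right], 1)
    slope_l, intercept_l = np.polyfit(xc[left], y[left], 1)
    return intercept_r - intercept_l, slope_r - slope_l

# Simulated data (assumed): the effect at the cutoff is 2 and grows with the
# score at rate 1.5, so the true TED is 1.5.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 5000)
y = 1.0 + 0.5 * x + (2.0 + 1.5 * x) * (x >= 0) + rng.normal(0, 0.1, x.size)

effect, ted = rd_effect_and_ted(x, y)
print(effect, ted)
```

A TED estimate far from zero, as in this simulated case, suggests the effect would change noticeably if the cutoff moved slightly, which is the stability concern the abstract raises.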

Abstract

A regression kink design (RKD or RK design) can be used to identify causal effects in settings where the regressor of interest is a kinked function of an assignment variable. In this chapter, we apply an RKD approach to study the effect of unemployment benefits on the duration of joblessness in Austria, and discuss implementation issues that may arise in similar settings, including the use of bandwidth selection algorithms and bias-correction procedures. Although recent developments in nonparametric estimation (Calonico, Cattaneo, & Farrell, 2014; Imbens & Kalyanaraman, 2012) are sometimes interpreted by practitioners as pointing to a default estimation procedure, we show that in any given application different procedures may perform better or worse. In particular, Monte Carlo simulations based on data-generating processes that closely resemble the data from our application show that some asymptotically dominant procedures may actually perform worse than “sub-optimal” alternatives in a given empirical application.
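
A minimal sketch of a sharp regression kink estimate, assuming the policy rule is known exactly: the change in the slope of the outcome at the kink point is divided by the known change in the slope of the policy variable. The function name, the uniform kernel, the fixed bandwidth, and the simulated benefit schedule are illustrative assumptions and are unrelated to the Austrian data or to the bandwidth selectors discussed in the chapter.

```python
import numpy as np

def sharp_rkd(v, y, kink_in_policy, kink_point=0.0, h=0.5):
    """Sharp RKD sketch: slope change in E[y | v] at the kink point,
    scaled by the known slope change of the policy rule."""
    vc = v - kink_point
    right = (vc >= 0) & (vc <= h)
    left = (vc < 0) & (vc >= -h)
    slope_r = np.polyfit(vc[right], y[right], 1)[0]
    slope_l = np.polyfit(vc[left], y[left], 1)[0]
    return (slope_r - slope_l) / kink_in_policy

# Simulated data (assumed): benefits rise with slope 0.6 below the kink and
# are capped above it, so the policy slope changes by 0.0 - 0.6 = -0.6.
rng = np.random.default_rng(2)
v = rng.uniform(-1, 1, 5000)
benefit = 0.6 * np.minimum(v, 0.0)
true_effect = 2.0
y = 3.0 + 0.4 * v + true_effect * benefit + rng.normal(0, 0.1, v.size)

rkd_estimate = sharp_rkd(v, y, kink_in_policy=-0.6)
print(rkd_estimate)
```

Because the estimate is a ratio of slope changes, it tends to be noisier than an RD jump estimate at the same bandwidth, which is one reason the choice of bandwidth and bias correction can matter so much in practice.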

Abstract

Regression discontinuity designs have become popular in empirical studies due to their attractive properties for estimating causal effects under transparent assumptions. Nonetheless, most popular procedures assume i.i.d. data, which is unreasonable in many common applications. To fill this gap, we derive the properties of traditional local polynomial estimators in a fixed-G setting that allows for cluster dependence in the error term. Simulation results demonstrate that accounting for clustering in the data while selecting bandwidths may lead to lower MSE while maintaining proper coverage. We then apply our cluster-robust procedure to an application examining the impact of Low-Income Housing Tax Credits on neighborhood characteristics and low-income housing supply.

Abstract

This chapter develops a novel bootstrap procedure to obtain robust bias-corrected confidence intervals in regression discontinuity (RD) designs. The procedure uses a wild bootstrap from a second-order local polynomial to estimate the bias of the local linear RD estimator; the bias is then subtracted from the original estimator. The bias-corrected estimator is then bootstrapped itself to generate valid confidence intervals (CIs). The CIs generated by this procedure are valid under conditions similar to Calonico, Cattaneo, and Titiunik’s (2014) analytical correction – that is, when the bias of the naive RD estimator would otherwise prevent valid inference. This chapter also provides simulation evidence that our method is as accurate as the analytical corrections and we demonstrate its use through a reanalysis of Ludwig and Miller’s (2007) Head Start dataset.
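
The bias-estimation step described above can be sketched as follows. This is a simplified illustration, not the chapter's procedure: it assumes uniform kernels, Rademacher wild-bootstrap weights, a pilot bandwidth `b` at least as large as the main bandwidth `h`, and it stops at the bias-corrected point estimate without the second bootstrap layer needed for confidence intervals. All function and parameter names are hypothetical.

```python
import numpy as np

def jump(xc, y, degree):
    """Difference at the cutoff between separate polynomial fits on each side."""
    right = xc >= 0
    coef_r = np.polyfit(xc[right], y[right], degree)
    coef_l = np.polyfit(xc[~right], y[~right], degree)
    return coef_r[-1] - coef_l[-1]  # last coefficient is the value at xc = 0

def bootstrap_bias_corrected_rd(x, y, cutoff=0.0, h=0.5, b=1.0, n_boot=200, seed=0):
    """Sketch: estimate the bias of the local linear RD estimator by wild
    bootstrap from a second-order fit, then subtract it. Assumes h <= b."""
    rng = np.random.default_rng(seed)
    xc = x - cutoff
    pilot = np.abs(xc) <= b
    xp, yp = xc[pilot], y[pilot].astype(float)

    # Second-order fit on each side: it plays the role of the "truth"
    # in the bootstrap world.
    fitted = np.empty_like(yp)
    for side in (xp >= 0, xp < 0):
        fitted[side] = np.polyval(np.polyfit(xp[side], yp[side], 2), xp[side])
    resid = yp - fitted
    tau_quad = jump(xp, yp, 2)

    main = np.abs(xp) <= h
    tau_lin = jump(xp[main], yp[main], 1)

    boot = np.empty(n_boot)
    for i in range(n_boot):
        signs = rng.choice([-1.0, 1.0], size=yp.size)  # Rademacher weights (assumed)
        y_star = fitted + resid * signs
        boot[i] = jump(xp[main], y_star[main], 1)

    bias_hat = boot.mean() - tau_quad
    return tau_lin - bias_hat, tau_lin, bias_hat

# Simulated data (assumed): curvature differs across the cutoff, so the local
# linear estimator is biased for the true jump of 1.5.
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, 4000)
y = 1.0 + 2.0 * x + 3.0 * x**2 + (1.5 + 4.0 * x**2) * (x >= 0) + rng.normal(0, 0.1, x.size)

corrected, naive, bias_hat = bootstrap_bias_corrected_rd(x, y)
print(corrected, naive, bias_hat)
```

In this simulated example the naive local linear estimate undershoots the jump because curvature is stronger to the right of the cutoff, and the bootstrap bias estimate recovers most of that gap. Valid confidence intervals would require bootstrapping the bias-corrected estimator itself, as the abstract describes.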

Abstract

Identification in a regression discontinuity (RD) design hinges on the discontinuity in the probability of treatment when a covariate (assignment variable) exceeds a known threshold. If the assignment variable is measured with error, however, the discontinuity in the relationship between the probability of treatment and the observed mismeasured assignment variable may disappear. Therefore, the presence of measurement error in the assignment variable poses a challenge to treatment effect identification. This chapter provides sufficient conditions to identify the RD treatment effect using the mismeasured assignment variable, the treatment status and the outcome variable. We prove identification separately for discrete and continuous assignment variables and study the properties of various estimation procedures. We illustrate the proposed methods in an empirical application, where we estimate Medicaid takeup and its crowdout effect on private health insurance coverage.

Index

Pages 503-512
DOI
10.1108/S0731-9053201738
Publication date
2017-05-13
Book series
Advances in Econometrics
Series copyright holder
Emerald Publishing Limited
ISBN
978-1-78714-390-6
eISBN
978-1-78714-389-0
Book series ISSN
0731-9053