Regression Discontinuity Designs
ISBN: 978-1-78714-390-6, eISBN: 978-1-78714-389-0
Publication date: 13 May 2017
(2017), "Prelims", Regression Discontinuity Designs (Advances in Econometrics, Vol. 38), Emerald Publishing Limited, Bingley, pp. i-xxv. https://doi.org/10.1108/S0731-905320170000038020
Emerald Publishing Limited
Copyright © 2017 Emerald Publishing Limited
REGRESSION DISCONTINUITY DESIGNS: THEORY AND APPLICATIONS
ADVANCES IN ECONOMETRICS
Series Editors: Thomas B. Fomby, R. Carter Hill, Ivan Jeliazkov, Juan Carlos Escanciano, Eric Hillebrand, Daniel L. Millimet and Rodney Strachan
|Volume 29:||Essays in Honor of Jerry Hausman – Edited by Badi H. Baltagi, Whitney Newey, Hal White and R. Carter Hill|
|Volume 30:||30th Anniversary Edition – Edited by Dek Terrell and Daniel Millimet|
|Volume 31:||Structural Econometric Models – Edited by Eugene Choo and Matthew Shum|
|Volume 32:||VAR Models in Macroeconomics — New Developments and Applications: Essays in Honor of Christopher A. Sims – Edited by Thomas B. Fomby, Lutz Kilian and Anthony Murphy|
|Volume 33:||Essays in Honor of Peter C. B. Phillips – Edited by Thomas B. Fomby, Yoosoon Chang and Joon Y. Park|
|Volume 34:||Bayesian Model Comparison – Edited by Ivan Jeliazkov and Dale J. Poirier|
|Volume 35:||Dynamic Factor Models – Edited by Eric Hillebrand and Siem Jan Koopman|
|Volume 36:||Essays in Honor of Aman Ullah – Edited by Gloria González-Rivera, R. Carter Hill and Tae-Hwy Lee|
|Volume 37:||Spatial Econometrics: Qualitative and Limited Dependent Variables – Edited by Badi H. Baltagi, James P. LeSage and R. Kelley Pace|
ADVANCES IN ECONOMETRICS VOLUME 38
REGRESSION DISCONTINUITY DESIGNS: THEORY AND APPLICATIONS
MATIAS D. CATTANEO
Department of Economics and Department of Statistics, University of Michigan, Ann Arbor, MI, USA
JUAN CARLOS ESCANCIANO
Department of Economics, Indiana University, Bloomington, IN, USA
United Kingdom – North America – Japan – India – Malaysia – China
Emerald Publishing Limited
Howard House, Wagon Lane, Bingley BD16 1WA, UK
First edition 2017
No part of this book may be reproduced, stored in a retrieval system, transmitted in any form or by any means electronic, mechanical, photocopying, recording or otherwise without either the prior written permission of the publisher or a licence permitting restricted copying issued in the UK by The Copyright Licensing Agency and in the USA by The Copyright Clearance Center. Any opinions expressed in the chapters are those of the authors. Whilst Emerald makes every effort to ensure the quality and accuracy of its content, Emerald makes no representation implied or otherwise, as to the chapters’ suitability and application and disclaims any warranties, express or implied, to their use.
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-1-78714-390-6 (Print)
ISBN: 978-1-78714-389-0 (Online)
ISBN: 978-1-78714-729-4 (Epub)
ISSN: 0731-9053 (Series)
List of Contributors
|Otávio Bartalotti||Iowa State University, Ames, IA, USA|
|Quentin Brummet||United States Census Bureau, Washington, DC, USA|
|Gray Calhoun||Iowa State University, Ames, IA, USA|
|David Card||UC Berkeley, CA, USA|
|Giovanni Cerulli||IRCrES-CNR, Rome, Italy|
|Hanley Chiang||Mathematica Policy Research, Cambridge, MA, USA|
|Thomas D. Cook||Northwestern University, Evanston, IL, USA|
|Yingying Dong||UC Irvine, CA, USA|
|Brigham R. Frandsen||Brigham Young University, Provo, UT, USA|
|Sebastian Galiani||University of Maryland, College Park, MD, USA|
|Yang He||Iowa State University, Ames, IA, USA|
|Heinrich Hock||Mathematica Policy Research, Washington, DC, USA|
|Hugo Jales||Syracuse University, Syracuse, NY, USA|
|Luke Keele||Georgetown University, Washington, DC, USA|
|Yasemin Kisbu-Sakarya||Koc University, Istanbul, Turkey|
|Arthur Lewbel||Boston College, Chestnut Hill, MA, USA|
|David S. Lee||Princeton University, Princeton, NJ, USA|
|Scott Lorch||University of Pennsylvania, Philadelphia, PA, USA|
|Justin McCrary||UC Berkeley, CA, USA|
|Patrick J. McEwan||Wellesley College, Wellesley, MA, USA|
|Molly Passarella||University of Pennsylvania, Philadelphia, PA, USA|
|Zhuan Pei||Cornell University, Ithaca, NY, USA|
|Alexander Poulsen||Boston College, Boston, MA, USA|
|Jasjeet S. Sekhon||UC Berkeley, CA, USA|
|Yi Shen||University of Waterloo, Waterloo, ON, Canada|
|Dylan Small||University of Pennsylvania, Philadelphia, PA, USA|
|Yang Tang||Northwestern University, Evanston, IL, USA|
|Rocío Titiunik||University of Michigan, Ann Arbor, MI, USA|
|Brian Quistorff||Microsoft AI and Research, Redmond, WA, USA|
|Andrea Weber||Vienna University of Economics and Business, Austria|
|Zhengfei Yu||University of Tsukuba, Tsukuba, Japan|
The regression discontinuity (RD) design was introduced by Thistlethwaite and Campbell (1960) more than 50 years ago, but has gained immense popularity in the last decade. Nowadays, the design is well known and widely used in a variety of disciplines, including (but not limited to) most fields of study in the social, biomedical, behavioral, and statistical sciences. Many economists and other social scientists have devoted great efforts to advance the methodological knowledge and empirical practice concerning RD designs. Early reviews and historical perspectives are given by Cook (2008), Imbens and Lemieux (2008), and Lee and Lemieux (2010), but much progress has taken place since then. This volume of Advances in Econometrics seeks to contribute to this rapidly expanding RD literature by bringing together theoretical and applied econometricians, statisticians, and social, behavioral, and biomedical scientists, in the hope that these interactions will further spark innovative practical developments in this important and active research area.
This volume collects 12 innovative and thought-provoking contributions to the RD literature, covering a wide range of methodological and practical topics. Many of these chapters touch on foundational methodological issues such as identification and interpretation, implementation, falsification testing, or estimation and inference, while others focus on more recent and related topics such as identification and interpretation in a discontinuity-in-density framework, empirical structural estimation, comparative RD methods, and extrapolation. Considered together, these chapters will help shape methodological and empirical research currently employing RD designs, in addition to providing new bases and frameworks for future work in this area.
The following sections provide a more detailed discussion of the 12 contributions forming this volume of Advances in Econometrics. To this end, we first give a very brief overview of the state-of-the-art in the analysis and interpretation of RD designs by offering a succinct account of the RD literature. Although our overview covers a large number of classical and recent papers, it is surely incomplete, as this literature continues to grow and expand rapidly. Our goal here is not to provide a comprehensive review of the literature, but rather to set the ground for describing how each of the contributions in this volume fits in the broader RD literature.
Overview of the Literature
The RD design is arguably one of the most credible and internally valid non-experimental research designs in observational studies and program evaluation. The key distinctive features underlying all RD designs are that, for each unit $i$ under study, (i) treatment is assigned based on an observed variable $X_i$, usually called the running variable, score, or index, and (ii) the conditional probability of treatment status, which equals the probability of treatment assignment under perfect compliance, changes abruptly or discontinuously at a known cutoff value $c$ on the support of the running variable. Therefore, in RD designs, treatment assignment occurs via hard-thresholding: each unit is assigned to the control group if $X_i < c$, and to the treatment group if $X_i \geq c$. The most standard RD setting also assumes that the running variable is continuously distributed in a neighborhood of the cutoff value, with a positive density. In this canonical RD framework, the two basic parameters of interest are the average treatment effect at the cutoff (interpreted as an intention-to-treat parameter under non-compliance), and the probability limit of a two-stage treatment effect estimator at the cutoff when compliance is imperfect (interpreted as a local average treatment effect at the cutoff under additional assumptions). Most popular estimation and inference methods in applied work rely on local polynomial regression techniques based on large sample approximations.
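For concreteness, the two canonical parameters can be written in potential outcomes notation. The symbols below are assumptions introduced for illustration and are not taken from the original text: $X_i$ is the running variable, $c$ the cutoff, $Y_i(0)$ and $Y_i(1)$ the potential outcomes without and with treatment, $Y_i$ the observed outcome, and $D_i$ the treatment actually received. The second equality in the sharp case holds under continuity of the conditional expectations of the potential outcomes at $c$.

```latex
% Sharp RD: average treatment effect at the cutoff
\[
\tau_{\mathrm{SRD}}
  = \mathbb{E}\left[ Y_i(1) - Y_i(0) \mid X_i = c \right]
  = \lim_{x \downarrow c} \mathbb{E}[Y_i \mid X_i = x]
  - \lim_{x \uparrow c} \mathbb{E}[Y_i \mid X_i = x]
\]

% Fuzzy RD: jump in the outcome divided by the jump in treatment take-up
\[
\tau_{\mathrm{FRD}}
  = \frac{\lim_{x \downarrow c} \mathbb{E}[Y_i \mid X_i = x]
        - \lim_{x \uparrow c} \mathbb{E}[Y_i \mid X_i = x]}
         {\lim_{x \downarrow c} \mathbb{E}[D_i \mid X_i = x]
        - \lim_{x \uparrow c} \mathbb{E}[D_i \mid X_i = x]}
\]
```

Under perfect compliance the denominator of the fuzzy ratio equals one, so the two expressions coincide.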
Many departures from the canonical RD design have been proposed in the literature, spanning a wide range of possibilities. For example, researchers have considered different RD designs (e.g., multi-cutoff RD or geographic RD), different population parameters (e.g., kink RD or distributional RD), different estimators and inference procedures (e.g., randomization inference or empirical likelihood), and even different departures from the underlying canonical assumptions (e.g., measurement error or discretely valued running variable). Furthermore, many new methodologies have been developed in recent years covering related problems such as graphical presentation techniques, falsification/validation methods, and treatment effect extrapolation approaches.
Our succinct overview of the classical and recent literature on RD designs is organized in four main categories: (i) Identification, Interpretation, and Extrapolation; (ii) Presentation, Falsification, and Robustness Checks; (iii) Estimation and Inference; and (iv) Software. We then summarize and discuss the new contributions in this volume of Advances in Econometrics by placing them in context relative to these four categories and the associated references.
Finally, a large list of references to empirical applications employing RD designs may be found in Lee and Lemieux (2010), Cattaneo, Keele, Titiunik, and Vazquez-Bare (2016, supplemental appendix), and references therein.
Identification, Interpretation, and Extrapolation
Hahn, Todd, and van der Klaauw (2001) were the first to formally discuss identification of average treatment effects at the cutoff in the so-called Sharp and Fuzzy RD designs, that is, in RD settings with perfect and imperfect treatment compliance, respectively. They employed the potential outcomes framework to analyze the RD design, and gave conditions based on continuity of conditional expectation functions at the cutoff, guaranteeing large sample identification of the treatment effect parameters of interest. Lee (2008) also studied identification in sharp RD designs, focusing on the interpretation of the estimand in a context where imperfect manipulation of the running variable prevents units from precisely sorting around the cutoff determining treatment assignment. In his imperfect manipulation setting, Lee established continuity of conditional expectations and distribution functions, and offered a heuristic interpretation of RD designs as local randomized experiments. Together, these two cornerstone contributions provided general frameworks for analyzing and interpreting RD designs, which led to widespread methodological innovation in the RD literature.
Building on the above potential outcomes frameworks, and therefore focusing on large sample identification of average treatment effects at the cutoff via continuity assumptions on conditional expectations, more recent work has studied identification and interpretation of treatment effects in other RD designs. For example, Papay, Willett, and Murnane (2011) focus on RD designs with two or more running variables, Keele and Titiunik (2015) analyze geographic RD designs, Card, Lee, Pei, and Weber (2015) study regression kink designs (giving a causal interpretation to kink RD designs), Chiang and Sasaki (2016) focus on quantile kink RD designs, Cattaneo et al. (2016) investigate RD designs with multiple cutoffs, Choi and Lee (2016) consider interactions and partial effects in RD settings with two running variables, and Caetano and Escanciano (2015) exploit the presence of additional covariates to identify RD marginal effects. See also Calonico, Cattaneo, Farrell, and Titiunik (2016) for a discussion of the potential benefits and pitfalls of employing additional pre-intervention covariates in the RD design. Many other empirical problems are at present being placed in the context of, or formally connected to, different variants of the RD design.
Cattaneo, Frandsen, and Titiunik (2015) present an alternative causal framework to analyze RD designs, introducing and formalizing the notion of local randomization. This framework is conceptually and methodologically distinct from the more standard continuity-based framework employed by the papers discussed previously. In their local randomization framework, the goal is to formalize the idea of a local randomized experiment near the cutoff by embedding the RD design in a classical, Fisherian causal model, thereby giving interpretation and justification to randomization inference and related classical experimental methods. This alternative approach was later extended by Cattaneo, Titiunik, and Vazquez-Bare (2017a), where methodological and empirical comparisons between the two causal inference frameworks (continuity and local randomization) are also given.
Finally, a very recent strand of the RD literature has focused on the important question of extrapolation. It is by now well recognized that an important limitation of modern identification approaches at or near the cutoff is that the resulting estimates and inference results are not easily transferable to other populations beyond those having running variables near the cutoff. There are now a few recent papers trying to address this issue: Angrist and Rokkanen (2015) employ a local conditional independence assumption to discuss extrapolation via variation in observable characteristics, Dong and Lewbel (2015) look at local extrapolation via marginal treatment effects and an exclusion restriction in a continuity-based RD framework, Cattaneo et al. (2016) exploit variation in multiple cutoffs to extrapolate RD treatment effects also using an exclusion restriction in a continuity-based RD framework, Bertanha and Imbens (2016) exploit variation induced by imperfect compliance in fuzzy RD designs, Cattaneo, Keele, Titiunik, and Vazquez-Bare (2017) exploit variation in multiple cutoffs but allowing for possible selection into cutoffs, and Rokkanen (2016) employs a factor model for extrapolation of RD treatment effects.
Presentation, Falsification, and Robustness Checks
One of the main virtues of the RD design is that it can be easily and intuitively presented and falsified in empirical work. Automatic, optimal graphical presentation via RD Plots is discussed and formally studied in Calonico, Cattaneo, and Titiunik (2015a). These recent methods offer graphical tools for summarizing the RD design as well as for informally testing its plausibility, which can also be done formally using some of the estimation and inference methods discussed further below.
McCrary (2008) proposed a very interesting and creative falsification method specifically tailored to RD designs. This falsification test looks at whether there is a discontinuity in the density of the running variable near the cutoff, the presence of which is interpreted as evidence of “manipulation” or “sorting” of units around the cutoff. This test is implemented empirically by comparing the estimated densities of the running variable for control and treatment units separately. McCrary’s original implementation used smoothed-out histogram estimators via local polynomial techniques. More recently, Otsu, Xu, and Matsushita (2014) proposed a density test based on empirical likelihood methods, and Cattaneo, Jansson, and Ma (2016a) developed a density test based on a novel local polynomial density estimator that avoids preliminary tuning parameter choices.
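The logic of a density-based falsification check can be illustrated with a much simpler device than the estimators cited above: an exact binomial test on the counts of observations just below and just above the cutoff. The sketch below is not McCrary's procedure or any of the local polynomial density tests. The function name `binomial_window_test`, the window half-width `w`, and the example data are hypothetical, and the test assumes that, absent sorting, an observation inside a small symmetric window is equally likely to fall on either side of the cutoff.

```python
from math import comb

def binomial_window_test(x, c, w):
    """Exact two-sided binomial test (p = 0.5) on counts within [c - w, c + w].

    Returns (n_left, n_right, p_value). Illustrative sketch only: the window
    half-width w is chosen by the user, not by a data-driven rule.
    """
    n_left = sum(1 for v in x if c - w <= v < c)    # units just below the cutoff
    n_right = sum(1 for v in x if c <= v <= c + w)  # units at or just above it
    n = n_left + n_right
    if n == 0:
        raise ValueError("no observations inside the window")
    k = min(n_left, n_right)
    # Probability of a count as small as k under Binomial(n, 0.5),
    # doubled for a two-sided test and capped at 1.
    tail = sum(comb(n, j) for j in range(k + 1)) / 2 ** n
    return n_left, n_right, min(1.0, 2 * tail)

# Hypothetical running variable with every unit piled up just above c = 0.
scores = [0.01 * i for i in range(1, 11)]
n_left, n_right, p_value = binomial_window_test(scores, c=0.0, w=0.2)
```

A small p-value signals an imbalance in counts around the cutoff, which would be read as evidence of sorting. The result can be sensitive to the choice of `w`.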
Another more standard, but also quite common, falsification approach in RD designs looks at whether there is a null RD treatment effect on pre-intervention covariates or placebo outcomes. The presence of a non-zero RD treatment effect on such variables would provide evidence against the design. This idea follows standard practices in the analysis of experiments, and was first formalized in a continuity-based framework by Lee (2008). Any estimation and inference method for RD designs can be used to implement this falsification approach, employing the pre-intervention covariate or placebo outcome as the outcome variable. For example, the robust bias-corrected local polynomial methods of Calonico, Cattaneo, and Titiunik (2014b) and local randomization methods of Cattaneo et al. (2015) are readily applicable, as well as other methods, all briefly discussed below. As a complement to these estimation and inference methods, Canay and Kamat (2016) recently developed a permutation testing approach for equality of control and treatment distributions, and Ganong and Jäger (2016) also recently developed a different permutation-based approach for kink RD designs.
Estimation and Inference
Local polynomial methods are by now widely accepted as the default technique for the analysis of RD designs. Global polynomial regressions are useful for presentation and graphical analysis (Calonico, Cattaneo, & Titiunik, 2015a), but not recommended for actual estimation and inference of RD treatment effects (Gelman & Imbens, 2014). See also Wing and Cook (2013) for a related discussion of parametric methods in RD designs.
For point estimation purposes, conventional local polynomial methods were originally suggested by Hahn et al. (2001), and later Porter (2003) provided an in-depth large sample analysis in the specific RD context. Building on this work, Imbens and Kalyanaraman (2012) developed mean-squared-error (MSE) optimal bandwidth selectors for local-linear RD estimators in sharp and fuzzy designs. Employing this MSE-optimal bandwidth selector when implementing the corresponding local polynomial estimator gives an MSE-optimal RD treatment effect estimator, which is commonly used in modern empirical work.
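As a rough illustration of the local-linear point estimator in a sharp design, the sketch below fits a kernel-weighted linear regression on each side of the cutoff and takes the difference in intercepts. Several choices are assumptions made for illustration: the triangular kernel, the function names `side_intercept` and `local_linear_rd`, the simulated data, and above all the bandwidth `h`, which is supplied by hand here instead of being computed by an MSE-optimal selector.

```python
import numpy as np

def side_intercept(y, x, c, h):
    """Kernel-weighted least-squares intercept at the cutoff for one side."""
    w = np.clip(1.0 - np.abs((x - c) / h), 0.0, None)  # triangular kernel weights
    keep = w > 0                                        # units within bandwidth h
    design = np.column_stack([np.ones(keep.sum()), x[keep] - c])
    sw = np.sqrt(w[keep])
    # Weighted least squares via ordinary least squares on rescaled data.
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y[keep] * sw, rcond=None)
    return beta[0]                                      # fitted value at x = c

def local_linear_rd(y, x, c, h):
    """Sharp RD estimate: right-side intercept minus left-side intercept."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    right = x >= c
    return (side_intercept(y[right], x[right], c, h)
            - side_intercept(y[~right], x[~right], c, h))

# Simulated example (hypothetical data): true jump of 2.0 at c = 0.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 2000)
y = 1.0 + 0.5 * x + 2.0 * (x >= 0) + rng.normal(0.0, 0.1, 2000)
tau_hat = local_linear_rd(y, x, c=0.0, h=0.3)
```

This sketch returns a point estimate only. It does not compute standard errors or confidence intervals.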
For inference purposes, Calonico, Cattaneo, and Titiunik (2014b, CCT hereafter) pointed out that the MSE-optimal local polynomial point estimator cannot be used for constructing confidence intervals in RD designs – or for conducting statistical inference more generally – because of the presence of a first-order misspecification bias. CCT developed new robust bias-corrected inference methods, based on both removing the first-order misspecification bias present in the MSE-optimal RD estimator and adjusting the standard errors accordingly to account for the additional variability introduced by the bias correction. This new method of nonparametric inference for RD designs works very well in simulations, and was also shown to deliver uniformly valid inference (Kamat, 2017) as well as higher-order refinements (Calonico, Cattaneo, & Farrell, 2017a, 2017b). In addition, Calonico et al. (2017a) develop new bandwidth selection procedures specifically tailored to constructing confidence intervals with small coverage errors in RD designs. See Cattaneo and Vazquez-Bare (2016) for an accessible discussion on bandwidth selection and related neighborhood selection methods.
More recently, Calonico et al. (2016) studied identification, estimation and inference of average RD treatment effects when additional pre-intervention covariates are also included in the local polynomial estimation. This paper develops new optimal bandwidth selectors and robust bias-corrected inference methods that are valid under both heteroskedasticity and clustering in the data.
As an alternative to local polynomial methods, researchers also employ flexible methods near the cutoff. This approach is usually justified by assuming some form of local randomization or similar assumption for some neighborhood near the cutoff. Building on this intuitive and commonly employed approach, Cattaneo et al. (2015) and Cattaneo et al. (2017a) present a formal local randomization framework for RD designs employing ideas and methods from the classical analysis of experiments literature. For estimation and inference, Neyman’s and Fisher’s methods are introduced and developed for RD designs, though Fisherian inference (also known as randomization inference) is recommended due to the likely small sample sizes encountered in the neighborhoods near the cutoff where the local randomization assumption is most plausible. Keele, Titiunik, and Zubizarreta (2015) apply these ideas to geographic RD designs, combining them with a “matching” algorithm to incorporate pre-intervention covariates.
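The Fisherian idea can be sketched as a permutation test of the sharp null hypothesis of no treatment effect for any unit inside a small window around the cutoff. This is a minimal sketch under stated assumptions, not the procedure of any cited paper: the window half-width `w` is fixed by hand, the test statistic is the difference in means, the assignment mechanism is taken to be complete randomization with the number of treated units held fixed, and the function name `randomization_pvalue` and the example data are hypothetical.

```python
import numpy as np

def randomization_pvalue(y, x, c, w, n_perm=2000, seed=0):
    """Monte Carlo randomization p-value for the sharp null inside [c - w, c + w].

    Returns (observed difference in means, two-sided p-value).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    inside = np.abs(x - c) <= w            # keep only units near the cutoff
    yw = y[inside]
    d = x[inside] >= c                     # treatment indicator within the window
    if d.all() or not d.any():
        raise ValueError("window must contain both treated and control units")
    obs = yw[d].mean() - yw[~d].mean()
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        dp = rng.permutation(d)            # reshuffle labels, treated count fixed
        stat = yw[dp].mean() - yw[~dp].mean()
        if abs(stat) >= abs(obs) - 1e-12:  # as extreme as the observed statistic
            count += 1
    return obs, (count + 1) / (n_perm + 1)

# Hypothetical data: 40 units near c = 0 with a constant effect of 3.
x = np.linspace(-0.1, 0.1, 40)
y = 3.0 * (x >= 0)
obs, p_value = randomization_pvalue(y, x, c=0.0, w=0.1)
```

Because the reference distribution is built from the units in the window themselves, the test does not rely on large sample approximations, which is the motivation given above for preferring it when few observations lie near the cutoff.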
The methods above focus on estimation and inference of average treatment effects at or near the cutoff, under either a continuity-based or randomization-based framework. There are, of course, other methods (and parameters) of potential interest in the RD literature. For example, Otsu, Xu, and Matsushita (2015) discuss empirical likelihood methods for average treatment effects at the cutoff, Shen and Zhang (2016) discuss local polynomial methods for distributional treatment effects at the cutoff, Xu (2016) considers local polynomial methods for limited dependent outcome variable models near the cutoff, Bertanha (2017) considers estimation and inference of different average treatment effects in a multi-cutoff RD design, and Armstrong and Kolesar (2016a, 2016b) discuss nonparametric confidence interval estimation for the sharp average treatment effect at the cutoff. All these contributions employ a continuity-based framework at the cutoff, and therefore employ large sample approximations. In addition to the local randomization framework discussed above, another finite sample framework for the analysis of RD designs was recently introduced by Chib and Jacobi (2016), who employ Bayesian methods in the context of fuzzy RD designs.
Last but not least, some recent research has focused on different departures from the canonical assumptions employed for methodological and practical research. For example, Lee and Card (2008) study RD designs where the running variable is discrete and the researcher employs linear regression extrapolation to the cutoff, Dong (2015) focuses on RD settings where the underlying running variable is continuous but the researcher only observes a discretized version, Lee (2017) studies the issue of classical measurement error in the running variable, Feir, Lemieux, and Marmer (2016) explore the consequences of having weak instruments in the context of fuzzy RD designs, and Dong (2017) studies the implications of non-random sample selection near the cutoff.
Software
Many of the methodological and practical contributions mentioned above are readily available in general purpose software in R and Stata, while other contributions previously discussed and many of the contributions included in this volume can also be implemented using already available software. Calonico, Cattaneo, and Titiunik (2014a, 2015b) and Calonico, Cattaneo, Farrell, and Titiunik (2017) give a comprehensive introduction to software implementing RD methods based on partitioning and local polynomial techniques, covering RD Plots, bandwidth selection, estimation and inference, and many other possibilities. Cattaneo, Jansson, and Ma (2016b) discuss software implementing discontinuity-in-density tests. Cattaneo, Titiunik, and Vazquez-Bare (2016) give a comprehensive introduction to software implementing RD methods based on a local randomization assumption, building on the classical analysis of experiments literature as well as more recent related developments. Finally, Cattaneo, Titiunik, and Vazquez-Bare (2017b) discuss power calculation and survey sample selection for RD designs based on local polynomial estimation and inference methods.
This R and Stata software is available at https://sites.google.com/site/rdpackages.
Contributions in this Volume
This volume of Advances in Econometrics includes 12 outstanding chapters on methodology and applications using RD designs. We now offer a brief overview of each of these contributions, and discuss how they fit into the RD literature presented previously.
Identification, Interpretation, and Extrapolation
The first six contributions in this volume are related to fundamental issues of identification, interpretation and extrapolation in RD designs. The first chapter, by Sekhon and Titiunik, discusses the connections and discrepancies between the continuity-based and randomization-based RD frameworks – the two main paradigms for the analysis and interpretation of RD designs. The authors introduce the different concepts in a familiar setting where potential outcomes are random (as opposed to being fixed as in the classical analysis of experiments literature), and then discuss at length the issues and features of each of the two most popular conceptual frameworks in RD designs. This chapter not only clarifies the underlying conditions that are often implicitly imposed in each of these frameworks, but also gives the reader a unique opportunity to appreciate some of the underlying key differences between them.
The second chapter, by Jales and Yu, is truly thought-provoking. The authors introduce and discuss ideas of identification and interpretation in settings where a continuous (running) variable exhibits a discontinuity in its probability density function. They not only review several recent empirical papers where such a situation arises naturally, but also discuss in great detail how this reduced form feature can be used to identify useful parameters in several seemingly unrelated economic models. This chapter introduces the reader to these ideas and, perhaps more importantly, offers a general framework for analysis of economic situations where discontinuities in density functions are present. This contribution will surely spark further methodological research, on identification as well as on estimation and inference.
The third chapter, by Lee and McCrary, provides another intellectually stimulating instance where identification and interpretation in RD designs can be naturally enhanced by employing economic theory. This outstanding chapter not only was (when originally written) one of the first to report a credible zero causal treatment effect of incarceration on recidivism, but also provides two remarkable and highly innovative methodological contributions. First, it illustrates how modern methodology in RD designs can be successfully adapted to incorporate the specific features of the empirical problem at hand (i.e., sample selection and non-random censoring). Second, it shows how an economic model can be used, together with reduced form estimates from the RD design, to estimate interesting and useful structural parameters, thereby offering a tight connection between reduced form and structural methods in RD contexts.
The fourth and fifth chapters in this volume are closely related to each other, both focusing on different aspects of geographic RD designs. The chapter by Keele, Lorch, Passarella, Small, and Titiunik offers an overview of research designs based on a geographically discontinuous treatment assignment leading to adjacent treated and untreated areas. The authors discuss how the availability of geo-referenced data affects the ability of researchers to employ this type of design in a pure (two-dimensional) RD framework. When researchers have access to the exact geographic location of each individual observation, the geographically discontinuous treatment assignment can be analyzed in a standard RD setup. In contrast, when information about geographic location is only available for aggregate units, these designs are better analyzed as RD designs with discrete running variables, if the aggregate units are sufficiently small, or otherwise as geographic “quasi-experiments,” possibly after controlling for observable characteristics. The discussion and underlying issues are illustrated with an empirical application, which shows some of the acute internal validity challenges that are typical in research designs based on geographically discontinuous treatments (e.g., treated and control units continue to have systematic differences even after adjusting for observables or considering only geographically close units).
The chapter by Galiani, McEwan, and Quistorff illustrates similar internal validity challenges in geographic quasi-experiments, and also discusses challenges related to their external validity. To study both types of threats, the authors use data from an experimental study in development economics as a benchmark. Their empirical study focuses on various geographic designs that compare treated units close to municipal borders to both experimental and non-experimental untreated groups. This analysis shows that the geographic quasi-experiment is unable to recover the experimental benchmark. This is related to both internal and external validity threats. First, there is empirical evidence of location-based sorting on observed (and possibly unobserved) variables, as treated and control units appear systematically different in at least one important covariate – this raises concerns about internal validity. Second, the exclusion of units far from the border in the geographic quasi-experiment is shown to lead to a covariate distribution that differs from the covariate distribution in the experimental sample. Because some of these covariates are potential moderators of the treatment effect, this raises concerns about the external validity of the geographic quasi-experiment effect. In sum, the discussion and results in Keele et al. and Galiani et al. suggest that research designs based on geographically discontinuous treatments offer exciting opportunities to evaluate policies and programs, but they are also vulnerable to considerable internal and external validity challenges, setting the ground for much needed future research in this area.
The sixth contribution, by Tang, Cook, Kisbu-Sakarya, Hock, and Chiang, focuses on the Comparative RD design, a recently introduced methodology that incorporates a placebo outcome variable to improve extrapolation of average treatment effects in sharp RD designs, in addition to aiding parametric estimation and inference. The authors present an insightful review of this novel methodology, and also illustrate its main practical features by employing an empirical application with an underlying randomized controlled trial component. This new methodology employs global parametric methods coupled with an outcome variable unaffected by treatment but observed over the full support of the running variable, to improve efficiency in parametric estimation and extrapolation in RD designs. In their empirical application, the Comparative RD design methodology performs well when compared to the results from the randomized controlled trial component.
Presentation, Falsification, and Robustness Checks
Two chapters in this volume are related to falsification and robustness checks in RD designs. The chapter by Frandsen investigates how the idea underlying McCrary’s widely used density discontinuity test for manipulation can be adapted and employed in settings where the running variable is discrete. This is a very important question, as many RD designs employ discrete running variables. The author develops a new manipulation test that employs finite sample distributional methods and is justified via large sample approximations and bounds on the underlying smoothness of unknown functions. This novel manipulation test complements existing tests, most of which are only valid when the running variable is continuously distributed, as well as the simple binomial tests also widely used in practice.
A second contribution in this volume to robustness checks in RD designs (and, by implication, extrapolation) is the chapter by Cerulli, Dong, Lewbel, and Poulsen. The authors introduce and discuss a new test for local stability of RD treatment effects. In particular, this chapter proposes to test for zero slope change in the average treatment effect at the cutoff, which is effectively equivalent to testing for a null kink RD treatment effect. The authors then argue that, whenever there is no change in the treatment effect of interest relative to the running variable near the cutoff, the usual sharp treatment effect is more stable and hence provides a more global result for units near the cutoff. A key feature of this idea is that it can be implemented quite easily using available modern methods for RD estimation and inference, which will surely contribute to the popularity of this test in empirical applications.
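The idea of testing for a zero slope change at the cutoff can be sketched in a few lines. The code below is an illustration under strong simplifying assumptions (sharp design, uniform kernel, a user-supplied bandwidth `h`, homoskedastic errors) and is not the authors' procedure. In practice one would use kernel-weighted local polynomial estimators with robust inference. All function and variable names are hypothetical.

```python
from math import sqrt


def _line_fit(xs, ys):
    """OLS fit of y on (1, x); returns intercept, slope, slope variance."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 observations on each side")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    # Homoskedastic variance of the slope estimate (a simplifying assumption).
    return intercept, slope, rss / (n - 2) / sxx


def slope_change_test(x, y, cutoff, h):
    """Estimate the jump and the slope change at the cutoff within bandwidth h.

    Returns (jump, kink, se_kink, z), where kink is the right-side slope
    minus the left-side slope and z = kink / se_kink.
    """
    # Center the running variable so intercepts are values at the cutoff.
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff - h <= xi < cutoff]
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff <= xi <= cutoff + h]
    a0, b0, v0 = _line_fit([p[0] for p in left], [p[1] for p in left])
    a1, b1, v1 = _line_fit([p[0] for p in right], [p[1] for p in right])
    jump = a1 - a0   # sharp RD treatment effect at the cutoff
    kink = b1 - b0   # slope change, i.e. derivative of the effect at the cutoff
    se = sqrt(v0 + v1)
    z = kink / se if se > 0 else float("inf")
    return jump, kink, se, z
```

A `z` statistic close to zero is consistent with a locally stable treatment effect, while a large absolute value suggests the effect changes with the running variable near the cutoff.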
Estimation and Inference
The last four chapters in this volume focus attention on different aspects of estimation and inference in RD designs. In all cases, these chapters take a continuity-based approach, employ local polynomial methods, and either assess the empirical properties of recently proposed methods in the literature or develop new methods in practically relevant settings.
First, the chapter by Card, Lee, Pei, and Weber offers an insightful and thorough empirical study of the finite sample properties of the robust bias-corrected inference methods proposed by Calonico, Cattaneo, and Titiunik (2014b, CCT) in the context of regression kink designs (and, more generically, kink RD designs). Their paper offers several valuable lessons for practitioners hoping to employ the most recent methodological innovations in the RD design literature. In particular, the authors bring attention to issues related to (i) choice of polynomial order, (ii) bandwidth selection methods, and (iii) potential lack of precision of robust methods. These findings are not only important for empirical work, but also set the ground for future research and further methodological improvements.
Second, the chapter by Bartalotti and Brummet studies bandwidth selection for point estimation and inference when robust bias-correction methods are used, in a setting where generic clustering among units is possibly present. Building on CCT’s recent work under random sampling, the authors develop a new MSE expansion for sharp RD designs under general clustering, and employ this approximation to obtain a new MSE-optimal bandwidth under clustered data. This bandwidth choice is different from the standard MSE-optimal choice obtained under random sampling, and can be used to construct an MSE-optimal RD local polynomial point estimator under general clustering. The authors also discuss the special case of clustering at the running variable level, which is common in empirical work and leads to important simplifications in the methodology. These new methods are highly relevant and very useful for empirical work employing RD designs.
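For orientation, the generic form of an MSE-optimal bandwidth under random sampling can be written as follows. This is a standard textbook-style sketch rather than the chapter's expansion: the symbols $B$ (leading bias constant), $V$ (leading variance constant), $p$ (polynomial order), $n$ (sample size), and $h$ (bandwidth) are notation introduced here for illustration, and $B \neq 0$ is assumed.

```latex
\mathrm{MSE}(h) \approx h^{2(p+1)} B^{2} + \frac{V}{n h},
\qquad
h_{\mathrm{MSE}} = \left( \frac{V}{2(p+1)\, B^{2}} \right)^{1/(2p+3)} n^{-1/(2p+3)}
```

The second expression follows from setting the derivative of the approximate MSE with respect to $h$ to zero. Under clustering, the variance term no longer takes this simple form, which is why the resulting bandwidth differs from the one obtained under random sampling.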
Third, the chapter by Bartalotti, Calhoun, and He introduces a bootstrap inference method based on robust bias-correction techniques. Building on CCT’s robust bias-correction approach, the authors develop a double wild bootstrap method where the first layer of bootstrap is used to approximate the misspecification bias and the second layer is used to compute valid variance and distributional approximations taking into account the bias-correction first step. The authors also show that the first bootstrap layer gives a bias estimate that is equivalent to the analytic bias-correction proposed by CCT, up to simulation error. These results are not only useful for empirical work (i.e., they provide an alternative way of implementing CCT robust bias-correction methods), but also open the door for future research connecting bootstrapping methods and bias-correction in other RD design settings (e.g., with clustered data or when including additional covariates).
Last but not least, the chapter by Pei and Shen studies RD settings where the running variable is measured with error, and provides alternative sufficient conditions guaranteeing identifiability of RD treatment effects when estimated using the mismeasured assignment variable, the treatment status, and the outcome variable. The authors study RD settings where the running variable is either discrete or continuous, thereby offering quite a complete analysis with wide applicability for empirical work. These results contribute to a recent literature on this topic, and more generally to the literature on departures from canonical assumptions in RD designs, briefly summarized above. The issue of mismeasured running variables is quite important in practice, and this chapter not only offers a clear introduction to this important problem, but also sets a framework for the analysis and interpretation of RD designs with measurement error. This chapter will surely motivate future work in this important research area.
References
Angrist, J., & Rokkanen, M. (2015). Wanna get away? Regression discontinuity estimation of exam school effects away from the cutoff. Journal of the American Statistical Association, 110(512), 1331–1344.
Armstrong, T. B., & Kolesar, M. (2016a). Optimal inference in a class of regression models. arXiv:1511.06028.
Armstrong, T. B., & Kolesar, M. (2016b). A simple adjustment for bandwidth snooping. arXiv:1412.0267.
Bertanha, M. (2017). Regression discontinuity design with many thresholds. Working Paper, University of Notre Dame.
Bertanha, M., & Imbens, G. W. (2016). External validity in fuzzy regression discontinuity designs. Working Paper No. 20773, National Bureau of Economic Research.
Caetano, C., & Escanciano, J. C. (2015). Identifying multiple marginal effects with a single binary instrument or by regression discontinuity. Working Paper, Indiana University.
Calonico, S., Cattaneo, M. D., & Farrell, M. H. (2017a). Coverage error optimal confidence intervals for regression discontinuity designs. Working Paper, University of Michigan.
Calonico, S., Cattaneo, M. D., & Farrell, M. H. (2017b). On the effect of bias estimation on coverage accuracy in nonparametric inference. Journal of the American Statistical Association, forthcoming.
Calonico, S., Cattaneo, M. D., Farrell, M. H., & Titiunik, R. (2016). Regression discontinuity designs using covariates. Working Paper, University of Michigan.
Calonico, S., Cattaneo, M. D., Farrell, M. H., & Titiunik, R. (2017). rdrobust: Software for regression discontinuity designs. Stata Journal, forthcoming.
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014a). Robust data-driven inference in the regression-discontinuity design. Stata Journal, 14(4), 909–946.
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014b). Robust nonparametric confidence intervals for regression-discontinuity designs. Econometrica, 82(6), 2295–2326.
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2015a). Optimal data-driven regression discontinuity plots. Journal of the American Statistical Association, 110(512), 1753–1769.
Calonico, S., Cattaneo, M. D., & Titiunik, R. (2015b). rdrobust: An R package for robust nonparametric inference in regression-discontinuity designs. R Journal, 7(1), 38–51.
Canay, I. A., & Kamat, V. (2016). Approximate permutation tests and induced order statistics in the regression discontinuity design. Working Paper, Northwestern University.
Card, D., Lee, D. S., Pei, Z., & Weber, A. (2015). Inference on causal effects in a generalized regression kink design. Econometrica, 83(6), 2453–2483.
Cattaneo, M. D., Frandsen, B., & Titiunik, R. (2015). Randomization inference in the regression discontinuity design: An application to party advantages in the U.S. Senate. Journal of Causal Inference, 3(1), 1–24.
Cattaneo, M. D., Jansson, M., & Ma, X. (2016a). Simple local regression distribution estimators with an application to manipulation testing. Working Paper, University of Michigan.
Cattaneo, M. D., Jansson, M., & Ma, X. (2016b). rddensity: Manipulation testing in Stata. Working Paper, University of Michigan.
Cattaneo, M. D., Keele, L., Titiunik, R., & Vazquez-Bare, G. (2016). Interpreting regression discontinuity designs with multiple cutoffs. Journal of Politics, 78(4), 1229–1248.
Cattaneo, M. D., Keele, L., Titiunik, R., & Vazquez-Bare, G. (2017). Extrapolating treatment effects in multi-cutoff regression discontinuity designs. Working Paper, University of Michigan.
Cattaneo, M. D., Titiunik, R., & Vazquez-Bare, G. (2016). Inference in regression discontinuity designs under local randomization. Stata Journal, 16(2), 331–367.
Cattaneo, M. D., Titiunik, R., & Vazquez-Bare, G. (2017a). Comparing inference approaches for RD designs: A reexamination of the effect of head start on child mortality. Journal of Policy Analysis and Management, forthcoming.
Cattaneo, M. D., Titiunik, R., & Vazquez-Bare, G. (2017b). Power calculations for regression discontinuity designs. Working Paper, University of Michigan.
Cattaneo, M. D., & Vazquez-Bare, G. (2016). The choice of neighborhood in regression discontinuity designs. Observational Studies, 2, 134–146.
Chiang, H. D., & Sasaki, Y. (2016). Causal inference by quantile regression kink designs. arXiv:1605.09773.
Chib, S., & Jacobi, L. (2016). Bayesian fuzzy regression discontinuity analysis and returns to compulsory schooling. Journal of Applied Econometrics, 31(6), 1026–1047.
Choi, J., & Lee, M. (2016). Regression discontinuity with multiple running variables allowing partial effects. Working Paper.
Cook, T. D. (2008). “Waiting for Life to Arrive”: A history of the regression-discontinuity design in psychology, statistics and economics. Journal of Econometrics, 142(2), 636–654.
Dong, Y. (2015). Regression discontinuity applications with rounding errors in the running variable. Journal of Applied Econometrics, 30(3), 422–446.
Dong, Y. (2017). Regression discontinuity designs with sample selection. Journal of Business and Economic Statistics, forthcoming.
Dong, Y., & Lewbel, A. (2015). Identifying the effect of changing the policy threshold in regression discontinuity models. Review of Economics and Statistics, 97(5), 1081–1092.
Feir, D., Lemieux, T., & Marmer, V. (2016). Weak identification in fuzzy regression discontinuity designs. Journal of Business & Economic Statistics, 34(2), 185–196.
Ganong, P., & Jäger, S. (2016). A permutation test for the regression kink design. Working Paper, Harvard University.
Gelman, A., & Imbens, G. W. (2014). Why high-order polynomials should not be used in regression discontinuity designs. NBER Working Paper No. 20405.
Hahn, J., Todd, P., & van der Klaauw, W. (2001). Identification and estimation of treatment effects with a regression-discontinuity design. Econometrica, 69(1), 201–209.
Imbens, G., & Lemieux, T. (2008). Regression discontinuity designs: A guide to practice. Journal of Econometrics, 142(2), 615–635.
Imbens, G. W., & Kalyanaraman, K. (2012). Optimal bandwidth choice for the regression discontinuity estimator. Review of Economic Studies, 79(3), 933–959.
Kamat, V. (2017). On nonparametric inference in the regression discontinuity design. Econometric Theory, forthcoming.
Keele, L. J., & Titiunik, R. (2015). Geographic boundaries as regression discontinuities. Political Analysis, 23(1), 127–155.
Keele, L. J., Titiunik, R., & Zubizarreta, J. (2015). Enhancing a geographic regression discontinuity design through matching to estimate the effect of ballot initiatives on voter turnout. Journal of the Royal Statistical Society: Series A, 178(1), 223–239.
Lee, D. S. (2008). Randomized experiments from non-random selection in U.S. house elections. Journal of Econometrics, 142(2), 675–697.
Lee, D. S., & Card, D. (2008). Regression discontinuity inference with specification error. Journal of Econometrics, 142(2), 655–674.
Lee, D. S., & Lemieux, T. (2010). Regression discontinuity designs in economics. Journal of Economic Literature, 48(2), 281–355.
Lee, M.-J. (2017). Regression discontinuity with errors in the running variable: Effect on truthful margin. Journal of Econometric Methods, 6(1–8), 281–355.
McCrary, J. (2008). Manipulation of the running variable in the regression discontinuity design: A density test. Journal of Econometrics, 142(2), 698–714.
Otsu, T., Xu, K.-L., & Matsushita, Y. (2014). Estimation and inference of discontinuity in density. Journal of Business and Economic Statistics, 31(4), 507–524.
Otsu, T., Xu, K.-L., & Matsushita, Y. (2015). Empirical likelihood for regression discontinuity design. Journal of Econometrics, 186(1), 94–112.
Papay, J. P., Willett, J. B., & Murnane, R. J. (2011). Extending the regression-discontinuity approach to multiple assignment variables. Journal of Econometrics, 161(2), 203–207.
Porter, J. (2003). Estimation in the regression discontinuity model. Working Paper, University of Wisconsin.
Rokkanen, M. (2016). Exam schools, ability, and the effects of affirmative action: Latent factor extrapolation in the regression discontinuity design. Working Paper, Columbia University.
Shen, S., & Zhang, X. (2016). Distributional regression discontinuity: Theory and applications. Review of Economics and Statistics, 98(4), 685–700.
Thistlethwaite, D. L., & Campbell, D. T. (1960). Regression-discontinuity analysis: An alternative to the ex-post facto experiment. Journal of Educational Psychology, 51(6), 309–317.
Wing, C., & Cook, T. D. (2013). Strengthening the regression discontinuity design using additional design elements: A within-study comparison. Journal of Policy Analysis and Management, 32(4), 853–877.
Xu, K.-L. (2016). Regression discontinuity with categorical outcomes. Working Paper, Indiana University.
We thank Sebastian Calonico, Max Farrell, Michael Jansson, Xinwei Ma, Kenichi Nagasawa, Rocío Titiunik, and Gonzalo Vazquez-Bare for many enlightening and insightful discussions over the years. Cattaneo gratefully acknowledges financial support from the National Science Foundation through grants SES-1357561 and SES-1459931.
As part of the preparation of this volume, a research conference was held at the University of Michigan on May 19–20, 2016. Details on this conference can be found online (https://sites.google.com/site/aie38rdd/conference). Cattaneo thanks MITRE at the University of Michigan’s Department of Economics and the National Science Foundation (through grants SES-1357561 and SES-1459931) for providing financial and logistical support, without which the Advances in Econometrics conference would not have been possible. In addition, we thank Alberto Abadie, Josh Angrist, Martha Bailey, Michael Jansson, Pat Kline, and Jeff Wooldridge for agreeing to serve on the conference scientific committee. We also thank Alberto Abadie, Tim Armstrong, Martha Bailey, Quentin Brummet, Federico Bugni, Gray Calhoun, Sebastian Calonico, Otavio Bartalotti, Max Farrell, Thomas Fomby, Margherita Fort, Brigham Frandsen, Sebastian Galiani, Andreas Hagemann, Nicolas Idrobo, Hugo Jales, Michael Jansson, Luke Keele, Lutz Kilian, Michal Kolesar, Arthur Lewbel, Xinwei Ma, Justin McCrary, Patrick McEwan, Walter Mebane, Jose Luis Montiel Olea, Kenichi Nagasawa, Zhuan Pei, Andres Santos, Rocío Titiunik, Gonzalo Vazquez-Bare, and Tim Vogelsang for attending and providing excellent feedback during the conference.
Last but not least, we are also very thankful to many of the scholars listed above, who generously agreed to serve as anonymous reviewers of the chapters published in this volume. Their invaluable and constructive comments and suggestions certainly improved the overall quality of this volume.
Matias D. Cattaneo
Juan Carlos Escanciano
- On Interpreting the Regression Discontinuity Design as a Local Experiment
- Identification and Estimation Using a Density Discontinuity Approach
- The Deterrence Effect of Prison: Dynamic Theory and Evidence
- An Overview of Geographically Discontinuous Treatment Assignments with an Application to Children’s Health Insurance
- External and Internal Validity of a Geographic Quasi-Experiment Embedded in a Cluster-Randomized Experiment
- The Comparative Regression Discontinuity (CRD) Design: An Overview and Demonstration of its Performance Relative to Basic RD and the Randomized Experiment
- Party Bias in Union Representation Elections: Testing for Manipulation in the Regression Discontinuity Design when the Running Variable is Discrete
- Testing Stability of Regression Discontinuity Models
- Regression Kink Design: Theory and Practice
- Regression Discontinuity Designs with Clustered Data
- Bootstrap Confidence Intervals for Sharp Regression Discontinuity Designs
- The Devil is in the Tails: Regression Discontinuity Design with Measurement Error in the Assignment Variable