Applied econometric analysis is often performed using data collected from large-scale surveys. These surveys use complex sampling plans to reduce costs and to increase estimation efficiency for subgroups of the population. Such sampling plans result in unequal inclusion probabilities across units in the population. The purpose of this paper is to derive the asymptotic properties of a design-based nonparametric regression estimator, the local constant estimator, under a combined inference framework. This work contributes to the literature in two ways. First, I derive the asymptotic properties for the multivariate mixed-data case, including the asymptotic normality of the estimator. Second, I use least squares cross-validation to select the bandwidths for both continuous and discrete variables. I run Monte Carlo simulations designed to assess the finite-sample performance of the design-based local constant estimator versus the traditional local constant estimator under three sampling methods: simple random sampling, exogenous stratification, and endogenous stratification. The simulation results show that the estimator is consistent and that efficiency gains can be achieved by weighting observations by the inverse of their inclusion probabilities when the sampling is endogenous.
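To make the weighting idea concrete, the following is a minimal sketch of a local constant (Nadaraya-Watson) estimator in which each observation is weighted by the inverse of its inclusion probability. It is an illustration under simplifying assumptions, not the chapter's implementation: it handles a single continuous regressor with a Gaussian kernel and a user-supplied bandwidth, whereas the chapter treats the multivariate mixed-data case with cross-validated bandwidths. The function name `local_constant` and the argument `incl_prob` are hypothetical names chosen for this example.

```python
import numpy as np


def local_constant(x_eval, x, y, h, incl_prob=None):
    """Local constant estimate of E[y | x] at the points in x_eval.

    Illustrative sketch only: one continuous regressor, Gaussian kernel,
    fixed bandwidth h. If incl_prob is given, each observation is weighted
    by 1 / incl_prob (design-based variant); otherwise all observations
    receive equal weight (traditional variant).
    """
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Survey weights: inverse inclusion probabilities, or ones if unweighted.
    if incl_prob is None:
        w = np.ones_like(x)
    else:
        w = 1.0 / np.asarray(incl_prob, dtype=float)

    # Gaussian kernel for every (evaluation point, observation) pair.
    # The normalizing constant cancels in the ratio, so it is omitted.
    u = (x_eval[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2)

    # Weighted kernel average of y at each evaluation point.
    return (k * w * y).sum(axis=1) / (k * w).sum(axis=1)
```

Calling `local_constant(x_eval, x, y, h)` without `incl_prob` gives the traditional estimator, so the two variants can be compared on the same sample by passing or omitting the inclusion probabilities.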
I am grateful for the input and guidance I received from Dr. Jeff Racine, Dr. Jerry Hurley, Dr. Phil DeCicca, Dr. Arthur Sweetman, and Dr. Michael Veall. Furthermore, I would like to thank participants at various seminars and conferences for their feedback.
Clair, L. (2019), "Nonparametric Kernel Regression Using Complex Survey Data", The Econometrics of Complex Survey Data (Advances in Econometrics, Vol. 39), Emerald Publishing Limited, Bingley, pp. 173-208. https://doi.org/10.1108/S0731-905320190000039011
Copyright © 2019 Emerald Publishing Limited