The Econometrics of Complex Survey Data: Volume 39

Theory and Applications

Table of contents (13 chapters)

Part I Sampling Design

Abstract

We examine sample characteristics and elicited survey measures in two studies: the Health and Retirement Study (HRS), where interviews are conducted either in person or by phone, and the Understanding America Study (UAS), where surveys are completed online and a replica of the HRS core questionnaire is administered. By considering variables in various domains, our investigation provides a comprehensive assessment of how Internet data collection compares with more traditional interview modes. We document clear demographic differences between the UAS and HRS samples in terms of age and education. Yet sample weights correct for these discrepancies and allow one to match population benchmarks satisfactorily on key socio-demographic variables. Comparison of a variety of survey outcomes with population targets shows a strikingly good fit for both the HRS and the UAS. Outcome distributions in the HRS are only marginally closer to population targets than those in the UAS. These patterns arise regardless of which variables are used to construct post-stratification weights in the UAS, confirming the robustness of these results. We find little evidence of mode effects when comparing the subjective measures of self-reported health and life satisfaction across interview modes. Specifically, we do not observe clear primacy or recency effects for either health or life satisfaction. We do, however, observe a significant social desirability effect on life satisfaction, driven by the presence of an interviewer. By and large, our results suggest that Internet surveys can match high-quality traditional surveys.
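
For readers unfamiliar with the weighting step the abstract leans on, here is a minimal sketch of cell-based post-stratification, assuming a toy age-by-education cross-classification; all shares and names are invented, and the UAS's actual weighting procedure is richer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population cell shares
# (rows: age 50-64 / 65+; cols: <HS, HS, college)
pop_share = np.array([[0.15, 0.25, 0.20],
                      [0.10, 0.20, 0.10]])

# Draw a sample whose composition is skewed relative to the population
n = 2000
cells = rng.choice(6, size=n, p=[0.25, 0.30, 0.15, 0.10, 0.15, 0.05])
samp_share = np.bincount(cells, minlength=6) / n

# Post-stratification weight: population share over sample share, per cell
w = pop_share.ravel()[cells] / samp_share[cells]

# Weighted cell shares now reproduce the population benchmarks
weighted_share = np.bincount(cells, weights=w, minlength=6) / w.sum()
print(np.allclose(weighted_share, pop_share.ravel()))  # True
```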

Abstract

For central banks that study the use of cash, merchants' acceptance of card payments is an important factor. Surveys to measure levels of card acceptance and the costs of payments can be complicated and expensive. In this paper, we exploit a novel data set from Hungary to assess the effect of stratified random sampling on estimates of payment card acceptance and usage. Using the Online Cashier Registry, a database linking the universe of merchant cash registers in Hungary, we create merchant- and transaction-level data sets. We compare county (geographic), industry, and store-size stratifications to simulate the usual stratification criteria for merchant surveys and examine the effect on estimates of card acceptance for different sample sizes. Further, we estimate logistic regression models of card acceptance and usage to assess how stratification biases estimates of their key determinants.
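
As a rough illustration of the simulation logic (not the paper's actual data or code), the sketch below treats a synthetic merchant universe as the truth and compares simple random sampling with size-stratified sampling for estimating card acceptance; all numbers are fabricated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic universe: store size class drives card acceptance
N = 100_000
size_class = rng.choice(3, size=N, p=[0.7, 0.25, 0.05])  # small/medium/large
accepts = rng.random(N) < np.array([0.35, 0.70, 0.95])[size_class]
true_rate = accepts.mean()
strata = [np.flatnonzero(size_class == s) for s in range(3)]

def srs_estimate(n):
    idx = rng.choice(N, size=n, replace=False)
    return accepts[idx].mean()

def stratified_estimate(n):
    # Proportional allocation across size strata, weighted back
    # by each stratum's population share
    est = 0.0
    for stratum in strata:
        share = stratum.size / N
        n_s = max(1, round(n * share))
        idx = rng.choice(stratum, size=n_s, replace=False)
        est += share * accepts[idx].mean()
    return est

reps = 500
srs = np.array([srs_estimate(500) for _ in range(reps)])
strat = np.array([stratified_estimate(500) for _ in range(reps)])
print(f"true {true_rate:.3f}  SRS sd {srs.std():.4f}  "
      f"stratified sd {strat.std():.4f}")
```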

Part II Variance Estimation

Abstract

When there are few treated clusters in a pure treatment or difference-in-differences setting, t tests based on a cluster-robust variance estimator can severely over-reject. Although procedures based on the wild cluster bootstrap often work well when the number of treated clusters is not too small, they can either over-reject or under-reject seriously when it is. In a previous paper, we showed that procedures based on randomization inference (RI) can work well in such cases. However, RI can be impractical when the number of possible randomizations is small. We propose a bootstrap-based alternative to RI, which mitigates the discrete nature of RI p values in the few-clusters case. We also compare it to two other procedures. None of them works perfectly when the number of clusters is very small, but they can work surprisingly well.
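
For orientation, the sketch below implements the standard restricted wild cluster bootstrap-t that this literature builds on, with Rademacher weights drawn at the cluster level; it is not the authors' proposed refinement, and the data-generating process is invented.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data: 20 clusters of 30 observations, 5 treated clusters,
# cluster random effects, and a true treatment effect of zero
G, n_g = 20, 30
g = np.repeat(np.arange(G), n_g)
d = (g < 5).astype(float)
y = rng.normal(size=G)[g] + rng.normal(size=G * n_g)
X = np.column_stack([np.ones_like(d), d])

def ols_t(y, X, g, col=1):
    """OLS t-statistic with a cluster-robust variance (no small-sample factor)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    u = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for c in range(G):
        s = X[g == c].T @ u[g == c]
        meat += np.outer(s, s)
    V = XtX_inv @ meat @ XtX_inv
    return beta[col] / np.sqrt(V[col, col])

t_hat = ols_t(y, X, g)

# Restricted residuals: impose the null of a zero treatment effect
u_r = y - y.mean()
t_boot = []
for _ in range(999):
    v = rng.choice([-1.0, 1.0], size=G)[g]  # one Rademacher draw per cluster
    t_boot.append(ols_t(y.mean() + u_r * v, X, g))

p = np.mean(np.abs(t_boot) >= abs(t_hat))   # symmetric bootstrap p value
print(f"t = {t_hat:.2f}, wild cluster bootstrap p = {p:.3f}")
```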

Abstract

Sampling units for the 2013 Methods-of-Payment survey were selected through an approximate stratified two-stage sampling design. To compensate for nonresponse and noncoverage and to ensure consistency with external population counts, the observations are weighted through a raking procedure. We apply bootstrap resampling methods to estimate the variance, allowing for randomness from both the sampling design and the raking procedure. We find that the variance is smaller when estimated through bootstrap resampling than through the naive linearization method, as the latter does not take into account the correlation between the variables used for weighting and the outcome variable of interest.
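
A minimal sketch of the raking step, assuming two invented weighting margins; in the chapter, this adjustment is repeated inside every bootstrap replicate so that its randomness propagates into the variance estimate.

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented sample with two weighting variables
n = 1000
region = rng.choice(3, size=n, p=[0.5, 0.3, 0.2])
age_grp = rng.choice(2, size=n, p=[0.6, 0.4])
w = np.ones(n)                                  # base (design) weights

# External population counts the weighted margins must hit
pop_region = np.array([0.45, 0.35, 0.20]) * n
pop_age = np.array([0.55, 0.45]) * n

# Raking: alternately scale weights so each margin matches its target
for _ in range(50):
    for var, target in ((region, pop_region), (age_grp, pop_age)):
        totals = np.bincount(var, weights=w, minlength=len(target))
        w *= (target / totals)[var]

print(np.round(np.bincount(region, weights=w), 1), pop_region)  # margins match
print(np.round(np.bincount(age_grp, weights=w), 1), pop_age)
```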

Part III Estimation and Inference

Abstract

We extend Vuong's (1989) model-selection statistic to allow for complex survey samples. As a further extension, we use an M-estimation setting so that the tests apply to general estimation problems (such as linear and nonlinear least squares, Poisson regression, and fractional response models, to name just a few) and not only to maximum likelihood settings. With stratified sampling, we show how the difference in objective functions should be weighted in order to obtain a suitable test statistic. Interestingly, the weights are needed in computing the model-selection statistic even when stratification is appropriately exogenous, in which case the usual unweighted estimators of the parameters are consistent. With cluster samples and panel data, we show how to combine the weighted objective function with a cluster-robust variance estimator in order to expand the scope of the model-selection tests. A small simulation study shows that the weighted test is promising.
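
To convey the shape of the weighted statistic, the sketch below multiplies each unit's difference in per-observation objective values by its sampling weight before forming a Vuong-type t ratio; the data, weights, and pair of non-nested linear models are all invented, and this is not the chapter's full procedure.

```python
import numpy as np

rng = np.random.default_rng(4)

n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)      # non-nested competing regressor
y = 1.0 + 0.5 * x1 + rng.normal(size=n)
w = rng.uniform(0.5, 2.0, size=n)       # stand-in inverse-probability weights
sw = np.sqrt(w)

def wls_resid(z):
    """Residuals from a weighted least squares fit of y on (1, z)."""
    X = np.column_stack([np.ones(n), z])
    beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return y - X @ beta

# Per-observation objective values (negative squared errors, M-estimation)
q1 = -wls_resid(x1) ** 2
q2 = -wls_resid(x2) ** 2

d = w * (q1 - q2)                        # weighted contributions
t_stat = np.sqrt(n) * d.mean() / d.std(ddof=1)
print(f"Vuong-type weighted statistic: {t_stat:.2f}")  # large => prefer model 1
```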

Abstract

We show how to use a smoothed empirical likelihood approach to conduct efficient semiparametric inference in models characterized by conditional moment equalities when data are collected by variable probability sampling. Results from a simulation experiment suggest that the smoothed empirical likelihood-based estimator can estimate the model parameters very well in small to moderately sized stratified samples.
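
Smoothed, survey-weighted empirical likelihood for conditional moments is too involved to reproduce here, but the sketch below shows plain empirical likelihood for a single unconditional moment, which is the object the chapter generalizes; the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.normal(loc=1.0, size=200)    # toy data with true mean 1.0

def el_statistic(mu0):
    """Empirical likelihood ratio statistic for H0: E[x] = mu0."""
    g = x - mu0
    # Optimal probabilities are p_i = 1 / (n * (1 + lam * g_i)), where
    # lam solves sum(g_i / (1 + lam * g_i)) = 0; find lam by bisection
    lo, hi = -1 / g.max() + 1e-6, -1 / g.min() - 1e-6
    for _ in range(200):
        lam = 0.5 * (lo + hi)
        if np.sum(g / (1 + lam * g)) > 0:
            lo = lam
        else:
            hi = lam
    return 2 * np.sum(np.log1p(lam * g))   # ~ chi2(1) under H0

print(f"EL statistic at the true mean: {el_statistic(1.0):.2f}")
```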

Abstract

Applied econometric analysis is often performed using data collected from large-scale surveys. These surveys use complex sampling plans in order to reduce costs and increase the estimation efficiency for subgroups of the population, and such plans result in unequal inclusion probabilities across units in the population. The purpose of this paper is to derive the asymptotic properties of a design-based nonparametric regression estimator, the local constant estimator, under a combined inference framework. This work contributes to the literature in two ways. First, it derives the asymptotic properties of the estimator, including its asymptotic normality, for the multivariate mixed-data case. Second, it uses least squares cross-validation to select the bandwidths for both continuous and discrete variables. Monte Carlo simulations assess the finite-sample performance of the design-based local constant estimator relative to the traditional local constant estimator for three sampling methods: simple random sampling, exogenous stratification, and endogenous stratification. Simulation results show that the estimator is consistent and that efficiency gains can be achieved by weighting observations by the inverse of their inclusion probabilities when the sampling is endogenous.
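
A minimal sketch of the design-based local constant (Nadaraya-Watson) estimator, with a fixed bandwidth rather than the chapter's cross-validated one: kernel weights are multiplied by survey weights 1/pi so that over-sampled units do not dominate the fit. The sampling scheme below is invented and endogenous by construction.

```python
import numpy as np

rng = np.random.default_rng(5)

# Population and an endogenous sampling scheme (inclusion depends on y)
n = 800
x = rng.uniform(0, 1, size=n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)
pi = np.where(y > 0, 0.8, 0.4)        # inclusion probabilities
keep = rng.random(n) < pi             # realized sample
x_s, y_s, w_s = x[keep], y[keep], 1.0 / pi[keep]

def local_constant(x0, h=0.1):
    """Design-based Nadaraya-Watson fit at x0 with a Gaussian kernel."""
    k = np.exp(-0.5 * ((x_s - x0) / h) ** 2)
    return np.sum(w_s * k * y_s) / np.sum(w_s * k)

grid = np.linspace(0.05, 0.95, 5)
print([round(local_constant(g), 2) for g in grid])
```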

Abstract

Nearest neighbor imputation has a long tradition for handling item nonresponse in survey sampling. In this article, we study the asymptotic properties of the nearest neighbor imputation estimator for general population parameters, including population means, proportions, and quantiles. For variance estimation, we propose a novel replication variance estimator, which is asymptotically valid and straightforward to implement. The main idea is to construct replicates of the estimator directly from its asymptotically linear terms, instead of from individual records of variables. Simulation results show that nearest neighbor imputation and the proposed variance estimator provide valid inferences for general population parameters.
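
A bare-bones sketch of nearest neighbor imputation on simulated data: each nonrespondent's outcome is replaced by that of the respondent closest in the covariate. The chapter's replication variance estimator is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 500
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)
respond = rng.random(n) < 1 / (1 + np.exp(-x))   # response depends on x

donors_x, donors_y = x[respond], y[respond]
y_imp = y.copy()
for i in np.flatnonzero(~respond):
    j = np.argmin(np.abs(donors_x - x[i]))        # nearest donor in x
    y_imp[i] = donors_y[j]

# Complete-case mean is biased; NN imputation largely removes the bias
print(f"true mean {y.mean():.3f}, NN-imputed mean {y_imp.mean():.3f}, "
      f"complete-case mean {y[respond].mean():.3f}")
```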

Part IV Applications in Business, Household, and Crime Surveys

Abstract

We survey banks to construct national estimates of total noncash payments by type, payments fraud, and related information. The survey is designed to produce aggregate estimates of all payments in the United States from the responses of a representative, random sample. In 2016, the number of questions in the survey doubled compared with the previous survey, raising serious concerns about nonparticipation by smaller banks. To obtain sufficient response data for all questions from smaller banks, we administered a modified survey design which, in addition to randomly sampling banks, also randomly assigned one of several survey forms, each a subset of the full survey. While a variety of factors influenced response outcomes, we find that this planned missing data approach improved response outcomes for smaller banks. Such an approach may be especially important in an optional-participation survey, when reducing costs to respondents may affect success, or when imputation of unplanned missing items is already needed for estimation. The planned missing item design should be considered as a way of reducing survey burden or increasing unit-level and item-level response for individual respondents without reducing the full set of survey items collected.
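
A toy sketch of a planned missing ("matrix sampling") assignment of the kind described, assuming a core block asked of everyone plus three rotating item blocks; the item names and counts are invented.

```python
import numpy as np

rng = np.random.default_rng(7)

items = [f"q{i}" for i in range(1, 13)]          # 12 items total
core = items[:3]                                  # asked of every bank
blocks = [items[3:6], items[6:9], items[9:12]]    # rotating subsets

n_banks = 9
forms = rng.integers(0, 3, size=n_banks)          # random form per bank
for bank, f in enumerate(forms):
    print(f"bank {bank}: {core + blocks[f]}")     # items this bank answers
```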

Abstract

Prior analyses of racial bias in New York City's Stop-and-Frisk program implicitly assumed that any bias of police officers did not vary by crime type and that officers' decisions about which type of crime to report as the basis for a stop were themselves unbiased. In this paper, we first extend the hit rates model to allow for crime-type heterogeneity in racial bias and for police officers' decisions about the reported crime type. Second, we reevaluate the program while accounting for heterogeneity in bias across crime types and for the sample selection that may arise from conditioning on crime type. We present evidence that differences in bias across crime types are substantial, and specification tests support incorporating corrections for selective crime reporting. However, the main findings on racial bias do not change sharply once this choice-based selection is accounted for.
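
A fabricated numerical illustration of why crime-type heterogeneity matters in the hit rates framework: in the counts below, pooled hit rates are identical across the two groups, yet the within-crime-type rates differ.

```python
# Fabricated (stops, hits) counts by group within two crime types
stops = {
    "weapons":  {"A": (1000, 100), "B": (1500, 225)},
    "property": {"A": (1000, 300), "B": (500, 175)},
}

pooled = {}
for crime, groups in stops.items():
    for grp, (n, h) in groups.items():
        tot = pooled.setdefault(grp, [0, 0])
        tot[0] += n
        tot[1] += h
        print(f"{crime:9s} group {grp}: hit rate {h / n:.3f}")

# Pooled rates are equal (0.2 vs 0.2) despite within-type differences
print({grp: round(h / n, 3) for grp, (n, h) in pooled.items()})
```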

Abstract

In 2014, the Colombian Government commissioned a unique national survey on illegal liquor. Interviewers purchased bottles of liquor from interviewees and tested them for authenticity in a laboratory. Two factors predict whether liquor is contraband (smuggled): (1) the absence of a receipt and (2) the presence of a discount offered by the seller. Neither factor predicts whether a bottle is adulterated. The results support an account in which sellers are complicit in a contraband economy, while whether buyers are complicit remains unclear. Buyers are, however, more likely to receive adulterated liquor when they specifically ask for a discount.

DOI: 10.1108/S0731-9053201939
Publication date: 2019-04-10
Book series: Advances in Econometrics
Editors: Kim P. Huynh, David T. Jacho-Chávez, and Gautam Tripathi
Series copyright holder: Emerald Publishing Limited
ISBN: 978-1-78756-726-9
eISBN: 978-1-78756-725-2
Book series ISSN: 0731-9053