Search results

1 – 10 of 108
Book part
Publication date: 10 April 2019

Heng Chen and Q. Rallye Shen

Abstract

Sampling units for the 2013 Methods-of-Payment survey were selected through an approximate stratified two-stage sampling design. To compensate for nonresponse and noncoverage and ensure consistency with external population counts, the observations are weighted through a raking procedure. We apply bootstrap resampling methods to estimate the variance, allowing for randomness from both the sampling design and the raking procedure. We find that the variance is smaller when estimated through the bootstrap resampling method than through the naive linearization method, which does not take into account the correlation between the variables used for weighting and the outcome variable of interest.
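A minimal sketch of the idea, assuming a simple raking to two categorical margins and a naive with-replacement bootstrap (the survey itself resamples within its stratified two-stage design); all variable names and population targets below are illustrative:

import numpy as np

def rake(weights, cats_a, cats_b, targets_a, targets_b, iters=50):
    """Iterative proportional fitting of weights to two sets of marginal
    population totals; assumes every raking cell is non-empty."""
    w = weights.astype(float).copy()
    for _ in range(iters):
        for cats, targets in ((cats_a, targets_a), (cats_b, targets_b)):
            for level, total in targets.items():
                mask = cats == level
                w[mask] *= total / w[mask].sum()
    return w

def bootstrap_variance(y, base_w, cats_a, cats_b, targets_a, targets_b,
                       B=500, seed=0):
    """Bootstrap variance of the raked weighted mean of y."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimates = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, n)  # resample units with replacement
        w = rake(base_w[idx], cats_a[idx], cats_b[idx], targets_a, targets_b)
        estimates[b] = np.average(y[idx], weights=w)  # re-raked weighted mean
    return estimates.var(ddof=1)

Rerunning the raking inside every replicate is what lets the bootstrap pick up the correlation between the weighting variables and the outcome that the naive linearization ignores.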

Details

The Econometrics of Complex Survey Data
Type: Book
ISBN: 978-1-78756-726-9

Book part
Publication date: 1 September 2021

Son Nguyen, Phyllis Schumacher, Alan Olinsky and John Quinn

Abstract

We study the performance of various predictive models, including decision trees, random forests, neural networks, and linear discriminant analysis, on an imbalanced data set of home loan applications. During the process, we propose an undersampling algorithm to cope with the issues created by the imbalance of the data. Our technique is shown to work competitively against popular resampling techniques such as random oversampling, undersampling, the synthetic minority oversampling technique (SMOTE), and random oversampling examples (ROSE). We also investigate the relationship among the true-positive rate, the true-negative rate, and the degree of imbalance in the data.
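A minimal sketch of plain random undersampling paired with a random forest and scored by balanced accuracy; this is the generic technique, not the authors' proposed algorithm, and the simulated features below merely stand in for the home-loan data:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

def undersample(X, y, rng):
    """Drop majority-class rows at random until the two classes balance."""
    maj = 0 if (y == 0).sum() > (y == 1).sum() else 1
    maj_idx = np.flatnonzero(y == maj)
    min_idx = np.flatnonzero(y != maj)
    keep = rng.choice(maj_idx, size=len(min_idx), replace=False)
    idx = np.concatenate([keep, min_idx])
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 10))            # stand-in for loan features
y = (rng.random(5000) < 0.05).astype(int)  # roughly 5% minority class
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_bal, y_bal = undersample(X_tr, y_tr, rng)
clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
print(balanced_accuracy_score(y_te, clf.predict(X_te)))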

Book part
Publication date: 6 September 2019

Son Nguyen, Gao Niu, John Quinn, Alan Olinsky, Jonathan Ormsbee, Richard M. Smith and James Bishop

Abstract

In recent years, the problem of classification with imbalanced data has been growing in popularity in the data-mining and machine-learning communities due to the emergence of an abundance of imbalanced data in many fields. In this chapter, we compare the performance of six classification methods on an imbalanced dataset under the influence of four resampling techniques. These classification methods are the random forest, the support vector machine, logistic regression, k-nearest neighbor (KNN), the decision tree, and AdaBoost. Our study shows that all of the classification methods have difficulty when working with the imbalanced data, with KNN performing the worst, detecting only 27.4% of the minority class. However, with the help of resampling techniques, all of the classification methods improve in overall performance. In particular, the random forest, in combination with the random over-sampling technique, performs the best, achieving 82.8% balanced accuracy (the average of the true-positive rate and true-negative rate).

We then propose a new procedure to resample the data. Our method is based on the idea of eliminating “easy” majority observations before under-sampling them. It further improves the balanced accuracy of the random forest to 83.7%, making it the best approach for the imbalanced data.
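The abstract does not spell out how “easy” majority observations are identified; a plausible sketch, assuming “easy” means points that a preliminary model already classifies as majority with high confidence, is:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trim_easy_then_undersample(X, y, quantile=0.5, seed=0):
    """Score the majority class with a preliminary (in-sample) model, drop
    the high-confidence "easy" points, then undersample the rest."""
    rng = np.random.default_rng(seed)
    maj = 0 if (y == 0).sum() > (y == 1).sum() else 1
    probe = RandomForestClassifier(random_state=seed).fit(X, y)
    p_maj = probe.predict_proba(X)[:, list(probe.classes_).index(maj)]
    maj_idx = np.flatnonzero(y == maj)
    cutoff = np.quantile(p_maj[maj_idx], quantile)
    hard = maj_idx[p_maj[maj_idx] <= cutoff]   # keep the ambiguous points
    min_idx = np.flatnonzero(y != maj)
    keep = rng.choice(hard, size=min(len(hard), len(min_idx)), replace=False)
    idx = np.concatenate([keep, min_idx])
    return X[idx], y[idx]

The intuition is that discarding majority points the classifier already gets right concentrates the undersampled training set near the decision boundary.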

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78754-290-7

Book part
Publication date: 18 January 2022

Badi H. Baltagi, Georges Bresson, Anoop Chaturvedi and Guy Lacroix

Abstract

This chapter extends the work of Baltagi, Bresson, Chaturvedi, and Lacroix (2018) to the popular dynamic panel data model. The authors investigate the robustness of Bayesian panel data models to possible misspecification of the prior distribution. The proposed robust Bayesian approach departs from the standard Bayesian framework in two ways. First, the authors consider the ε-contamination class of prior distributions for the model parameters as well as for the individual effects. Second, both the base elicited priors and the ε-contamination priors use Zellner’s (1986) g-priors for the variance–covariance matrices. The authors propose a general “toolbox” for a wide range of specifications which includes the dynamic panel model with random effects, with cross-correlated effects à la Chamberlain, for the Hausman–Taylor world and for dynamic panel data models with homogeneous/heterogeneous slopes and cross-sectional dependence. Using a Monte Carlo simulation study, the authors compare the finite sample properties of the proposed estimator to those of standard classical estimators. The chapter contributes to the dynamic panel data literature by proposing a general robust Bayesian framework which encompasses the conventional frequentist specifications and their associated estimation methods as special cases.
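In this notation, the ε-contamination class mixes an elicited base prior with an arbitrary contaminating prior, and the base prior on the regression coefficients takes a Zellner-style g-prior form; the rendering below is a standard statement of these two objects, not the chapter's full specification:

% epsilon-contamination class around an elicited base prior \pi_0
\Gamma_\varepsilon = \left\{ \pi : \pi(\theta) = (1-\varepsilon)\,\pi_0(\theta) + \varepsilon\, q(\theta),\; q \in \mathcal{Q} \right\}
% Zellner-style g-prior on the regression coefficients
\beta \mid \sigma^2, g \;\sim\; \mathcal{N}\!\left(\beta_0,\; g\,\sigma^2 (X'X)^{-1}\right)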

Details

Essays in Honor of M. Hashem Pesaran: Panel Modeling, Micro Applications, and Econometric Methodology
Type: Book
ISBN: 978-1-80262-065-8

Book part
Publication date: 1 September 2021

Alicia T. Lamere, Son Nguyen, Gao Niu, Alan Olinsky and John Quinn

Abstract

Predicting a patient's length of stay (LOS) in a hospital setting has been widely researched. Accurately predicting an individual's LOS can have a significant impact on a healthcare provider's ability to care for individuals by allowing providers to properly prepare and manage resources. A hospital's productivity requires a delicate balance of maintaining enough staffing and resources without being overly equipped or wasteful. This has become even more important in light of the current COVID-19 pandemic, during which emergency departments around the globe have been inundated with patients and are struggling to manage their resources.

In this study, the authors focus on the prediction of LOS at the time of admission in emergency departments at Rhode Island hospitals, using discharge data for 2012 and 2013 obtained from the Rhode Island Department of Health. This work also explores the distribution of discharge dispositions in an effort to better characterize the resources patients require upon leaving the emergency department.

Book part
Publication date: 29 February 2008

Nii Ayi Armah and Norman R. Swanson

Abstract

In this chapter we discuss model selection and predictive accuracy tests in the context of parameter and model uncertainty under recursive and rolling estimation schemes. We begin by summarizing some recent theoretical findings, with particular emphasis on the construction of valid bootstrap procedures for calculating the impact of parameter estimation error. We then discuss the Corradi and Swanson (2002) (CS) test of (non)linear out-of-sample Granger causality. Thereafter, we carry out a series of Monte Carlo experiments examining the properties of the CS and a variety of other related predictive accuracy and model selection type tests. Finally, we present the results of an empirical investigation of the marginal predictive content of money for income, in the spirit of Stock and Watson (1989), Swanson (1998) and Amato and Swanson (2001).
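A minimal sketch of the recursive estimation scheme in the money-income setting: the estimation window grows by one observation each period, and squared one-step-ahead forecast errors of a restricted AR(1) for income growth are compared with those of the same model augmented with lagged money growth. This illustrates only the recursive out-of-sample setup, not the Corradi-Swanson statistic or its bootstrap critical values, and the data are simulated:

import numpy as np

rng = np.random.default_rng(0)
T = 400
money = rng.normal(size=T)
income = np.zeros(T)
for t in range(1, T):  # money Granger-causes income by construction
    income[t] = 0.3 * income[t - 1] + 0.2 * money[t - 1] + rng.normal()

R = 200  # initial estimation sample; grows each step (recursive scheme)
e_ar, e_adl = [], []
for t in range(R, T - 1):
    y = income[1:t + 1]  # regressand aligned with one-period lags
    X_ar = np.column_stack([np.ones(t), income[:t]])
    X_adl = np.column_stack([X_ar, money[:t]])
    b_ar = np.linalg.lstsq(X_ar, y, rcond=None)[0]
    b_adl = np.linalg.lstsq(X_adl, y, rcond=None)[0]
    e_ar.append(income[t + 1] - np.array([1.0, income[t]]) @ b_ar)
    e_adl.append(income[t + 1] - np.array([1.0, income[t], money[t]]) @ b_adl)

# A positive mean loss differential points to marginal predictive content
# of money; a formal test would compare it against bootstrap critical values.
d = np.square(e_ar) - np.square(e_adl)
print(d.mean())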

Details

Forecasting in the Presence of Structural Breaks and Model Uncertainty
Type: Book
ISBN: 978-1-84950-540-6

Book part
Publication date: 13 May 2017

Otávio Bartalotti, Gray Calhoun and Yang He

Abstract

This chapter develops a novel bootstrap procedure to obtain robust bias-corrected confidence intervals in regression discontinuity (RD) designs. The procedure uses a wild bootstrap from a second-order local polynomial to estimate the bias of the local linear RD estimator; the bias is then subtracted from the original estimator. The bias-corrected estimator is itself bootstrapped to generate valid confidence intervals (CIs). The CIs generated by this procedure are valid under conditions similar to Calonico, Cattaneo, and Titiunik's (2014) analytical correction, that is, when the bias of the naive RD estimator would otherwise prevent valid inference. This chapter also provides simulation evidence that our method is as accurate as the analytical corrections, and we demonstrate its use through a reanalysis of Ludwig and Miller's (2007) Head Start dataset.
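A heavily simplified sketch of the bias-estimation step: fit a second-order local polynomial on each side of the cutoff, generate wild-bootstrap outcomes from it, re-estimate the local linear RD effect on each draw, and take the mean bootstrap estimate minus the quadratic-implied effect as the bias. Uniform kernel, fixed bandwidth, simulated data; the authors' full procedure additionally bootstraps the corrected estimator to build the confidence interval:

import numpy as np
from numpy.polynomial import polynomial as P

def rd_local_linear(x, y, h):
    """Local linear RD estimate at cutoff 0 with a uniform kernel."""
    l = (x < 0) & (x > -h)
    r = (x >= 0) & (x < h)
    return P.polyfit(x[r], y[r], 1)[0] - P.polyfit(x[l], y[l], 1)[0]

rng = np.random.default_rng(0)
n, h, B = 2000, 0.5, 300
x = rng.uniform(-1, 1, n)
y = 0.5 * (x >= 0) + x + 0.8 * x**2 + rng.normal(scale=0.3, size=n)

tau_hat = rd_local_linear(x, y, h)

# Second-order local polynomial on each side serves as the bootstrap DGP.
l = (x < 0) & (x > -h)
r = (x >= 0) & (x < h)
cl = P.polyfit(x[l], y[l], 2)
cr = P.polyfit(x[r], y[r], 2)
mu = np.where(x < 0, P.polyval(x, cl), P.polyval(x, cr))
resid = y - mu

taus = np.empty(B)
for b in range(B):
    signs = rng.choice([-1.0, 1.0], size=n)  # Rademacher wild weights
    taus[b] = rd_local_linear(x, mu + signs * resid, h)

bias = taus.mean() - (cr[0] - cl[0])  # quadratic-implied effect is "truth"
print(tau_hat - bias)                 # bias-corrected point estimate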

Details

Regression Discontinuity Designs
Type: Book
ISBN: 978-1-78714-390-6

Book part
Publication date: 12 December 2003

R. Carter Hill, Lee C. Adkins and Keith A. Bender

Abstract

The Heckman two-step estimator (Heckit) for the selectivity model is widely applied in economics and other social sciences. In this model, a non-zero outcome variable is observed only if a latent variable is positive. The asymptotic covariance matrix for a two-step estimation procedure must account for the estimation error introduced in the first stage. We examine the finite-sample size of tests based on alternative covariance matrix estimators. We do so by using Monte Carlo experiments to evaluate bootstrap-generated critical values and critical values based on asymptotic theory.
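A minimal sketch of the Heckit point estimates on simulated data: a probit for selection, then OLS on the selected sample with the inverse Mills ratio added as a regressor. The default second-step OLS standard errors would be wrong here, since they ignore the first-stage estimation error that the chapter's covariance matrix estimators are designed to handle:

import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 2))          # z[:, 1] is excluded from the outcome
x = z[:, :1]
u = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=n)
select = 0.5 + z @ np.array([1.0, 1.0]) + u[:, 0] > 0   # latent selection
y = np.where(select, 1.0 + 2.0 * x[:, 0] + u[:, 1], np.nan)

# Step 1: probit for selection, then the inverse Mills ratio.
Z = sm.add_constant(z)
probit = sm.Probit(select.astype(float), Z).fit(disp=0)
xb = Z @ probit.params
imr = norm.pdf(xb) / norm.cdf(xb)

# Step 2: OLS on the selected sample, augmented with the Mills ratio.
X2 = sm.add_constant(np.column_stack([x[select], imr[select]]))
ols = sm.OLS(y[select], X2).fit()
print(ols.params)  # [intercept, slope on x, coefficient on Mills ratio]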

Details

Maximum Likelihood Estimation of Misspecified Models: Twenty Years Later
Type: Book
ISBN: 978-1-84950-253-5

Book part

Details

Nonlinear Time Series Analysis of Business Cycles
Type: Book
ISBN: 978-0-44451-838-5
