Table of contents (21 chapters)
Identification and inference are central to applied analysis, and two papers examine these issues, the first being theoretical in nature and the second being simulation based.
In this paper, we study partial identification of the distribution of treatment effects of a binary treatment in three settings: ideal randomized experiments, ideal randomized experiments with a known value of a dependence measure, and data satisfying the selection-on-observables assumption. For ideal randomized experiments, (i) we propose nonparametric estimators of the sharp bounds on the distribution of treatment effects and construct asymptotically valid confidence sets for the distribution of treatment effects; (ii) we propose bias-corrected estimators of the sharp bounds on the distribution of treatment effects; and (iii) we investigate the finite-sample performance of the proposed confidence sets and the bias-corrected estimators via simulation.
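As a rough illustration of the plug-in idea behind bound estimators of this kind, the sketch below computes bounds on the distribution function of the treatment effect Y1 - Y0 from the two marginal empirical distribution functions of a randomized experiment. It assumes that the sharp bounds take the classical Makarov form, F_L(d) = max{sup_y [F1(y) - F0(y - d)], 0} and F_U(d) = 1 + min{inf_y [F1(y) - F0(y - d)], 0}. The function names and the toy samples are hypothetical, and neither the bias correction nor the confidence sets described above are attempted.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical CDF: share of observations less than or equal to x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def makarov_bounds(y1, y0, delta):
    """Plug-in lower and upper bounds on P(Y1 - Y0 <= delta).

    Assumption: Makarov-type bounds built from the marginal empirical CDFs
    of the treated (y1) and control (y0) outcomes.
    """
    s1 = sorted(y1)
    # F0(y - delta) is the empirical CDF of the control sample shifted by delta.
    shifted = [v + delta for v in sorted(y0)]
    # The difference F1(y) - F0(y - delta) is a step function of y that changes
    # only at these points, so evaluating it there covers every value it takes
    # (apart from the limiting value 0, which the max/min with 0 handles).
    grid = s1 + shifted
    diffs = [ecdf(s1, y) - ecdf(shifted, y) for y in grid]
    lower = max(max(diffs), 0.0)
    upper = 1.0 + min(min(diffs), 0.0)
    return lower, upper


# Hypothetical toy samples from a treated and a control group.
treated = [3, 4, 5]
control = [1, 2, 3]
for d in (-10, 2, 10):
    print(d, makarov_bounds(treated, control, d))
```

With only the marginals identified, the interval between the two bounds can be wide at intermediate values of d and collapses to a point only where the supports force the answer.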
The link between the magnitude of a bandwidth and the relevance of the corresponding covariate in a regression has recently garnered theoretical attention. Theory suggests that variables included erroneously in a regression will be automatically removed when bandwidths are selected via cross-validation. However, the connections between the bandwidths of the variables that are smoothed away and the insights from these same variables when properly tested for statistical significance have not been previously studied. This paper proposes a variety of simulation exercises to examine the relative performance of both cross-validated bandwidths and individual and joint tests of significance. We focus on settings where the hypothesis of interest may involve a single data type (e.g., continuous only) or a mix of discrete and continuous variables. Moreover, we propose an extension of a well-known kernel smoothing significance test to handle mixed data types. Our results suggest that individual tests of significance and variable-specific bandwidths are very close in performance, but joint tests and joint bandwidth recognition produce substantially different results. This underscores the importance of testing for joint significance when one is trying to arrive at the final nonparametric model of interest.
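To make the idea of a variable being "smoothed away" concrete, the following minimal sketch uses a Nadaraya-Watson regression with a product Gaussian kernel and a least-squares leave-one-out cross-validation criterion. The estimator, kernel, grid, and toy data are assumptions chosen for illustration and are not taken from the chapter. The point is that a very large bandwidth gives every observation essentially the same weight in that direction, so the covariate drops out of the fit.

```python
import math


def nw_fit(x0, X, y, h, skip=None):
    """Nadaraya-Watson estimate at x0 with a product Gaussian kernel.

    X is a list of covariate tuples and h holds one bandwidth per covariate;
    `skip` optionally leaves one observation out (used for cross-validation).
    """
    num = den = 0.0
    for i, (xi, yi) in enumerate(zip(X, y)):
        if i == skip:
            continue
        w = math.exp(-0.5 * sum(((a - b) / hj) ** 2 for a, b, hj in zip(x0, xi, h)))
        num += w * yi
        den += w
    return num / den if den > 0.0 else float("nan")


def cv_score(X, y, h):
    """Least-squares leave-one-out cross-validation criterion."""
    n = len(y)
    return sum((y[i] - nw_fit(X[i], X, y, h, skip=i)) ** 2 for i in range(n)) / n


# Hypothetical toy data: y depends on the first covariate only.
X = [(i / 10, (7 * i % 10) / 10) for i in range(10)]
y = [math.sin(3 * x1) for x1, _ in X]

# A huge bandwidth on the second covariate reproduces the one-covariate fit.
with_x2 = nw_fit((0.35, 0.5), X, y, [0.2, 1e6])
without_x2 = nw_fit((0.35,), [(x1,) for x1, _ in X], y, [0.2])
print(with_x2, without_x2)

# Crude grid search over bandwidth pairs using the cross-validation criterion.
grid = [0.1, 0.3, 1.0, 1e6]
pairs = [(h1, h2) for h1 in grid for h2 in grid]
best = min(pairs, key=lambda h: cv_score(X, y, h))
print(best)
```

A grid search is used only to keep the sketch short; in practice the criterion would be minimized numerically, and a large selected bandwidth is suggestive of irrelevance rather than a formal test of it.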
We consider the problem of estimating a varying coefficient panel data model with fixed effects (FE) using a local linear regression approach. Unlike the first-differenced estimator, our proposed estimator removes the FE using kernel-based weights. This results in a one-step estimator that does not require the backfitting technique. The resulting estimator is shown to be asymptotically normally distributed. A modified least-squares cross-validation method is used to select the optimal bandwidth automatically. Moreover, we propose a test statistic for testing the null hypothesis of a random-effects varying coefficient panel data model against an FE one. Monte Carlo simulations show that our proposed estimator and test statistic have satisfactory finite-sample performance.
We propose a local linear functional coefficient estimator that admits a mix of discrete and continuous data for stationary time series. Under weak conditions, our estimator is asymptotically normally distributed. A small set of simulation studies is carried out to illustrate the finite-sample performance of our estimator. As an application, we estimate a wage determination function that explicitly allows the return to education to depend on other variables. We find evidence of complex interaction patterns among the regressors in the wage equation, such as increasing returns to education when experience is very low, high returns to education for workers with several years of experience, and diminishing returns to education when experience is high. Compared with the commonly used parametric and semiparametric methods, our estimator performs better both in goodness of fit and in yielding economically interesting interpretations.
In this paper, we investigate the joint conditional distribution of health (life expectancy) and income growth, and its evolution over time. The conditional distributions of these two variables are obtained by applying non-parametric methods to a bivariate non-parametric regression system of equations. Analyzing the distributions of the non-parametric fitted values from these models, we find strong evidence of movement over time and strong evidence of first-order stochastic dominance of the earlier years over the later ones. We also find strong evidence of second-order stochastic dominance by non-OECD countries over OECD countries in each period. Our results complement the findings of Wu, Savvides and Stengos (2008), who explored the unconditional behaviour of these joint distributions over time.
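For readers unfamiliar with the dominance concept used here, the sketch below checks first-order stochastic dominance descriptively by comparing two empirical distribution functions: sample a dominates sample b when the distribution function of a lies at or below that of b everywhere and strictly below somewhere. This is only a sample-level comparison under assumed toy data; it is not the statistical testing procedure used in the chapter, and the function names are hypothetical.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical CDF: share of observations less than or equal to x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def first_order_dominates(a, b):
    """True if the empirical CDF of a is never above that of b and is
    strictly below it at one or more points."""
    sa, sb = sorted(a), sorted(b)
    # Both step functions change only at observed values, so this grid suffices.
    grid = sorted(set(sa) | set(sb))
    gaps = [ecdf(sb, z) - ecdf(sa, z) for z in grid]
    return all(g >= 0 for g in gaps) and any(g > 0 for g in gaps)


# Hypothetical samples: the first is the second shifted up by one unit.
print(first_order_dominates([2, 3, 4], [1, 2, 3]))
print(first_order_dominates([1, 4], [2, 3]))  # distribution functions cross
```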
Conventional wisdom dictates that there is a positive relationship between governance and growth. This article reexamines this empirical relationship using nonparametric quantile methods. We apply these methods to different levels of countries' growth and governance measures as defined in the World Governance Indicators provided by the World Bank. We concentrate our analysis on three of the six measures that were found to be significantly correlated with economic growth: voice and accountability, political stability, and rule of law. To illustrate the nonparametric quantile analysis, we use growth profile curves as a visual device. We find that the empirical relationships of voice and accountability and of political stability with growth are highly nonlinear at different quantiles. We also find heterogeneity in these effects across indicators, regions, time, and quantiles. These results are a cautionary tale to practitioners using parametric quantile methods.
This paper deals with estimation of risk and the risk preference function when producers face uncertainties in production (usually labeled as production risk) and output price. These uncertainties are modeled in the context of production theory where the objective of the producers is to maximize expected utility of normalized anticipated profit. Models are proposed to estimate risk preference of individual producers under (i) only production risk, (ii) only price risk, (iii) both production and price risks, (iv) production risk with technical inefficiency, (v) price risk with technical inefficiency, and (vi) both production and price risks with technical inefficiency. We discuss estimation of the production function, the output risk function, and the risk preference functions in some of these cases. Norwegian salmon farming data is used for an empirical application of some of the proposed models. We find that salmon farmers are, in general, risk averse. Labor is found to be risk decreasing while capital and feed are found to be risk increasing.
Knowledge of the dependence structure between financial assets is crucial to improving performance in financial risk management. It is known that the copula completely summarizes the dependence structure among multiple variables. We propose a multivariate exponential series estimator (ESE) to estimate copula densities nonparametrically. The ESE has an appealing information-theoretic interpretation and attains the optimal rate of convergence for nonparametric density estimation in Stone (1982). More importantly, it overcomes the boundary bias of conventional nonparametric copula estimators. Our extensive Monte Carlo studies show that the proposed estimator outperforms the kernel and the log-spline estimators in copula estimation. They also demonstrate that two-step density estimation through an ESE copula often outperforms direct estimation of joint densities. Finally, the ESE copula provides superior estimates of tail dependence compared to the empirical tail index coefficient. An empirical examination of the Asian financial markets using the proposed method is provided.
In this paper, we construct a nonparametric kernel estimator of the joint multivariate cumulative distribution function (CDF) of mixed discrete and continuous variables. We use a data-driven cross-validation method to choose optimal smoothing parameters that asymptotically minimize the mean integrated squared error (MISE). The asymptotic theory of the proposed estimator is derived, and the validity of the cross-validation method is proved. We provide necessary and sufficient conditions for the uniqueness of the optimal smoothing parameters when the estimation of the CDF reduces to the case with only continuous variables, and provide a sufficient condition for the general mixed-variables case.
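A minimal sketch of a smoothed CDF estimator may help fix ideas. It covers only the simplest special case, a single continuous variable with a Gaussian kernel, where the estimate is the average of kernel distribution functions centred at the observations. The kernel choice, the fixed bandwidth, and the function names are assumptions for illustration; the mixed discrete and continuous case and the cross-validated choice of smoothing parameters described above are not implemented.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_cdf(x, sample, h):
    """Smoothed CDF estimate at x: the average of Gaussian distribution
    functions centred at each observation, with bandwidth h > 0."""
    return sum(norm_cdf((x - xi) / h) for xi in sample) / len(sample)


# Hypothetical sample; a smaller bandwidth moves the estimate toward the
# unsmoothed empirical distribution function.
sample = [1.0, 2.0, 3.0]
for h in (0.5, 0.001):
    print(h, kernel_cdf(2.5, sample, h))
```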
A new, direct method is developed for reducing, to an arbitrary order, the boundary bias of kernel density and density derivative estimators. The basic asymptotic properties of the estimators are derived. Simple examples are provided. A number of simulations are reported, which demonstrate the viability and efficacy of the approach compared to several popular alternatives.
The R environment for statistical computing and graphics (R Development Core Team, 2008) offers practitioners a rich set of statistical methods, ranging from random number generation and optimization to regression, panel data, and time series methods, to name a few. The standard R distribution (base R) comes preloaded with a rich variety of functionality useful for applied econometricians. This functionality is enhanced by user-supplied packages made available via R servers that are mirrored around the world. Of interest in this chapter are methods for estimating nonparametric and semiparametric models. We summarize many of the facilities in R and consider some tools that might be of interest to those who wish to work with nonparametric methods and want to avoid programming in C or Fortran, but who need the speed of compiled code rather than that of interpreted code such as Gauss or Matlab. We encourage those working in the field to strongly consider implementing their methods in the R environment, thereby making their work accessible to the widest possible audience via an open collaborative forum.
This paper gives a selective review of some recent developments in nonparametric methods in both continuous and discrete time finance, particularly in the areas of nonparametric estimation and testing of diffusion processes, nonparametric testing of parametric diffusion models, nonparametric pricing of derivatives, nonparametric estimation and hypothesis testing for nonlinear pricing kernels, and nonparametric predictability of asset returns. For each financial context, the paper discusses the suitable statistical concepts, models, and modeling procedures, as well as some of their applications to financial data. Their relative strengths and weaknesses are discussed. Much theoretical and empirical research is needed in this area, and more importantly, the paper points to several aspects that deserve further investigation.
Economic conditions such as convexity, homogeneity, homotheticity, and monotonicity are important assumptions, or consequences of assumptions, about the economic functionals to be estimated. Recent research has seen a renewed interest in imposing constraints in nonparametric regression. We survey the available methods in the literature, discuss the challenges that present themselves when empirically implementing these methods, and extend an existing method to handle general nonlinear constraints. A heuristic discussion of the empirical implementation of methods that use sequential quadratic programming is provided for the reader, and simulated and empirical evidence on the distinction between constrained and unconstrained nonparametric regression surfaces is covered.
This paper surveys the recent literature on the application of advances in semiparametric econometrics to testing the functional form of the environmental Kuznets curve (EKC). The EKC postulates that there is an inverted U-shaped relationship between economic growth (typically measured by income) and pollution; that is, as income grows, pollution increases up to a maximum and then starts declining after a threshold level of income. This hypothesized relationship is simple to visualize but has eluded many empirical investigations. A typical application of the EKC uses panel data models, which allow for heterogeneity, serial correlation, heteroskedasticity, data pooling, and smooth coefficients. This vast literature is reviewed in the context of semiparametric model specification tests. Additionally, recent developments in semiparametric econometrics, such as Bayesian methods, generalized time-varying coefficient models, and nonstationary panels, are discussed as fruitful areas of future research. The cited literature is fairly complete and should prove useful to applied researchers at large.
The literature on nonparametric econometrics has grown rapidly over the past two decades. Given the space limitation, it is impossible to survey all the important recent developments in nonparametric econometrics. Therefore, we limit our focus to the following areas. In Section 2, we review the recent developments in nonparametric estimation and testing of regression functions with mixed discrete and continuous covariates. We discuss nonparametric estimation and testing of econometric models for nonstationary data in Section 3. Section 4 is devoted to surveying the literature on nonparametric instrumental variable (IV) models. We review nonparametric estimation of quantile regression models in Section 5. In Sections 2–5, we also point out some open research problems, which might help graduate students review the important research papers in this field and search for their own research interests, particularly dissertation topics for doctoral students. Finally, in Section 6, we highlight some important research areas that are not covered in this paper owing to space limitations. We plan to write a separate survey paper to discuss some of the omitted topics.