Table of contents (13 chapters)
Part I: Introduction
Part II: Discrete Dependent Variables – Maximum Likelihood
Fast Simulated Maximum Likelihood Estimation of the Spatial Probit Model Capable of Handling Large Samples
We show how to quickly estimate spatial probit models for large data sets using maximum likelihood. Like Beron and Vijverberg (2004), we use the GHK (Geweke-Hajivassiliou-Keane) algorithm to perform maximum simulated likelihood estimation. However, using the GHK for large sample sizes has been viewed as extremely difficult (Wang, Iglesias, & Wooldridge, 2013). Nonetheless, for sparse covariance and precision matrices often encountered in spatial settings, the GHK can be applied to very large sample sizes as its operation counts and memory requirements increase almost linearly with n when using sparse matrix techniques.
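As a rough illustration of the simulator named above, the following is a minimal dense GHK sketch for the probability of one observed binary vector. It is not the chapter's sparse-matrix implementation: the function names, the dense Cholesky factorization, and the assumption that the latent mean `mu` and covariance `sigma` are already available are all illustrative choices.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def cholesky(a):
    """Lower-triangular L with L L' = a, for a small dense SPD matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def ghk_probability(y, mu, sigma, draws=2000, seed=0):
    """GHK estimate of P(observed y), where y_i = 1 iff latent z_i > 0
    and z ~ N(mu, sigma). Hypothetical helper, dense and unoptimized."""
    n = len(y)
    L = cholesky(sigma)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        e = [0.0] * n          # standard normal innovations, z = mu + L e
        weight = 1.0           # product of conditional probabilities
        for i in range(n):
            partial = mu[i] + sum(L[i][j] * e[j] for j in range(i))
            bound = -partial / L[i][i]      # z_i > 0  <=>  e_i > bound
            p_below = STD_NORMAL.cdf(bound)
            lo, hi = (p_below, 1.0) if y[i] == 1 else (0.0, p_below)
            weight *= hi - lo
            # Draw e_i from the truncated normal by inverting the cdf.
            u = lo + (hi - lo) * rng.random()
            u = min(max(u, 1e-12), 1.0 - 1e-12)
            e[i] = STD_NORMAL.inv_cdf(u)
        total += weight
    return total / draws
```

The recursion visits the observations one at a time and only needs the rows of the Cholesky factor, which is why a sparse factor keeps the cost per draw low; the dense loops here scale quadratically and are meant only to show the logic.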
Likelihood Evaluation of High-Dimensional Spatial Latent Gaussian Models with Non-Gaussian Response Variables
We propose a generic algorithm for numerically accurate likelihood evaluation of a broad class of spatial models characterized by a high-dimensional latent Gaussian process and non-Gaussian response variables. The class of models under consideration includes specifications for discrete choices, event counts and limited-dependent variables (truncation, censoring, and sample selection) among others. Our algorithm relies upon a novel implementation of efficient importance sampling (EIS) specifically designed to exploit typical sparsity of high-dimensional spatial precision (or covariance) matrices. It is numerically very accurate and computationally feasible even for very high-dimensional latent processes. Thus, maximum likelihood (ML) estimation of high-dimensional non-Gaussian spatial models, hitherto considered to be computationally prohibitive, becomes feasible. We illustrate our approach with ML estimation of a spatial probit for US presidential voting decisions and spatial count data models (Poisson and Negbin) for firm location choices.
Part III: Discrete Dependent Variables – Bayesian
This paper investigates the impact of the storms Katrina and Rita on firm survival in Orleans Parish. In particular, a Bayesian spatial probit model is used to assess the impact of a number of firm characteristics on firm survival. The results reveal that larger firms and those with less flooding are more likely to survive. Larger chain stores were less likely to return to the city than sole proprietorships. Spatial results also reveal a very strong spatial component to firm survival just after the storm, which diminishes as time passes.
This paper formulates and analyzes Bayesian model variants for the analysis of systems of spatial panel data with binary dependent variables. The paper focuses on cases where latent variables of cross-sectional units in an equation of the system contemporaneously depend on the values of the same and, possibly, other latent variables of other cross-sectional units. Moreover, the paper discusses cases where time-invariant effects are exogenous versus endogenous. Such models may have numerous applications in industrial economics, public economics, or international economics. The paper illustrates that the performance of Bayesian estimation methods for such models is supportive of their use with even relatively small panel data sets.
The most widely used spatial regression models for binary dependent variables rely on a symmetric link function, as in the logistic or probit models. When the dependent variable represents a rare event, a symmetric link function can underestimate the probability that the rare event occurs. Following Calabrese and Osmetti (2013), we suggest the quantile function of the generalized extreme value (GEV) distribution as the link function in a spatial generalized linear model, and we call this model the spatial GEV (SGEV) regression model. To estimate the parameters of such a model, a modified version of the Gibbs sampling method of Wang and Dey (2010) is proposed. We analyze the performance of our model by Monte Carlo simulations and evaluate its prediction accuracy on empirical data on state failure.
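To make the asymmetry concrete, here is a small sketch of a GEV response function and its inverse, the quantile-function link. It uses the textbook GEV cdf with location 0 and scale 1; the shape symbol `tau` and the function names are assumptions for illustration, and the chapter's exact parameterization may differ.

```python
import math

def gev_response(eta, tau):
    """P(y = 1) as the standard GEV cdf evaluated at the linear predictor eta."""
    if abs(tau) < 1e-12:
        # Limit tau -> 0 is the Gumbel cdf.
        return math.exp(-math.exp(-eta))
    base = 1.0 + tau * eta
    if base <= 0.0:
        # Outside the support: below it when tau > 0, above it when tau < 0.
        return 0.0 if tau > 0 else 1.0
    return math.exp(-base ** (-1.0 / tau))

def gev_link(p, tau):
    """GEV quantile function: maps a probability p in (0, 1) back to eta."""
    if abs(tau) < 1e-12:
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-tau) - 1.0) / tau
```

Unlike the probit or logit, this response function is not symmetric around 0.5: `gev_response(0.0, tau)` equals exp(-1) for every `tau`, and the two tails approach 0 and 1 at different rates, which is the property exploited for rare events.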
This paper analyzes county-level firm births across the United States using a spatial count model that permits spatial dependence, cross-correlation among different industry types, and over-dispersion commonly found in empirical count data. Results confirm the presence of spatial autocorrelation (which can arise from agglomeration effects and missing variables), industry-specific over-dispersion, and positive, significant cross-correlations. After controlling for existing-firm counts in 2008 (as an exposure term), parameter estimates and inference suggest that a younger work force and/or clientele (as quantified using each county’s median-age values) is associated with more firm births (in 2009). Higher population densities are associated with more new basic-sector firms but fewer retail-firm starts. The modeling framework demonstrated here can be adopted for a variety of settings, harnessing very local, detailed data to evaluate the effectiveness of investments and policies, in terms of generating business establishments and promoting economic gains.
A Multivariate Spatial-Time of Day Analysis of Truck Crash Frequency across Neighborhoods in New York City
To address the safety concerns raised by truck crashes in big cities, this paper analyzes zip code tabulation area (ZCTA)-based truck crash frequency across four temporal intervals – morning (6:00–10:00), mid-day (10:00–15:00), afternoon (15:00–19:00), and night (19:00–6:00) – in New York City in 2010. A multivariate conditional autoregressive count model is used to recognize both spatial and temporal dependences. The results confirm the presence of spatial and temporal dependencies for truck crashes that occurred in neighboring areas. Built environment attributes such as the density of various types of business establishments and traffic volume for different types of vehicles, which are important factors to consider for crashes occurring in an urban setting, are also examined in the study.
Part IV: Continuous Dependent Variables – Maximum Likelihood
This paper tests the feasibility and empirical implications of a spatial econometric model with a full set of interaction effects and weight matrix defined as an equally weighted group interaction matrix applied to research productivity of individuals. We also elaborate two extensions of this model, namely with group fixed effects and with heteroskedasticity. In our setting, the model with a full set of interaction effects is overparameterised: only the SDM and SDEM specifications produce acceptable results. They imply comparable spillover effects, but by applying a Bayesian approach taken from LeSage (2014), we are able to show that the SDEM specification is more appropriate and thus that colleague interaction effects work through observed and unobserved exogenous characteristics common to researchers within a group.
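For readers unfamiliar with the term, an equally weighted group interaction matrix can be sketched as follows. The helper name and the handling of single-member groups (a row of zeros) are assumptions made for this illustration, not details taken from the chapter.

```python
def group_interaction_matrix(groups):
    """Build W with w_ij = 1 / (n_g - 1) when i and j share a group, i != j.

    `groups` lists one group label per individual. Individuals alone in
    their group get a row of zeros (an illustrative convention).
    """
    n = len(groups)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        peers = [j for j in range(n) if j != i and groups[j] == groups[i]]
        for j in peers:
            W[i][j] = 1.0 / len(peers)
    return W

# Usage: three researchers in group "a", two in group "b".
W = group_interaction_matrix(["a", "a", "a", "b", "b"])
```

Each row averages over the individual's colleagues and excludes the individual, so the matrix is block diagonal with a zero diagonal and rows that sum to one.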
How to Measure Spillover Effects of Public Capital Stock: A Spatial Autoregressive Stochastic Frontier Model
This paper aims to investigate spillover effects of public capital stock in a production function model that accounts for spatial dependencies. In many settings, ignoring spatial dependency yields inefficient, biased and inconsistent estimates in cross-country panels. Although there are a number of studies aiming to estimate the output elasticity of public capital stock, many of those fail to reach a consensus on refining the elasticity estimates. We argue that accounting for spillover effects of the public capital stock on production efficiency and incorporating spatial dependencies are crucial. For this purpose, we employ a spatial autoregressive stochastic frontier model based on a number of specifications of the spatial dependency structure. Using data for 21 OECD countries from 1960 to 2001, we estimate a spatial autoregressive stochastic frontier model and derive the mean indirect marginal effects of public capital stock, which are interpreted as spillover effects. We find that spillover effects can be an important factor in explaining variations in technical inefficiency across countries, as well as the discrepancies among various levels of output elasticity of public capital stock in traditional production function approaches.
Part V: Continuous Dependent Variables – Bayesian
Local Marginal Analysis of Spatial Data: A Gaussian Process Regression Approach with Bayesian Model and Kernel Averaging
Statistical methods of spatial analysis are often successful at either prediction or explanation, but not necessarily both. In a recent paper, Dearmon and Smith (2016) showed that by combining Gaussian Process Regression (GPR) with Bayesian Model Averaging (BMA), a modeling framework could be developed in which both needs are addressed. In particular, the smoothness properties of GPR together with the robustness of BMA allow local spatial analyses of individual variable effects that yield remarkably stable results. However, this GPR-BMA approach is not without its limitations. In particular, the standard (isotropic) covariance kernel of GPR treats all explanatory variables in a symmetric way that limits the analysis of their individual effects. Here we extend this approach by introducing a mixture of kernels (both isotropic and anisotropic) which allow different length scales for each variable. To do so in a computationally efficient manner, we also explore a number of Bayes-factor approximations that avoid the need for costly reversible-jump Monte Carlo methods.
To demonstrate the effectiveness of this Variable Length Scale (VLS) model in terms of both predictions and local marginal analyses, we employ selected simulations to compare VLS with Geographically Weighted Regression (GWR), which is currently the most popular method for such spatial modeling. In addition, we employ the classical Boston Housing data to compare VLS not only with GWR but also with other well-known spatial regression models that have been applied to this same data. Our main results are to show that VLS not only compares favorably with spatial regression at the aggregate level but is also far more accurate than GWR at the local level.
City and Industry Network Impacts on Innovation by Chinese Manufacturing Firms: A Hierarchical Spatial-Interindustry Model
We are interested in modeling the impact of spatial and interindustry dependence on firm-level innovation of Chinese firms. The existence of network ties between cities implies that changes taking place in one city could influence innovation by firms in nearby cities (local spatial spillovers), or set in motion a series of spatial diffusion and feedback impacts across multiple cities (global spatial spillovers). We use the term local spatial spillovers to reflect a scenario where only immediately neighboring cities are impacted, whereas the term global spatial spillovers represents a situation where impacts fall on neighboring cities as well as higher order neighbors (neighbors to the neighboring cities, neighbors to the neighbors of the neighbors, and so on). Global spatial spillovers also involve feedback impacts from neighboring cities, and imply the existence of a wider diffusion of impacts over space (higher order neighbors).
Similarly, the existence of national interindustry input-output ties implies that changes occurring in one industry could influence innovation by firms operating in directly related industries (local interindustry spillovers), or set in motion a series of interindustry diffusion and feedback impacts across multiple industries (global interindustry spillovers).
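The local versus global distinction can be illustrated numerically. The sketch below assumes a hypothetical chain of four cities with a row-normalized weight matrix `W` and an illustrative dependence parameter `rho`; local impacts are read off `W` itself, while global impacts come from the multiplier (I - rho W)^{-1}, approximated here by its power series. None of these numbers or names come from the chapter.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def global_multiplier(W, rho, order=200):
    """Truncated series I + rho W + rho^2 W^2 + ..., which approximates
    (I - rho W)^{-1} when |rho| < 1 and W is row-normalized."""
    n = len(W)
    total = identity(n)
    term = identity(n)
    for _ in range(order):
        term = [[rho * v for v in row] for row in matmul(term, W)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total

# Hypothetical chain of cities 0 - 1 - 2 - 3, row-normalized neighbors.
W = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]

# Impact on each city of a change originating in city 0.
local_impacts = [row[0] for row in W]                       # neighbors only
global_impacts = [row[0] for row in global_multiplier(W, 0.5)]
```

In `local_impacts` only city 1 is affected. In `global_impacts` every city is affected, with effects shrinking at higher neighbor orders, and the entry for city 0 itself exceeds one because of feedback through its neighbor. The same construction applies to interindustry spillovers if `W` is replaced by a normalized input-output matrix.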
Typical linear models of firm-level innovation based on knowledge production functions would rely on city- and industry-specific fixed effects to allow for differences in the level of innovation by firms located in different cities and operating in different industries. This approach, however, ignores the fact that spatial dependence between cities and interindustry dependence arising from input-output relationships may imply interaction, not simply heterogeneity, across cities and industries.
We construct a Bayesian hierarchical model that allows for both city- and industry-level interaction (global spillovers) and subsumes other innovation scenarios such as: (1) heterogeneity that implies level differences (fixed effects) and (2) contextual effects that imply local spillovers as special cases.