Search results

1 – 10 of over 11000
Article
Publication date: 4 April 2008

Sherif Sakr

Abstract

Purpose

Estimating the sizes of query results and intermediate results is crucial to many aspects of query processing. All database systems rely on cardinality estimates to choose the cheapest execution plan. In principle, the problem of cardinality estimation is more complicated in the Extensible Markup Language (XML) domain than in the relational domain. The purpose of this paper is to present a novel framework for estimating the cardinality of XQuery expressions as well as their sub-expressions. Additionally, this paper proposes a novel XQuery cardinality estimation benchmark. The main aim of this benchmark is to establish a basis for comparing the different estimation approaches in the XQuery domain.

Design/methodology/approach

As a major innovation, the paper exploits the relational algebraic infrastructure to provide accurate estimation in the context of the XML and XQuery domains. In the proposed framework, XQuery expressions are translated into equivalent relational algebraic plans; then, using a well-defined set of inference rules and a set of special properties of the algebraic plans, the framework provides highly accurate estimates for XQuery expressions.
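
To make the mechanics concrete, the sketch below (our own illustration in Python, not the paper's actual rules) shows how cardinality estimates can be propagated bottom-up through a relational algebraic plan by one inference rule per operator; the operator set and the assumption of known selectivities are simplifications.

    # Hypothetical sketch: rule-based cardinality inference over a
    # relational algebraic plan of the kind an XQuery compiler emits.
    from dataclasses import dataclass, field

    @dataclass
    class Plan:
        op: str                      # "scan", "select" or "join"
        children: list = field(default_factory=list)
        table_card: int = 0          # base-table cardinality (for "scan")
        selectivity: float = 1.0     # predicate selectivity (assumed known)

    def estimate(plan: Plan) -> float:
        """Apply one inference rule per operator, bottom-up."""
        if plan.op == "scan":
            return float(plan.table_card)
        if plan.op == "select":
            return plan.selectivity * estimate(plan.children[0])
        if plan.op == "join":
            left, right = (estimate(c) for c in plan.children)
            return plan.selectivity * left * right
        raise ValueError(f"unknown operator: {plan.op}")

    # sigma(R join S) with |R| = 1000 and |S| = 500:
    plan = Plan("select", selectivity=0.1, children=[
        Plan("join", selectivity=0.01, children=[
            Plan("scan", table_card=1000),
            Plan("scan", table_card=500)])])
    print(estimate(plan))  # 0.1 * (0.01 * 1000 * 500) = 500.0

Because every sub-plan carries its own estimate, estimates for sub-expressions fall out of the same traversal, which is what a framework of this kind can exploit.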

Findings

This paper is believed to be the first to provide a uniform framework for estimating the cardinality of the more powerful XML querying capabilities of XQuery expressions as well as their sub-expressions. It exploits the relational algebraic infrastructure to provide accurate estimation in the context of the XML and XQuery domains. Moreover, the proposed framework can act as a meta-model through its ability to incorporate different summarized XML structures and different histogram techniques, which allows model designers to achieve their targets by focusing their effort on designing or selecting the techniques adequate for them. In addition, this paper proposes a benchmark for XQuery cardinality estimation systems. The proposed benchmark distinguishes itself from other existing XML benchmarks in its focus on establishing a basis for comparing the different estimation approaches in the XML domain in terms of the accuracy of their estimates and their completeness in handling different XML querying features.

Research limitations/implications

In its current status, the proposed XQuery cardinality estimation framework does not support estimating queries over the order information of the source XML documents, nor does it support non-numeric predicates.

Practical implications

Experiments with the XQuery cardinality estimation system demonstrate its effectiveness and show highly accurate estimation results. Utilizing the cardinality estimation properties during the SQL translation of XQuery expressions yields an average improvement of 20 percent in their execution times.

Originality/value

This paper presents a novel framework for estimating the cardinality of XQuery expressions as well as their sub-expressions. A novel XQuery cardinality estimation benchmark is introduced to establish a basis for comparing the different estimation approaches in the XQuery domain.

Details

International Journal of Web Information Systems, vol. 4, no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 15 May 2017

Felix Canitz, Panagiotis Ballis-Papanastasiou, Christian Fieberg, Kerstin Lopatta, Armin Varmaz and Thomas Walker

Abstract

Purpose

The purpose of this paper is to review and evaluate the methods commonly used in accounting literature to correct for cointegrated data and data that are neither stationary nor cointegrated.

Design/methodology/approach

The authors conducted Monte Carlo simulations, following Baltagi et al. (2011), Petersen (2009) and Gow et al. (2010), to analyze how regression results are affected by the possible nonstationarity of the variables of interest.

Findings

The results of this study suggest that biases in regression estimates can be reduced and valid inferences can be obtained by using robust standard errors clustered by firm, clustered by firm and time, or Fama–MacBeth t-statistics based on the mean and standard error of the cross-section of coefficients from time-series regressions.
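
As a rough illustration of the last of these estimators, the sketch below (Python) computes Fama–MacBeth t-statistics from one cross-sectional regression per period; the panel shapes and data-generating process are illustrative assumptions, not the authors' simulation design.

    import numpy as np

    def fama_macbeth(y, X):
        # y: (T, N) outcomes; X: (T, N, K) regressors including a constant.
        T, K = y.shape[0], X.shape[2]
        betas = np.empty((T, K))
        for t in range(T):
            # Cross-sectional OLS for period t.
            betas[t], *_ = np.linalg.lstsq(X[t], y[t], rcond=None)
        mean = betas.mean(axis=0)
        # Standard error of the mean of the T per-period coefficients.
        se = betas.std(axis=0, ddof=1) / np.sqrt(T)
        return mean, mean / se

    rng = np.random.default_rng(0)
    T, N = 40, 200
    X = np.dstack([np.ones((T, N)), rng.normal(size=(T, N))])
    y = 0.5 * X[:, :, 1] + rng.normal(size=(T, N))
    coef, tstat = fama_macbeth(y, X)
    print(coef, tstat)  # slope near 0.5 with a large t-statistic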

Originality/value

The findings of this study are suited to guide future researchers regarding which estimation methods are the most reliable given the possible nonstationarity of the variables of interest.

Details

The Journal of Risk Finance, vol. 18, no. 3
Type: Research Article
ISSN: 1526-5943

Book part
Publication date: 30 December 2004

James P. LeSage and R. Kelley Pace

Abstract

For this discussion, assume there are n sample observations of the dependent variable y at unique locations. In spatial samples, often each observation is uniquely associated with a particular location or region, so that observations and regions are equivalent. Spatial dependence arises when an observation at one location, say y_i, is dependent on "neighboring" observations y_j, y_j ∈ ϒ_i. We use ϒ_i to denote the set of observations that are "neighboring" to observation i, where some metric is used to define the set of observations that are spatially connected to observation i. For general definitions of the sets ϒ_i, i = 1, …, n, typically at least one observation exhibits simultaneous dependence, so that an observation y_j also depends on y_i. That is, the set ϒ_j contains the observation y_i, creating simultaneous dependence among observations. This situation constitutes a difference between time-series analysis and spatial analysis. In time series, temporal dependence relations could be such that a "one-period-behind relation" exists, ruling out simultaneous dependence among observations. The time-series one-observation-behind relation could arise if spatial observations were located along a line and the dependence of each observation were strictly on the observation located to the left. However, this is not in general true of spatial samples, requiring construction of estimation and inference methods that accommodate the more plausible case of simultaneous dependence among observations.
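
A small numeric sketch of this simultaneity (our own illustration, not the authors' code): with a row-normalized contiguity matrix W for n locations on a line, neighbor sets point in both directions, so W cannot be arranged in the strictly triangular, one-behind form available to time series.

    import numpy as np

    n = 5
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):        # left and right neighbors on the line
            if 0 <= j < n:
                W[i, j] = 1.0
    W /= W.sum(axis=1, keepdims=True)   # row-normalize

    # Simultaneity: W is not lower-triangular, unlike the time-series
    # one-observation-behind case, where only W[i, i-1] would be nonzero.
    print(np.allclose(W, np.tril(W)))   # False

    # In a spatial autoregression y = rho*W*y + eps, solving gives
    # y = (I - rho*W)^{-1} eps, so each y_i depends on every disturbance.
    rho = 0.4
    eps = np.random.default_rng(1).normal(size=n)
    y = np.linalg.solve(np.eye(n) - rho * W, eps)
    print(y)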

Details

Spatial and Spatiotemporal Econometrics
Type: Book
ISBN: 978-0-76231-148-4

Book part
Publication date: 13 May 2017

Jasjeet S. Sekhon and Rocío Titiunik

Abstract

We discuss the two most popular frameworks for identification, estimation and inference in regression discontinuity (RD) designs: the continuity-based framework, where the conditional expectations of the potential outcomes are assumed to be continuous functions of the score at the cutoff, and the local randomization framework, where the treatment assignment is assumed to be as good as randomized in a neighborhood around the cutoff. Using various examples, we show that (i) assuming random assignment of the RD running variable in a neighborhood of the cutoff implies neither that the potential outcomes and the treatment are statistically independent, nor that the potential outcomes are unrelated to the running variable in this neighborhood; and (ii) assuming local independence between the potential outcomes and the treatment does not imply the exclusion restriction that the score affects the outcomes only through the treatment indicator. Our discussion highlights key distinctions between “locally randomized” RD designs and real experiments, including that statistical independence and random assignment are conceptually different in RD contexts, and that the RD treatment assignment rule places no restrictions on how the score and potential outcomes are related. Our findings imply that the methods for RD estimation, inference, and falsification used in practice will necessarily be different (both in formal properties and in interpretation) according to which of the two frameworks is invoked.
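
For intuition, here is a minimal sketch of estimation under the local randomization framework (our own illustration; the window, cutoff and data-generating process are assumptions): compare mean outcomes within a small window around the cutoff. It also illustrates the authors' point (i): the outcome may still depend on the score inside the window, leaving a bias of roughly the outcome's slope in the score times the window width.

    import numpy as np

    rng = np.random.default_rng(42)
    score = rng.uniform(-1, 1, size=5000)       # running variable
    treat = (score >= 0).astype(float)          # sharp RD assignment rule
    outcome = 1.0 + 0.8 * score + 2.0 * treat + rng.normal(size=5000)

    window = 0.1                                # neighborhood of the cutoff
    in_w = np.abs(score) <= window
    effect = (outcome[in_w & (treat == 1)].mean()
              - outcome[in_w & (treat == 0)].mean())
    print(effect)  # near the true effect 2.0, plus a bias of about 0.8 * window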

Details

Regression Discontinuity Designs
Type: Book
ISBN: 978-1-78714-390-6

Book part
Publication date: 30 August 2019

Md. Nazmul Ahsan and Jean-Marie Dufour

Abstract

Statistical inference (estimation and testing) for the stochastic volatility (SV) model of Taylor (1982, 1986) is challenging, especially for likelihood-based methods, which are difficult to apply due to the presence of latent variables. Existing methods are either computationally costly or inefficient. In this paper, we propose computationally simple estimators for the SV model which are at the same time highly efficient. The proposed class of estimators uses a small number of moment equations derived from an ARMA representation associated with the SV model, along with the possibility of using "winsorization" to improve stability and efficiency. We call these ARMA-SV estimators. Closed-form expressions for ARMA-SV estimators are obtained, and no numerical optimization procedure or choice of initial parameter values is required. The asymptotic distributional theory of the proposed estimators is studied. Due to their computational simplicity, the ARMA-SV estimators allow one to make reliable – even exact – simulation-based inference through the application of Monte Carlo (MC) tests or bootstrap methods. We compare them in a simulation experiment with a wide array of alternative estimation methods, in terms of bias, root mean square error and computation time. In addition to confirming the enormous computational advantage of the proposed estimators, the results show that ARMA-SV estimators match (or exceed) alternative estimators in terms of precision, including the widely used Bayesian estimator. The proposed methods are applied to daily observations on the returns of three major stocks (Coca-Cola, Walmart, Ford) and the S&P Composite Price Index (2000–2017). The results confirm the presence of stochastic volatility with strong persistence.
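
A simplified moment-based sketch in the spirit of the ARMA representation (our own illustration, not the paper's actual closed-form ARMA-SV estimators): for the Gaussian SV model, w_t = log(y_t^2) equals the latent log-volatility h_t plus i.i.d. noise, so the autocovariances of w_t identify the persistence parameter in closed form, with no numerical optimization.

    import numpy as np

    def estimate_phi(y):
        w = np.log(y**2)
        w = w - w.mean()
        gamma = lambda k: (w[k:] * w[:-k]).mean()  # sample autocovariance
        # For h_t = phi*h_{t-1} + sigma_v*v_t, gamma(k) = phi**k * var(h)
        # for every k >= 1, so phi = gamma(2) / gamma(1) in closed form.
        return gamma(2) / gamma(1)

    # Simulate an SV series and check the estimator.
    rng = np.random.default_rng(7)
    T, phi, sigma_v = 100_000, 0.95, 0.2
    h = np.zeros(T)
    for t in range(1, T):
        h[t] = phi * h[t - 1] + sigma_v * rng.normal()
    y = np.exp(h / 2) * rng.normal(size=T)
    print(estimate_phi(y))  # close to 0.95 in large samples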

Details

Topics in Identification, Limited Dependent Variables, Partial Observability, Experimentation, and Flexible Modeling: Part A
Type: Book
ISBN: 978-1-78973-241-2

Abstract

Details

Applying Maximum Entropy to Econometric Problems
Type: Book
ISBN: 978-0-76230-187-4

Abstract

This article surveys recent developments in the evaluation of point and density forecasts in the context of forecasts made by vector autoregressions. Specific emphasis is placed on highlighting those parts of the existing literature that are applicable to direct multistep forecasts and those parts that are applicable to iterated multistep forecasts. This literature includes advancements in the evaluation of forecasts in population (based on true, unknown model coefficients) and the evaluation of forecasts in the finite sample (based on estimated model coefficients). The article then examines in Monte Carlo experiments the finite-sample properties of some tests of equal forecast accuracy, focusing on the comparison of VAR forecasts to AR forecasts. These experiments show that the tests behave as expected given the theory. For example, using critical values obtained by bootstrap methods, tests of equal accuracy in population have empirical size about equal to nominal size.
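
As a pointer to the kind of statistic involved, the sketch below (our own illustration) computes a Diebold–Mariano-type t-statistic on squared-error loss differentials between two forecast sequences, say AR versus VAR; the article's tests, multistep corrections and bootstrap critical values are more involved.

    import numpy as np

    def equal_accuracy_tstat(y, f_a, f_b):
        # t-statistic on the mean loss differential d_t = e_a^2 - e_b^2.
        d = (y - f_a) ** 2 - (y - f_b) ** 2
        # Simple i.i.d. standard error; HAC or bootstrap versions are
        # needed for serially correlated multistep loss differentials.
        return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

    rng = np.random.default_rng(3)
    y = rng.normal(size=200)
    f_ar = y + rng.normal(scale=1.0, size=200)   # noisier forecasts
    f_var = y + rng.normal(scale=0.8, size=200)  # more accurate forecasts
    print(equal_accuracy_tstat(y, f_ar, f_var))  # positive: VAR preferred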

Details

VAR Models in Macroeconomics – New Developments and Applications: Essays in Honor of Christopher A. Sims
Type: Book
ISBN: 978-1-78190-752-8

Book part
Publication date: 13 May 2017

Abstract

Details

Regression Discontinuity Designs
Type: Book
ISBN: 978-1-78714-390-6

Book part
Publication date: 23 June 2016

Yong Bao, Yanqin Fan, Liangjun Su and Victoria Zinde-Walsh

Abstract

This paper examines Aman Ullah's contributions to robust inference, finite sample econometrics, nonparametrics and semiparametrics, and panel and spatial models. His early works on robust inference and finite sample theory were mostly motivated by his thesis advisor, Professor Anirudh Lal Nagar. They eventually led to his most original rethinking of many statistics and econometrics models, which developed into the monograph Finite Sample Econometrics, published in 2004. His desire to relax distributional and functional-form assumptions led him in the direction of nonparametric estimation, and he summarized his views in his most influential textbook, Nonparametric Econometrics (with Adrian Pagan), published in 1999, which has influenced a whole generation of econometricians. His innovative contributions in the areas of seemingly unrelated regressions; parametric, semiparametric and nonparametric panel data models; and spatial models have also inspired a large literature on nonparametric and semiparametric estimation and inference and spurred research in robust estimation and inference in these and related areas.

Book part
Publication date: 21 November 2014

Ryan Greenaway-McGrevy, Chirok Han and Donggyu Sul

Abstract

This paper is concerned with estimation and inference for difference-in-difference regressions with errors that exhibit high serial dependence, including near unit roots, unit roots, and linear trends. We propose a couple of solutions based on a parametric formulation of the error covariance. First stage estimates of autoregressive structures are obtained by using the Han, Phillips, and Sul (2011, 2013) X-differencing transformation. The X-differencing method is simple to implement and is unbiased in large N settings. Compared to similar parametric methods, the approach is computationally simple and requires fewer restrictions on the permissible parameter space of the error process. Simulations suggest that our methods perform well in the finite sample across a wide range of panel dimensions and dependence structures.
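
As a sketch of the X-differencing idea used in the first stage (our own illustration of the transformation, with assumed panel sizes; the paper embeds it in a difference-in-difference error model): for a panel AR(1) y_it = a_i + rho*y_{i,t-1} + e_it, regressing y_it - y_is on y_{i,t-1} - y_{i,s+1} removes the fixed effect and satisfies an exact orthogonality condition, so pooled least squares needs no bias correction.

    import numpy as np

    rng = np.random.default_rng(11)
    N, T, rho = 200, 10, 0.9
    a = rng.normal(size=N)                       # fixed effects
    y = np.zeros((N, T))
    y[:, 0] = a + rng.normal(size=N) / np.sqrt(1 - rho**2)  # stationary start
    for t in range(1, T):
        y[:, t] = a * (1 - rho) + rho * y[:, t - 1] + rng.normal(size=N)

    # Stack all X-differenced pairs (t, s) with t >= s + 2; pairs with
    # t == s + 2 have a zero regressor and drop out of the sums.
    lhs, rhs = [], []
    for s in range(T):
        for t in range(s + 2, T):
            lhs.append(y[:, t] - y[:, s])
            rhs.append(y[:, t - 1] - y[:, s + 1])
    lhs, rhs = np.concatenate(lhs), np.concatenate(rhs)

    rho_hat = (rhs @ lhs) / (rhs @ rhs)          # pooled least squares
    print(rho_hat)  # close to 0.9 even in a short panel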
