Search results

1 – 10 of over 184,000
Open Access
Article
Publication date: 6 February 2020

Jun Liu, Asad Khattak, Lee Han and Quan Yuan

Individuals’ driving behavior data are becoming widely available through Global Positioning System devices and on-board diagnostic systems. The incoming data can be sampled at…


Abstract

Purpose

Individuals’ driving behavior data are becoming widely available through Global Positioning System devices and on-board diagnostic systems. The incoming data can be sampled at rates ranging from one Hertz (or even lower) to hundreds of Hertz. Failing to capture substantial changes in vehicle movements over time by “undersampling” can cause loss of information and misinterpretations of the data, but “oversampling” can waste storage and processing resources. The purpose of this study is to empirically explore how micro-driving decisions to maintain speed, accelerate or decelerate can best be captured without substantial loss of information.

Design/methodology/approach

This study creates a set of indicators to quantify the magnitude of information loss (MIL). Each indicator is calculated as a percentage to index the extent of information loss in different situations, and an overall index named the extent of information loss (EIL) is created to combine the MIL indicators. Data from a driving simulator study collected at 20 Hertz are analyzed (N = 718,481 data points from 35,924 s of driving tests). The study quantifies the relationship between the information loss indicators and sampling rates.

Findings

The results show that marginally more information is lost as data are sampled down from 20 to 0.5 Hz, but the relationship is not linear. With four indicators of MILs, the overall EIL is 3.85 per cent for driving behavior data sampled at 1 Hz. If sampling rates are higher than 2 Hz, all MILs are under 5 per cent of information loss.
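To make the downsampling idea concrete, here is a minimal, hypothetical Python sketch (not the authors' MIL/EIL indicators): it labels each 20 Hz interval of a simulated speed trace as accelerate, decelerate or maintain, then reports how many of those labels are missed when the same trace is read at a lower rate. The synthetic speed signal and the 0.1 m/s² threshold are invented for illustration.

```python
# Hypothetical illustration only: a toy "information loss" measure for
# downsampled driving data, not the paper's MIL/EIL definitions.
import numpy as np

rng = np.random.default_rng(0)
FS = 20                                   # original sampling rate, Hz
t = np.arange(0, 60, 1 / FS)              # one minute of simulated driving
speed = 15 + 3 * np.sin(0.3 * t) + rng.normal(0, 0.2, t.size)   # m/s

def decisions(speed, fs, threshold=0.1):
    """Label each interval as accelerate (+1), decelerate (-1) or maintain (0)."""
    accel = np.diff(speed) * fs           # finite-difference acceleration, m/s^2
    return np.sign(accel) * (np.abs(accel) > threshold)

full = decisions(speed, FS)               # reference labels at 20 Hz

for target_hz in (10, 5, 2, 1, 0.5):
    step = int(FS / target_hz)
    coarse = decisions(speed[::step], target_hz)
    expanded = np.repeat(coarse, step)    # stretch coarse labels back to 20 Hz
    n = min(expanded.size, full.size)
    loss = np.mean(expanded[:n] != full[:n]) * 100
    print(f"{target_hz:>4} Hz: {loss:5.1f}% of 20 Hz decisions mislabelled")
```

The printed percentages depend entirely on the synthetic signal; the point is only that a decision-level loss indicator can be expressed as a percentage per sampling rate, as the abstract describes.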

Originality/value

This study contributes by developing a framework for quantifying the relationship between sampling rates and information loss. Depending on the objective of their study, researchers can choose the sampling rate necessary to obtain the right amount of accuracy.

Details

Journal of Intelligent and Connected Vehicles, vol. 3 no. 1
Type: Research Article
ISSN: 2399-9802

Keywords

Details

Handbook of Transport Modelling
Type: Book
ISBN: 978-0-08-045376-7

Article
Publication date: 18 November 2019

Guanying Huo, Xin Jiang, Zhiming Zheng and Deyi Xue

Metamodeling is an effective method to approximate the relations between input and output parameters when significant experimental and simulation effort is required to…

Abstract

Purpose

Metamodeling is an effective method to approximate the relations between input and output parameters when significant experimental and simulation effort is required to collect the data that build those relations. This paper aims to develop a new sequential sampling method for adaptive metamodeling for data with a highly nonlinear relation between input and output parameters.

Design/methodology/approach

In this method, the Latin hypercube sampling method is used to sample the initial data, and the kriging method is used to construct the metamodel. Input parameter values for collecting the next output data, used to update the currently achieved metamodel, are determined based on the quality of data in both the input and output parameter spaces. Uniformity is used to evaluate data in the input parameter space. Leave-one-out errors and sensitivities are considered to evaluate data in the output parameter space.
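As a rough, hypothetical Python sketch of that workflow, the snippet below draws an initial Latin hypercube design, fits a kriging (Gaussian process) metamodel, and then adds points near the location with the largest leave-one-out error. The test function, kernel choice and infill rule are placeholders; the paper's own criterion also weighs uniformity and sensitivities, which are omitted here.

```python
# Sketch under stated assumptions: LHS initial design + GP ("kriging")
# metamodel + leave-one-out error as a crude adaptive infill criterion.
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_model(x):                    # stand-in for the costly simulation
    return np.sin(8 * x[:, 0]) + x[:, 1] ** 2

sampler = qmc.LatinHypercube(d=2, seed=1)  # initial design in [0, 1]^2
X = sampler.random(n=12)
y = expensive_model(X)
rng = np.random.default_rng(2)

def fit_gp(X, y):
    return GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True).fit(X, y)

for _ in range(10):                        # sequential (adaptive) stage
    # Leave-one-out error: refit without each point and predict it back.
    loo_err = np.array([
        abs(fit_gp(np.delete(X, i, 0), np.delete(y, i))
            .predict(X[i:i + 1])[0] - y[i])
        for i in range(len(X))
    ])
    # Add the next input near the point the metamodel currently fits worst.
    worst = X[np.argmax(loo_err)]
    x_new = np.clip(worst + rng.normal(0, 0.05, size=2), 0, 1)
    X = np.vstack([X, x_new])
    y = np.append(y, expensive_model(x_new[None, :]))

final_gp = fit_gp(X, y)
print("design size:", len(X), " max LOO error at last step:", loo_err.max().round(3))
```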

Findings

This new method has been compared with existing methods to demonstrate its effectiveness in approximation. It has also been compared with existing methods in solving global optimization problems. Finally, an engineering case is used to further verify the method.

Originality/value

This paper provides an effective sequential sampling method for adaptive metamodeling to approximate highly nonlinear relations between input and output parameters.

Details

Engineering Computations, vol. 37 no. 3
Type: Research Article
ISSN: 0264-4401

Keywords

Article
Publication date: 16 December 2019

Titan Ligita, Nichole Harvey, Kristin Wicking, Intansari Nurjannah and Karen Francis

The purpose of this paper is to discuss the practical use of theoretical sampling as a method for selecting data that provide a richer and deeper understanding of the phenomenon…


Abstract

Purpose

The purpose of this paper is to discuss the practical use of theoretical sampling as a method for selecting data that provide a richer and deeper understanding of the phenomenon being investigated.

Design/methodology/approach

Theoretical sampling is a well-known method in grounded theory studies to seek additional data based on concepts developed from initial data analysis. This method involves following where the data have led to expand and refine the evolving theory during the analytical process. However, there is a dearth of information detailing the practical steps needed to undertake theoretical sampling.

Findings

The authors used the theoretical sampling method in their study in four ways: asking additional interview questions and/or widening the scope of existing interview questions; recruiting participants with additional diversity of attributes within the same group; adding a new group of participants; and expanding the research settings.

Originality/value

Theoretical sampling is a valuable and practical method for the purpose of addressing gaps in the data in qualitative research. When using theoretical sampling, it is essential to consider potential strategies for countering challenges that may arise. Practical recommendations are offered on the use of theoretical sampling during data analysis, for the purpose of achieving theoretical integration.

Details

Qualitative Research Journal, vol. 20 no. 1
Type: Research Article
ISSN: 1443-9883

Keywords

Article
Publication date: 21 December 2021

Laouni Djafri

This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other framework. Also, DDPML can be deployed on other distributed systems such as P2P…


Abstract

Purpose

This work can be used as a building block in other settings such as GPU, Map-Reduce, Spark or any other framework. DDPML can also be deployed on other distributed systems such as P2P networks, clusters, cloud computing or other technologies.

Design/methodology/approach

In the age of Big Data, all companies want to benefit from large amounts of data. These data can help them understand their internal and external environment and anticipate associated phenomena, as the data turn into knowledge that can be used for prediction later. This knowledge thus becomes a great asset in companies' hands, and exploiting it is precisely the objective of data mining. With data and knowledge now produced at an ever faster pace, the authors speak of Big Data mining. For this reason, the proposed work mainly aims at solving the problems of volume, veracity, validity and velocity when classifying Big Data using distributed and parallel processing techniques. The problem the authors raise in this work is how machine learning algorithms can be made to work in a distributed and parallel way at the same time without losing the accuracy of the classification results.

To solve this problem, the authors propose a system called Dynamic Distributed and Parallel Machine Learning (DDPML). To build it, the authors divided their work into two parts. In the first, the authors propose a distributed architecture controlled by a Map-Reduce algorithm, which in turn depends on a random sampling technique. The distributed architecture the authors designed is specially directed at handling big data processing and operates in a coherent and efficient manner with the sampling strategy proposed in this work. This architecture also helps the authors verify the classification results obtained using the representative learning base (RLB). In the second part, the authors extract the representative learning base by sampling at two levels using the stratified random sampling method. This sampling method is also applied to extract the shared learning base (SLB) and the partial learning bases for the first level (PLBL1) and the second level (PLBL2). The experimental results show the efficiency of the proposed solution without significant loss in classification results. In practical terms, the DDPML system is generally dedicated to big data mining processing and works effectively in distributed systems with a simple structure, such as client-server networks.
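As a purely illustrative sketch (not the authors' DDPML system), two-level stratified random sampling of the kind mentioned above can be written in a few lines of Python with pandas; the column names, fractions and data are invented for the example.

```python
# Hypothetical two-level stratified random sampling sketch, assuming a
# pandas DataFrame with a class label and a second stratification attribute.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
data = pd.DataFrame({
    "label":  rng.choice(["spam", "ham"], size=10_000, p=[0.2, 0.8]),
    "source": rng.choice(["mobile", "desktop"], size=10_000),
    "x":      rng.normal(size=10_000),
})

def stratified_sample(df, by, frac, seed=0):
    """Draw the same fraction from every stratum defined by column `by`."""
    return df.groupby(by).sample(frac=frac, random_state=seed)

# Level 1: stratify by class label; Level 2: stratify the result by source.
level1 = stratified_sample(data, "label", frac=0.10)
rlb = stratified_sample(level1, "source", frac=0.50)   # a small "representative" base

print(len(rlb), "rows sampled")
print(rlb["label"].value_counts(normalize=True).round(2))  # class shares preserved
```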

Findings

The authors obtained very satisfactory classification results.

Originality/value

The DDPML system is specially designed to handle big data mining classification smoothly.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Keywords

Article
Publication date: 3 April 2018

Michael Link

Researchers now have more ways than ever before to capture information about groups of interest. In many areas, these are augmenting traditional survey approaches – in others, new…


Abstract

Purpose

Researchers now have more ways than ever before to capture information about groups of interest. In many areas, these are augmenting traditional survey approaches – in others, new methods are potential replacements. This paper aims to explore three key trends: the use of nonprobability samples, mobile data collection, and administrative and “big data.”

Design/methodology/approach

Insights and lessons learned about these emerging trends are drawn from recent published articles and relevant scientific conference papers.

Findings

Each new trend has its own timeline in terms of methodological maturity. While mobile technologies for data capture are being rapidly adopted, particularly the use of internet-based surveys conducted on mobile devices, nonprobability sampling methods remain rare in most government research. Resource and quality pressures, combined with the intensive research focus on new sampling methods, are, however, making nonprobability sampling a more attractive option. Finally, exploration of “big data” is becoming more common, although there are still many challenges to overcome – methodological, quality and access – before such data are used routinely.

Originality/value

This paper provides a timely review of recent developments in the field of data collection strategies, drawing on numerous current studies and practical applications in the field.

Details

Quality Assurance in Education, vol. 26 no. 2
Type: Research Article
ISSN: 0968-4883

Keywords

Book part
Publication date: 29 July 2009

Lynn Unruh, C. Allison Russo, H. Joanna Jiang and Carol Stocks

Background – Reliable and valid hospital nurse staffing measures are a major requirement for health services research. As the use of these measures increases, discussion is…

Abstract

Background – Reliable and valid hospital nurse staffing measures are a major requirement for health services research. As the use of these measures increases, discussion is growing as to whether current nurse staffing measures adequately meet the needs of health services researchers.

Objective – This study assesses whether the measures, sampling frameworks, and data sources meet the needs of health services research in areas such as staffing assessment; patient, nurse, and financial outcomes; and prediction of staffing.

Methods – We performed a systematic review of articles from 1990 through 2007 that use hospital nurse staffing measures in original research or that address the validity, reliability, and availability of the measures. Taxonomies of measures, sampling frameworks, and sources were developed. Articles were analyzed to assess which measures, sampling strategies, and sources of data were used and to ascertain whether the measures, samples, and sources meet the needs of researchers.

Results – The review identified 107 articles that use hospital nurse staffing measures for original research. Multiple types of measures, some of which are used more often than others and some of which are more valid than others, exist in each of the following categories: staffing counts, staffing/patient load ratios, and skill mix. Sampling frameworks range from hospital units to all hospitals nationally, with all hospitals in a state being the most common. Data sources range from small-scale surveys to national databases. The American Hospital Association Annual Survey is the most frequently used data source, but there are limitations with its nurse staffing measures. Arguably, the multiplicity of measures and differences in sampling and data sources are due, in part, to data availability. The limitations noted by other researchers and by this review indicate that staffing measures need improvements in conceptualization, content, scope, and availability.

Discussion – Recommendations are made for improvements to research and administrative practice and to data.

Details

Biennial Review of Health Care Management: Meso Perspective
Type: Book
ISBN: 978-1-84855-673-7

Book part
Publication date: 9 May 2012

Caroline O. Ford and William R. Pasewark

We conduct an experiment to analyze the impact of a well-established psychological construct, need for cognition, in an audit-related decision context. By simulating a basic audit…

Abstract

We conduct an experiment to analyze the impact of a well-established psychological construct, need for cognition, in an audit-related decision context. By simulating a basic audit sampling task, we determine whether the desire to engage in a cognitive process influences decisions made during that task. Specifically, we investigate whether an individual's need for cognition influences the quantity of data collected, the revision of a predetermined sampling plan, and the time taken to make a decision. Additionally, we examine the impact of cost constraints during the decision-making process.

Contrary to results in previous studies, we find that those with a higher need for cognition sought less data than those with a lower need for cognition when making an audit sampling decision. In addition, we find that the need for cognition had no relationship to sampling plan revisions or the time needed to make an audit sampling decision. Previous studies regarding the need for cognition did not utilize incremental costs for additional decision-making information. Potentially, these costs provided cognitive challenges that influenced decision outcomes.

Details

Advances in Accounting Behavioral Research
Type: Book
ISBN: 978-1-78052-758-1

Book part
Publication date: 10 April 2019

Luc Clair

Applied econometric analysis is often performed using data collected from large-scale surveys. These surveys use complex sampling plans in order to reduce costs and increase the…

Abstract

Applied econometric analysis is often performed using data collected from large-scale surveys. These surveys use complex sampling plans in order to reduce costs and increase the estimation efficiency for subgroups of the population. These sampling plans result in unequal inclusion probabilities across units in the population. The purpose of this paper is to derive the asymptotic properties of a design-based nonparametric regression estimator under a combined inference framework. The nonparametric regression estimator considered is the local constant estimator. This work contributes to the literature in two ways. First, it derives the asymptotic properties for the multivariate mixed-data case, including the asymptotic normality of the estimator. Second, I use least squares cross-validation for selecting the bandwidths for both continuous and discrete variables. I run Monte Carlo simulations designed to assess the finite-sample performance of the design-based local constant estimator versus the traditional local constant estimator for three sampling methods, namely, simple random sampling, exogenous stratification and endogenous stratification. Simulation results show that the estimator is consistent and that efficiency gains can be achieved by weighting observations by the inverse of their inclusion probabilities if the sampling is endogenous.
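For orientation, one standard way to write a design-based (survey-weighted) local constant estimator is sketched below; the notation is generic and the chapter's exact formulation may differ.

```latex
% Generic design-based Nadaraya--Watson (local constant) estimator with
% inverse inclusion probability weights; bandwidth h chosen by least
% squares cross-validation.
\[
  \hat{m}(x) \;=\;
  \frac{\sum_{i \in s} \pi_i^{-1}\, K_h(X_i - x)\, Y_i}
       {\sum_{i \in s} \pi_i^{-1}\, K_h(X_i - x)} ,
\]
% where s is the sample, \pi_i is unit i's inclusion probability and
% K_h(u) = K(u/h)/h is a kernel. Setting \pi_i^{-1} \equiv 1 recovers the
% traditional (unweighted) local constant estimator.
```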

Details

The Econometrics of Complex Survey Data
Type: Book
ISBN: 978-1-78756-726-9

Keywords

Article
Publication date: 5 August 2014

Ralf T. Münnich and Jan Georg Seger

The purpose of this study is to show the importance of adequately considering quality measures when using composite indicators (CIs). Policy support often relies on high…

Abstract

Purpose

The purpose of this study is to show the importance of adequately considering quality measures when using composite indicators (CIs). Policy support often relies on high-quality indicators. Often, the underlying data for the relevant indicators come mainly from sample surveys. The reliability of the indicators then heavily depends on the sampling design and other quality aspects.

Design/methodology/approach

Starting from the well-known work on sensitivity analysis of indicators, this study integrates the sampling process as an additional source of variability. The methodology is evaluated in a close-to-reality simulation environment using relevant and important surveys with different sampling designs. As an example, this study uses data related to the statistics of income and living conditions (SILC). The study is based on a design-based simulation framework.
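The general shape of such a design-based simulation can be sketched in a few lines of Python; everything below (indicator data, normalisation and weighting choices, sample size, number of replications) is invented for illustration and is not the authors' SILC study.

```python
# Hypothetical sketch: draw repeated samples from a synthetic population,
# build a composite indicator under two normalisations and two weightings,
# and look at how much of the spread comes from sampling vs. method choice.
import itertools
import numpy as np

rng = np.random.default_rng(0)
population = rng.lognormal(mean=[0.0, 0.5, 1.0], sigma=0.4, size=(100_000, 3))

def normalise(X, how):
    if how == "minmax":
        return (X - X.min(0)) / (X.max(0) - X.min(0))
    return (X - X.mean(0)) / X.std(0)             # z-score

weightings = {"equal": np.full(3, 1 / 3), "expert": np.array([0.5, 0.3, 0.2])}

cells = {}                                        # (normalisation, weighting) -> CI means
for rep in range(200):                            # the sampling process
    sample = population[rng.choice(len(population), size=500, replace=False)]
    for how, (wname, w) in itertools.product(("minmax", "zscore"), weightings.items()):
        ci = normalise(sample, how) @ w           # one CI value per sampled unit
        cells.setdefault((how, wname), []).append(ci.mean())

for key, vals in cells.items():
    vals = np.array(vals)
    # within-cell sd reflects sampling variability; differences between cell
    # means reflect the normalisation and weighting choices
    print(key, "mean CI:", vals.mean().round(3), " sampling sd:", vals.std().round(3))
```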

Findings

In general, the normalisation method dominates as a source of the total variance of the CI. In our study, we show that the sampling process also becomes rather relevant and generally dominates the influence of different weighting methods. We show that in some scenarios approximately 40 per cent of the variability in the sensitivity analysis comes from the sampling process. The quality of rankings derived from CIs then suffers considerably from the sampling design. When using data sources of differing quality, e.g. in regional comparisons, one may expect some cases with biased CI values, which may become useless for applications.

Research limitations/implications

The impact of sampling heavily depends on the data gathering process. In the case of sample data, the sampling design plays an important role. However, the design effect still depends on the variables taken into account and has to be considered carefully.

Practical implications

The findings show the importance of considering the quality framework of the European Code of Practice also for CIs. This additional information should help users understand possible over- or misinterpretations of CIs, especially when deriving rankings from the indicators. Specialised statistical methods should be integrated in future research, particularly when focusing on regional indicators.

Originality/value

CIs are often used for policy monitoring. In general, the data gathering process is not considered adequately by end users. This becomes especially important when one is interested in regional indicators. The present paper shows possible implications of sampling designs for CI outcomes, with a focus on comparative studies.

Details

Sustainability Accounting, Management and Policy Journal, vol. 5 no. 3
Type: Research Article
ISSN: 2040-8021

Keywords
