Search results

1 – 10 of over 1000
Article
Publication date: 24 August 2018

Jewoo Kim and Jongho Im

Abstract

Purpose

The purpose of this paper is to introduce a new multiple imputation method that can effectively manage missing values in online review data, thereby allowing the online review analysis to yield valid results by using all available data.

Design/methodology/approach

This study develops a missing data method based on multivariate imputation by chained equations to generate imputed values for online reviews. Sentiment analysis is used to incorporate customers’ textual opinions as auxiliary information in the imputation procedures. To check the validity of the proposed imputation method, the authors apply it to missing values of sub-ratings on hotel attributes in both simulated and real Honolulu hotel review data sets. The estimation results are compared to those of two other missing data techniques, namely, listwise deletion and conventional multiple imputation, which does not consider text reviews.

Findings

The findings from the simulation analysis show that the authors' imputation method produces more efficient and less biased estimates than the other two missing data techniques when text reviews are possibly associated with the rating scores and the response mechanism. When the imputation method is applied to the real hotel review data, the findings show that the text sentiment-based propensity score can effectively explain the missingness of sub-ratings on hotel attributes, and that the imputation method considering those propensity scores yields better estimation results than the other techniques, as in the simulation analysis.

Originality/value

This study extends multiple imputation to online data, considering its spontaneous and unstructured nature. This new method helps make fuller use of the observed online data while avoiding potential missing data problems.

Details

International Journal of Contemporary Hospitality Management, vol. 30 no. 11
Type: Research Article
ISSN: 0959-6119

Book part
Publication date: 23 November 2011

Gayaneh Kyureghian, Oral Capps and Rodolfo M. Nayga

Abstract

The objective of this research is to examine, validate, and recommend techniques for handling the problem of missingness in observational data. We use a rich observational data set, the Nielsen HomeScan data set, which allows us to effectively combine elements from simulated data sets: large numbers of observations, large number of data sets and variables, allowing elements of “design” that typically come with simulated data, and its observational nature. We created random 20% and 50% uniform missingness in our data sets and employed several widely used methods of single imputation, such as mean, regression, and stochastic regression imputations, and multiple imputation methods to fill in the data gaps. We compared these methods by measuring the error of predicting the missing values and the parameter estimates from the subsequent regression analysis using the imputed values. We also compared coverage or the percentages of intervals that covered the true parameter in both cases. Based on our results, the method of single regression or conditional mean imputation provided the best predictions of the missing price values with 28.34 and 28.59 mean absolute percent errors in 20% and 50% missingness settings, respectively. The imputation from conditional distribution method had the best rate of coverage. The parameter estimates based on data sets imputed by conditional mean method were consistently unbiased and had the smallest standard deviations. The multiple imputation methods had the best coverage of both the parameter estimates and predictions of the dependent variable.
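The comparison of single imputation methods described above can be illustrated with a minimal sketch. It does not use the Nielsen HomeScan data or reproduce the authors' results: the synthetic `price` variable, the single predictor `x`, and the helper names `fit_line` and `mape` are all assumptions made for illustration. The sketch removes 20% of the values uniformly at random, fills them by mean, regression, and stochastic regression imputation, and scores each method by mean absolute percent error on the removed values.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd

def mape(true_vals, imputed_vals):
    """Mean absolute percent error of imputed values against the truth."""
    return 100 * statistics.fmean(
        abs(t - p) / abs(t) for t, p in zip(true_vals, imputed_vals)
    )

rng = random.Random(0)
n = 2000
# Synthetic data (an assumption): price depends linearly on x plus noise.
x = [rng.uniform(1, 10) for _ in range(n)]
price = [5 + 2 * xi + rng.gauss(0, 1) for xi in x]

# Delete 20% of the prices uniformly at random.
missing = set(rng.sample(range(n), k=n // 5))
obs = [i for i in range(n) if i not in missing]
mis = sorted(missing)

# Fit the imputation model on the observed cases only.
a, b, sd = fit_line([x[i] for i in obs], [price[i] for i in obs])
obs_mean = statistics.fmean(price[i] for i in obs)

truth = [price[i] for i in mis]
imputations = {
    "mean": [obs_mean for _ in mis],
    "regression": [a + b * x[i] for i in mis],
    # Stochastic regression adds a random residual to the conditional mean.
    "stochastic regression": [a + b * x[i] + rng.gauss(0, sd) for i in mis],
}
errors = {name: mape(truth, vals) for name, vals in imputations.items()}
for name, err in errors.items():
    print(f"{name}: MAPE = {err:.2f}%")
```

In this toy setting the conditional mean predicts individual values most closely. The added noise in stochastic regression raises prediction error, but it is there to preserve the variability of the imputed variable, which matters for interval coverage rather than for point prediction.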

Details

Missing Data Methods: Cross-sectional Methods and Applications
Type: Book
ISBN: 978-1-78052-525-9

Article
Publication date: 9 May 2016

Sanna Sintonen, Anssi Tarkiainen, John W. Cadogan, Olli Kuivalainen, Nick Lee and Sanna Sundqvist

Abstract

Purpose

The purpose of this paper is to focus on the case where – by design – one needs to impute cross-country cross-survey (CCCS) data (a situation typical, for example, among multinational firms that need to carry out comparative marketing surveys with respondents located in several countries). Importantly, while some work demonstrates approaches for single-item direct measures, no prior research has examined the common situation in international marketing where the researcher needs to use multi-item scales of latent constructs. The paper presents problem areas related to the choices international marketers have to make when doing cross-country/cross-survey research and provides guidance for future research.

Design/methodology/approach

A multi-country sample of real international entrepreneurial orientation (IEO) data (292 New Zealand exporters and 302 Finnish ones) is used as an example of cross-sample imputation. Three variations of the input data are tested: first, imputation based on all the data available for the measurement model; second, imputation based on the set of items determined by the invariance structure of the joint items shared across the two groups; and third, imputation based both on examination of the invariance structures of the joint items and on the performance of the measurement model in the group where the full data was originally available.

Findings

Based on distribution comparisons, imputation for New Zealand after completing the measurement model with Finnish data (Model C) gave the most promising results. Consequently, using knowledge of between-country measurement qualities may improve the imputation results, but this benefit comes with a downside since it simultaneously reduces the amount of data used for imputation. However, none of the imputation models leads to the same statistical inferences about covariances between latent constructs as the original full data.

Research limitations/implications

Considering multiple imputation, the present exploratory study suggests that there are several concerns and issues that should be taken into account when planning CCCSs (or split questionnaire or sub-sampling designs). Even if there are several advantages available for well-implemented CCCS designs such as shorter questionnaires and improved response rates, these concerns lead us to question the appropriateness of the CCCS approach in general, due to the need to impute across the samples.

Originality/value

The combination of cross-country and cross-survey approaches is novel to international marketing, and it is not known how the different procedures utilized in imputation affect the results and their validity and reliability. The authors demonstrate the consequences of the various imputation strategy choices by using a real example of a two-country sample. The exploration may have significant implications for international marketing researchers, and the paper offers a stimulus for further research in the area.

Details

International Marketing Review, vol. 33 no. 3
Type: Research Article
ISSN: 0265-1335

Article
Publication date: 1 November 2003

Marvin L. Brown and John F. Kros

Abstract

The actual data mining process deals significantly with prediction, estimation, classification, pattern recognition and the development of association rules. Therefore, the significance of the analysis depends heavily on the accuracy of the database and on the chosen sample data to be used for model training and testing. Data mining is based upon searching the concatenation of multiple databases that usually contain some amount of missing data along with a variable percentage of inaccurate data, pollution, outliers and noise. The issue of missing data must be addressed since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions. The objective of this research is to address the impact of missing data on the data mining process.

Details

Industrial Management & Data Systems, vol. 103 no. 8
Type: Research Article
ISSN: 0263-5577

Article
Publication date: 12 October 2020

Ibrahim Said Ahmad, Azuraliza Abu Bakar, Mohd Ridzwan Yaakub and Mohammad Darwich

Abstract

Purpose

Sequel movies are very popular; however, there are limited studies on sequel movie revenue prediction. The purpose of this paper is to propose a sentiment analysis based model for sequel movie revenue prediction and to propose a missing value imputation method for the sequel revenue prediction dataset.

Design/methodology/approach

A sequel of a successful movie will most likely also be successful. Therefore, we propose a supervised learning approach in which data are created from sequel movies to predict the box-office revenue of an upcoming sequel. The algorithms used in the prediction are multiple linear regression, support vector machine and multilayer perceptron neural network.

Findings

The results show that using four sequel movies in a franchise to predict the box-office revenue of a fifth sequel achieved better prediction than using three sequels, which was also better than using two sequel movies.

Research limitations/implications

The model produced will be beneficial to movie producers and other stakeholders in the movie industry in deciding the viability of producing a movie sequel.

Originality/value

Previous studies do not give priority to sequel movies in movie revenue prediction. Additionally, a new missing value imputation method was introduced. Finally, a sequel movie revenue prediction dataset was prepared.

Details

Data Technologies and Applications, vol. 54 no. 5
Type: Research Article
ISSN: 2514-9288

Book part
Publication date: 29 September 2023

Torben Juul Andersen

Abstract

This chapter outlines how the comprehensive North American and European datasets were collected and explains the ensuing data cleaning process, outlining three alternative methods applied to deal with missing values: the complete case, multiple imputation (MI), and K-nearest neighbor (KNN) methods. The complete case method is the conventional approach adopted in many mainstream management studies. We further discuss the implied assumption underlying use of this technique, which is rarely assessed or tested in practice, and explain the alternative imputation approaches in detail. Use of North American data is the norm, but we also collected a European dataset, which is rarely done, to enable subsequent comparative analysis between these geographical regions. We introduce the structure of firms organized within different industry classification schemes for use in the ensuing comparative analyses and provide base information on missing values in the original and cleaned datasets. The calculated performance indicators derived from the sampled data are defined and presented. We show how the three alternative approaches considered to deal with missing values have significantly different effects on the calculated performance measures in terms of extreme estimate ranges and mean performance values.
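To make the contrast between the complete case and KNN approaches concrete, here is a minimal sketch on a tiny made-up table. The data, the function name `knn_impute`, the choice of `k=2`, and the distance rule (root mean squared difference over the columns both rows have observed) are all illustrative assumptions, not the chapter's actual procedure or datasets.

```python
import math

def knn_impute(rows, k=2):
    """Fill each None with the mean of that column over the k nearest donors."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, val in enumerate(row):
            if val is not None:
                continue
            donors = []
            for d, other in enumerate(rows):
                # A donor must be a different row with column j observed.
                if d == i or other[j] is None:
                    continue
                shared = [
                    c for c in range(len(row))
                    if c != j and row[c] is not None and other[c] is not None
                ]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                donors.append((dist, other[j]))
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical firm-level records: two indicators, some values missing.
rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, None],
    [8.0, 80.0],
    [9.0, 90.0],
    [None, 85.0],
]

# Complete case analysis keeps only rows with no missing values.
complete = [r for r in rows if None not in r]
filled = knn_impute(rows, k=2)

print("complete cases kept:", len(complete), "of", len(rows))
print("KNN-filled rows:", filled)
```

Complete case analysis discards a third of this toy sample, while KNN keeps every row and borrows values from the most similar records. Summary statistics computed on the two resulting tables can therefore differ.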

Details

A Study of Risky Business Outcomes: Adapting to Strategic Disruption
Type: Book
ISBN: 978-1-83797-074-2

Details

Transport Survey Quality and Innovation
Type: Book
ISBN: 978-0-08-044096-5

Book part
Publication date: 10 April 2019

Details

The Econometrics of Complex Survey Data
Type: Book
ISBN: 978-1-78756-726-9

Article
Publication date: 20 May 2022

Cecilia M. Votta and Patricia J. Deldin

Abstract

Purpose

The purpose of this paper is to test a mental wellness intervention, Mood Lifters (ML), that addresses significant barriers to mental health care. ML includes adults over 18 struggling with mental wellness or any life difficulties, except those with active suicidality, mania and psychosis, and addresses barriers to care using peer leaders in a manualized group format with a gamified point system.

Design/methodology/approach

Participants were recruited using online postings. Those eligible (76% female, 80% white) were randomly assigned to professional-led groups (N = 30), peer-led groups (N = 33) or a waitlist (N = 22; i.e. attended assigned condition if available). Participants completed pre- and postgroup measures (including the Patient Health Questionnaire-9, Generalized Anxiety Disorder-7 and Perceived Stress Scale), attended 15 weekly meetings and tracked “points” or at-home skills practice. Multiple imputation was used to account for attrition. Linear regressions were analyzed to determine the program’s impact on anxiety and depressive symptoms and perceived stress. Further analyses included comparisons between peer- and professional-led groups.

Findings

Participants in ML experienced significant reductions in anxiety symptoms. Completing more homework across the program led to significant reductions in anxiety and perceived stress. Finally, there were no significant differences in attendance, homework completed or outcomes between peer- and professional-led groups.

Practical implications

Overall, participation in the ML program led to reduced anxiety symptoms, and for those who completed more homework, reduced perceived stress. More accessible programs can make a significant impact on symptoms and are critical to address the overburdened care system. Additionally, there were no differences between leader types indicating that peers may be an effective way to address accessibility concerns.

Originality/value

ML is unique for three reasons: it takes a biopsychosocial/Research Domain Criteria approach to mental wellness (i.e. incorporates many areas relevant to mental health, does not focus on a specific diagnosis), overcomes major barriers to mental health care and uses a peer-delivery model. These attributes, taken together with the results of this study, present a care alternative for those with less access.

Details

Mental Health Review Journal, vol. 27 no. 4
Type: Research Article
ISSN: 1361-9322

Article
Publication date: 5 March 2018

Sari Mansour and Diane-Gabrielle Tremblay

Abstract

Purpose

The purpose of this paper is to examine a multidimensional mediating model of psychosocial safety climate (PSC) and work-family interference. More precisely, it tests the direct and indirect effects of PSC on work-family conflict (WFC)/family-work conflict (FWC)-time and WFC/FWC-strain via family-supportive supervisor behavior (FSSB).

Design/methodology/approach

The structural equation method was used to test the direct effect of PSC on WFC/FWC time and strain. The mediation effects were tested by the method of indirect effects using a bootstrap analysis (Preacher and Hayes, 2004) with 3,000 replications and a 95% confidence interval. The statistical treatments were carried out with the AMOS software V.22.
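The bootstrap test of an indirect effect can be sketched as follows. This is not the authors' AMOS analysis: it assumes a single-mediator regression model on simulated data, a simple percentile interval, and hypothetical names (`x` standing in for PSC, `m` for FSSB, `y` for an outcome such as WFC-time). The indirect effect is the product of the slope of `m` on `x` and the slope of `y` on `m` controlling for `x`, recomputed on 3,000 resamples.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y):
    """a*b: a is the slope of m on x, b the slope of y on m given x."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

rng = random.Random(42)
n = 200
# Simulated data (an assumption): the true indirect effect is 0.5 * 0.4 = 0.2.
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.4 * mi + 0.2 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)

# Resample cases with replacement and recompute the indirect effect.
boot = []
for _ in range(3000):
    idx = [rng.randrange(n) for _ in range(n)]
    boot.append(indirect_effect(
        [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]
    ))
boot.sort()
lo = boot[int(0.025 * len(boot))]
hi = boot[int(0.975 * len(boot)) - 1]

print(f"indirect effect = {point:.3f}, 95% percentile CI = [{lo:.3f}, {hi:.3f}]")
```

An interval that excludes zero is usually read as evidence of mediation. Bias-corrected intervals are a common alternative to the percentile interval used here.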

Findings

The results show that PSC is negatively and directly related to WFC-time, FWC-time, WFC-strain and FWC-strain. In addition, the bootstrap analyses indicate that PSC is related indirectly to WFC-time, FWC-time, WFC-strain and FWC-strain via FSSB.

Practical implications

WFC is a workplace issue that warrants intervention in order to reduce organizational costs and increase worker well-being, and PSC should be considered an appropriate target for intervention (Dollard et al., 2012). However, although this management tool can be useful to reduce FWC, it is more appropriate for decreasing WFC. Employers and HR managers should understand from the findings not only the importance of PSC, but also that not all employees have the same problems, depending, for example, on their level of responsibilities at home. Hence, they should offer the appropriate resources according to the needs of workers. Indeed, the implementation of a single work-family measure may not be appropriate for all workers, and it is important that employers and HR managers understand the details of WFC and FWC, as well as the possible effects of a series of different variables, in order to design the best work-family programs.

Originality/value

This research examined the effects of two new and specific resources at work, namely PSC and FSSB, on WFC and FWC (time and strain), as recommended by Kossek et al. (2011). In addition, this study tested a new multidimensional mediating model which examined the mediating role of FSSB between PSC and time- and strain-based WFC and FWC. To the authors’ knowledge, this is the first study to examine these relations. Moreover, the test of the concept of PSC in this study provides support for the theory of conservation of resources and proposes an extension of this theory.
