This chapter outlines how the comprehensive North American and European datasets were collected and explains the ensuing data cleaning process, including three alternative methods applied to deal with missing values: the complete case, multiple imputation (MI), and K-nearest neighbor (KNN) methods. The complete case method is the conventional approach adopted in many mainstream management studies. We discuss the implied assumption underlying the use of this technique, which is rarely assessed or tested in practice, and explain the alternative imputation approaches in detail. Using North American data is the norm, but we also collected a European dataset, which is rarely done, to enable subsequent comparative analysis between the two geographical regions. We introduce the structure of firms organized within different industry classification schemes for use in the ensuing comparative analyses and provide base information on missing values in the original and cleaned datasets. The calculated performance indicators derived from the sampled data are defined and presented. We show that the three alternative approaches to dealing with missing values have significantly different effects on the calculated performance measures in terms of extreme estimate ranges and mean performance values.
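As a rough illustration of how the three approaches to missing values differ, the sketch below applies each to a small made-up sample. It is a minimal sketch under stated assumptions, not the procedure used in the chapter: the data, the column meanings, and the function names (`complete_case`, `knn_impute`, `multiple_impute_mean`) are hypothetical, and the MI routine is a deliberately simplified stand-in that draws missing values from a normal distribution fitted to the observed values of one column, whereas a full MI procedure would condition on the other variables.

```python
import math
import random
import statistics

# Hypothetical toy sample: (return on assets, leverage); None marks a missing value.
rows = [
    (0.10, 0.30),
    (0.12, 0.35),
    (None, 0.32),
    (0.02, 0.80),
    (0.04, None),
    (0.11, 0.33),
]


def complete_case(data):
    """Complete case method: drop every row with at least one missing value."""
    return [r for r in data if None not in r]


def knn_impute(data, k=2):
    """Fill each missing value with the column mean of the k nearest complete rows.

    Distance is Euclidean, measured only on the columns observed in the incomplete row.
    """
    donors = complete_case(data)
    filled = []
    for r in data:
        if None not in r:
            filled.append(r)
            continue
        obs = [j for j, v in enumerate(r) if v is not None]
        nearest = sorted(
            donors,
            key=lambda d: math.dist([r[j] for j in obs], [d[j] for j in obs]),
        )[:k]
        filled.append(
            tuple(
                v if v is not None else statistics.mean(d[j] for d in nearest)
                for j, v in enumerate(r)
            )
        )
    return filled


def multiple_impute_mean(data, col, m=20, seed=0):
    """Simplified MI stand-in: estimate a column mean from m imputed datasets.

    Each missing value is drawn from a normal distribution fitted to the observed
    values of that column; the m resulting means are pooled by averaging.
    """
    rng = random.Random(seed)
    observed = [r[col] for r in data if r[col] is not None]
    mu, sd = statistics.mean(observed), statistics.stdev(observed)
    estimates = []
    for _ in range(m):
        column = [r[col] if r[col] is not None else rng.gauss(mu, sd) for r in data]
        estimates.append(statistics.mean(column))
    return statistics.mean(estimates)


# Compare the mean of the first column under each approach.
cc_mean = statistics.mean(r[0] for r in complete_case(rows))
knn_mean = statistics.mean(r[0] for r in knn_impute(rows))
mi_mean = multiple_impute_mean(rows, col=0)
print(cc_mean, knn_mean, mi_mean)
```

Even on this toy sample the complete case method discards a third of the rows, while the two imputation routines keep all of them, which is one reason the resulting means and ranges can diverge.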
Andersen, T.J. (2023), "Collecting the Data
Emerald Publishing Limited
Copyright © 2023 Torben Juul Andersen