Stages and Methods for Cleaning Large Secondary Data Using R

Methodological Issues in Management Research: Advances, Challenges, and the Way Ahead

ISBN: 978-1-78973-974-9, eISBN: 978-1-78973-973-2

Publication date: 11 November 2019

Abstract

Data, whether primary or secondary, are the core strength of quantitative research. However, there is a significant difference between the data as collected and the final, researchable data. Data collection is driven by the objectives of the research, and the data may exist in various formats across different sources. In their original form, the collected data may contain systematic and random errors. Such errors need to be removed from the data, a process termed data cleaning.

The present chapter discusses the methodologies and steps that may help in fine-tuning the data into a researchable format. The discussion is illustrated by applying these methodologies to a set of financial data on companies listed on the Bombay Stock Exchange. The steps involved in transforming collected data into researchable data are presented. A schematic model covering data collection, data cleaning, working with variables, outlier treatment, testing the assumptions of statistical tests, normality, and heteroscedasticity is presented for the benefit of research scholars. Beyond this generic model, the chapter focuses exclusively on financial data of companies listed on the Bombay Stock Exchange. The challenges posed by the various sources, data gathering, and other pre-analysis stages are also considered. The approach is also applicable to research based on secondary data sources in other fields.

Citation

Jena, M.K. and Kar, B. (2019), "Stages and Methods for Cleaning Large Secondary Data Using R", Subudhi, R.N. and Mishra, S. (Ed.) Methodological Issues in Management Research: Advances, Challenges, and the Way Ahead, Emerald Publishing Limited, Leeds, pp. 285-304. https://doi.org/10.1108/978-1-78973-973-220191018

Publisher

Emerald Publishing Limited

Copyright © 2020 Emerald Publishing Limited