Quality of Big Data in health care
International Journal of Health Care Quality Assurance
ISSN: 0952-6862
Article publication date: 13 July 2015
Abstract
Purpose
The current trend in Big Data analytics and in particular health information technology is toward building sophisticated models, methods and tools for business, operational and clinical intelligence. However, the critical issue of data quality required for these models is not getting the attention it deserves. The purpose of this paper is to highlight the issues of data quality in the context of Big Data health care analytics.
Design/methodology/approach
The insights presented in this paper are the results of analytics work done in different organizations on a variety of health data sets. The data sets include Medicare and Medicaid claims, provider enrollment data sets from both public and private sources, and electronic health records from regional health centers accessed through partnerships with health care claims processing entities under health privacy protected guidelines.
Findings
Assessment of data quality in health care has to consider: first, the entire lifecycle of health data; second, problems arising from errors and inaccuracies in the data itself; third, the source(s) and the pedigree of the data; and fourth, how the underlying purpose of data collection impacts the analytic processing and the knowledge expected to be derived. Automation in the form of data handling, storage, entry and processing technologies is to be viewed as a double-edged sword. At one level, automation can be a good solution, while at another level it can create a different set of data quality issues. Implementation of health care analytics with Big Data is enabled by a road map that addresses the organizational and technological aspects of data quality assurance.
Practical implications
The value derived from the use of analytics should be the primary determinant of data quality. Based on this premise, health care enterprises embracing Big Data should have a road map for a systematic approach to data quality. Health care data quality problems can be so specific that organizations might have to build their own custom software or data quality rule engines.
Originality/value
Today, data quality issues are diagnosed and addressed in a piecemeal fashion. The authors recommend a data lifecycle approach and provide a road map that is better suited to the dimensions of Big Data and fits different stages in the analytical workflow.
Acknowledgements
This paper has been co-authored by employees of UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy. Accordingly, the US Government retains and the publisher, by accepting the paper for publication, acknowledges that the US Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Citation
Sukumar, S.R., Natarajan, R. and Ferrell, R.K. (2015), "Quality of Big Data in health care", International Journal of Health Care Quality Assurance, Vol. 28 No. 6, pp. 621-634. https://doi.org/10.1108/IJHCQA-07-2014-0080
Publisher
Emerald Group Publishing Limited
Copyright © 2015, Company