Quality of Big Data in health care

Sreenivas R. Sukumar (Computational Sciences and Engineering Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA)
Ramachandran Natarajan (Department of Decision Sciences and Management, Tennessee Technological University, Cookeville, TN, USA)
Regina K. Ferrell (Electrical and Electronics Systems Research Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA)

International Journal of Health Care Quality Assurance

ISSN: 0952-6862

Article publication date: 13 July 2015

Abstract

Purpose

The current trend in Big Data analytics and in particular health information technology is toward building sophisticated models, methods and tools for business, operational and clinical intelligence. However, the critical issue of data quality required for these models is not getting the attention it deserves. The purpose of this paper is to highlight the issues of data quality in the context of Big Data health care analytics.

Design/methodology/approach

The insights presented in this paper are the results of analytics work done in different organizations on a variety of health data sets. The data sets include Medicare and Medicaid claims, provider enrollment data sets from both public and private sources, and electronic health records from regional health centers, accessed through partnerships with health care claims processing entities under health privacy protection guidelines.

Findings

Assessment of data quality in health care has to consider: first, the entire lifecycle of health data; second, problems arising from errors and inaccuracies in the data itself; third, the source(s) and the pedigree of the data; and fourth, how the underlying purpose of data collection affects the analytic processing and the knowledge expected to be derived. Automation in the form of data handling, storage, entry and processing technologies should be viewed as a double-edged sword. At one level, automation can be a good solution, while at another level it can create a different set of data quality issues. Implementation of health care analytics with Big Data is enabled by a road map that addresses the organizational and technological aspects of data quality assurance.

Practical implications

The value derived from the use of analytics should be the primary determinant of data quality. Based on this premise, health care enterprises embracing Big Data should have a road map for a systematic approach to data quality. Health care data quality problems can be so specific that organizations might have to build their own custom software or data quality rule engines.
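As an illustration of what a custom data quality rule engine might look like, the following is a minimal sketch and not the authors' implementation. The record fields (`provider_id`, `billed_amount`, `service_date`, `submitted_date`), the three rules and the sample records are all hypothetical, chosen only to resemble claims data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A named check; `check` returns True when a record passes."""
    name: str
    check: Callable[[Record], bool]


def run_rules(records: List[Record], rules: List[Rule]) -> Dict[str, List[int]]:
    """Return, for each rule name, the indices of the records that fail it."""
    failures: Dict[str, List[int]] = {rule.name: [] for rule in rules}
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                failures[rule.name].append(index)
    return failures


# Hypothetical rules for a claims-like data set (assumptions, not from the paper).
RULES = [
    # The provider identifier must be present and non-empty.
    Rule("provider_id_present", lambda r: bool(r.get("provider_id"))),
    # The billed amount must be a number and must not be negative.
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("billed_amount"), (int, float))
        and r["billed_amount"] >= 0,
    ),
    # A claim cannot be submitted before the service took place.
    Rule(
        "service_before_submission",
        lambda r: r.get("service_date") is not None
        and r.get("submitted_date") is not None
        and r["service_date"] <= r["submitted_date"],
    ),
]

# Made-up sample records: the first is clean, the second violates every rule.
SAMPLE = [
    {
        "provider_id": "P001",
        "billed_amount": 120.0,
        "service_date": date(2014, 1, 5),
        "submitted_date": date(2014, 1, 9),
    },
    {
        "provider_id": "",
        "billed_amount": -5.0,
        "service_date": date(2014, 2, 1),
        "submitted_date": date(2014, 1, 20),
    },
]

if __name__ == "__main__":
    for rule_name, failed in run_rules(SAMPLE, RULES).items():
        print(f"{rule_name}: {len(failed)} failing record(s) at indices {failed}")
```

Keeping each rule as a small named function is one way to let domain-specific checks be added or retired without changing the engine itself.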

Originality/value

Today, data quality issues are diagnosed and addressed in a piecemeal fashion. The authors recommend a data lifecycle approach and provide a road map that is better suited to the dimensions of Big Data and fits the different stages of the analytical workflow.

Acknowledgements

This paper has been co-authored by employees of UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy. Accordingly, the US Government retains and the publisher, by accepting the paper for publication, acknowledges that the US Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Citation

Sukumar, S.R., Natarajan, R. and Ferrell, R.K. (2015), "Quality of Big Data in health care", International Journal of Health Care Quality Assurance, Vol. 28 No. 6, pp. 621-634. https://doi.org/10.1108/IJHCQA-07-2014-0080

Publisher

Emerald Group Publishing Limited

Copyright © 2015, Company
