HIOC: a hybrid imputation method to predict missing values in medical datasets

Pooja Rani (MMEC, Maharishi Markandeshwar (Deemed to be University), Mullana, India) (MMICT&BM, Maharishi Markandeshwar (Deemed to be University), Mullana, India)
Rajneesh Kumar (CSE, MMEC, Maharishi Markandeshwar (Deemed to be University), Mullana, India)
Anurag Jain (Virtualization Department, School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India)

International Journal of Intelligent Computing and Cybernetics

ISSN: 1756-378X

Article publication date: 12 August 2021

Issue publication date: 4 October 2021

Abstract

Purpose

Decision support systems built on machine learning classifiers have become a valuable tool for predicting various diseases. However, the performance of these systems is adversely affected by missing values in medical datasets. Imputation methods are used to predict these missing values. In this paper, a new imputation method called hybrid imputation optimized by the classifier (HIOC) is proposed to predict missing values efficiently.

Design/methodology/approach

The proposed HIOC uses a classifier to combine multivariate imputation by chained equations (MICE), K nearest neighbor (KNN), mean and mode imputation methods in an optimum way. The performance of HIOC has been compared with that of the MICE, KNN, mean and mode methods. Four classifiers, namely support vector machine (SVM), naive Bayes (NB), random forest (RF) and decision tree (DT), have been used to evaluate the performance of the imputation methods.
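
The abstract does not spell out how the classifier combines the individual imputers, so the following is only a minimal sketch of one possible reading: for each column with missing values, try each candidate imputer and keep the one that gives the best downstream classification accuracy. Everything here is an assumption made for illustration, not the authors' algorithm: the function names (`impute`, `fill_all`, `loo_accuracy`, `choose_strategies`), the greedy per-column search, the leave-one-out 1-nearest-neighbour scorer and the toy data are all hypothetical, and only mean and mode are included as candidates (MICE and KNN imputation are omitted for brevity).

```python
from statistics import mean, mode

def impute(rows, col, strategy):
    """Return a copy of rows with None in column `col` filled by mean or mode."""
    observed = [r[col] for r in rows if r[col] is not None]
    fill = mean(observed) if strategy == "mean" else mode(observed)
    return [[fill if (j == col and v is None) else v for j, v in enumerate(r)]
            for r in rows]

def fill_all(rows, plan):
    """Apply a {column: strategy} plan to every column that has missing values."""
    out = rows
    for col, strategy in plan.items():
        out = impute(out, col, strategy)
    return out

def loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    (a hypothetical stand-in for the SVM, NB, RF or DT classifiers)."""
    correct = 0
    for i, x in enumerate(rows):
        nearest = min((j for j in range(len(rows)) if j != i),
                      key=lambda j: sum((a - b) ** 2 for a, b in zip(x, rows[j])))
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

def choose_strategies(rows, labels, strategies=("mean", "mode")):
    """Greedy search (an assumption): pick, column by column, the imputer
    that maximises classifier accuracy while other columns keep their plan."""
    missing_cols = [c for c in range(len(rows[0]))
                    if any(r[c] is None for r in rows)]
    plan = {c: strategies[0] for c in missing_cols}
    for c in missing_cols:
        scores = {s: loo_accuracy(fill_all(rows, {**plan, c: s}), labels)
                  for s in strategies}
        plan[c] = max(scores, key=scores.get)
    return plan, fill_all(rows, plan)

# Toy data (hypothetical): None marks a missing value.
rows = [[1.0, 0], [1.0, 0], [None, 0], [5.0, 1], [5.1, None], [4.9, 1]]
labels = [0, 0, 0, 1, 1, 1]
plan, completed = choose_strategies(rows, labels)
print(plan)
```

In this toy example the mean of the first column falls between the two classes, so filling the gap with it pulls a class-0 record towards class 1, whereas the mode keeps it with its own class. That is the kind of difference a classifier-guided selection can detect.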

Findings

The results show that HIOC performed efficiently even with a high rate of missing values. It reduced root mean square error (RMSE) by up to 17.32% in the heart disease dataset and by up to 34.73% in the breast cancer dataset. Correct prediction of missing values improved the accuracy of the classifiers in predicting diseases: classification accuracy increased by up to 18.61% in the heart disease dataset and by up to 6.20% in the breast cancer dataset.
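
For readers unfamiliar with the metric, the sketch below shows how RMSE between true and imputed values is computed, and how a percentage reduction relative to a baseline imputer could be derived. The abstract does not state how its percentages were calculated, so the relative-reduction formula is an assumption, and the function names and numbers are hypothetical and not taken from the paper.

```python
import math

def rmse(true_values, imputed_values):
    """Root mean square error between held-out true values and imputed values."""
    squared = [(t - p) ** 2 for t, p in zip(true_values, imputed_values)]
    return math.sqrt(sum(squared) / len(squared))

def percent_reduction(baseline_rmse, new_rmse):
    """Relative RMSE reduction in percent (assumed definition)."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical values that were masked and then imputed by two methods.
truth = [1.0, 2.0, 3.0]
baseline_imputed = [1.0, 2.0, 5.0]
hybrid_imputed = [1.0, 2.0, 4.0]

baseline_error = rmse(truth, baseline_imputed)
hybrid_error = rmse(truth, hybrid_imputed)
reduction = percent_reduction(baseline_error, hybrid_error)
print(baseline_error, hybrid_error, reduction)
```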

Originality/value

The proposed HIOC is a new hybrid imputation method that can efficiently predict missing values in any medical dataset.

Citation

Rani, P., Kumar, R. and Jain, A. (2021), "HIOC: a hybrid imputation method to predict missing values in medical datasets", International Journal of Intelligent Computing and Cybernetics, Vol. 14 No. 4, pp. 598-616. https://doi.org/10.1108/IJICC-03-2021-0042

Publisher

Emerald Publishing Limited

Copyright © 2021, Emerald Publishing Limited
