
Development of a classifier with analysis of feature selection methods for COVID-19 diagnosis

Hetal Chauhan (Department of Engineering and Technology(CS&E), Ganpat University, Mehsana, India)
Kirit Modi (S. K. Patel University, Visnagar, India)
Saurabh Shrivastava (Marwadi University, Rajkot, India)

World Journal of Engineering

ISSN: 1708-5284

Article publication date: 28 May 2021

Issue publication date: 22 February 2022




The COVID-19 pandemic continues to grow and has affected lifestyles and economies worldwide. In the absence of a specific treatment, the only way to control the pandemic is to stop its spread, so early identification of infected persons is urgently needed. Diagnostic methods applied in hospitals are time-consuming, which delays the identification of positive patients. This study aims to develop a machine learning-based diagnosis model that can predict positive cases and support decision-making.


In this research, the authors developed a diagnosis model, based on an artificial neural network, to check coronavirus positivity. The model was trained with clinically assessed symptoms, patient-reported symptoms, other medical history and exposure data of the person. The authors explored filter-based feature selection methods such as Chi2, ANOVA F-score and Mutual Information to improve the performance of the classification model. The metrics used to evaluate the model are accuracy, precision, sensitivity and F1-score.
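As an illustration of filter-based feature ranking, the sketch below computes a one-way ANOVA F-score for each feature against the class label and sorts the features by that score. It is a minimal sketch, not the authors' implementation: the function names, the feature names and the toy values are hypothetical, and the study's actual preprocessing and data are not shown in this abstract.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F-statistic of one feature across the class labels."""
    # Group the feature values by class label.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n, k = len(values), len(groups)
    grand_mean = mean(values)

    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(columns, labels):
    """Return feature names sorted from highest to lowest F-score."""
    scores = {name: anova_f_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 1 = positive, 0 = negative (not the study's data).
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "cough_severity": [3, 2, 3, 0, 1, 0],
    "fatigue": [1, 0, 1, 0, 1, 0],
    "age": [40, 35, 52, 41, 36, 50],
}

ranking = rank_features(columns, labels)
print(ranking)
```

A classifier can then be trained on the top-ranked features only. In practice a library routine such as scikit-learn's `f_classif` computes this statistic for all features at once.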


The highest classification performance was obtained with the model trained on features ranked according to the ANOVA FS method. The highest scores for accuracy, sensitivity, precision and F1-score of predictions are 0.93, 0.99, 0.94 and 0.93, respectively. The study reveals that the most relevant predictors for COVID-19 diagnosis are shortness of breath (sob) severity, cough severity, sob presence, cough presence, fatigue and number of days since symptom onset.
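For reference, the four reported metrics can all be derived from the counts of true and false positives and negatives. The following is a minimal sketch with hypothetical labels and predictions; it does not reproduce the study's data or scores, and the function name is an assumption made for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity (recall) and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical predictions for eight patients (not the study's data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity matters most in a screening setting, because a false negative means an infected person is not isolated.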


No treatment for COVID-19 is available to date. The best way to control the pandemic is to isolate positive persons, so it is essential to identify them at an early stage. The RT-PCR test used to check COVID-19 positivity is time-consuming, expensive and laborious. Current diagnosis methods used in hospitals demand more medical resources as coronavirus cases increase, which leads to a shortage of resources. The developed model offers a cheaper and faster solution, reduces the immediate need for medical resources and helps in decision-making.



The authors would like to thank Carbon Health for providing an open-source COVID-19 data repository. The authors would also like to thank Dr Vijay Ukani and Dr Parth Patel for their inputs and guidance.

Conflict of Interest: The authors declare no conflict of interest.


Chauhan, H., Modi, K. and Shrivastava, S. (2022), "Development of a classifier with analysis of feature selection methods for COVID-19 diagnosis", World Journal of Engineering, Vol. 19 No. 1, pp. 49-57.



Emerald Publishing Limited

Copyright © 2021, Emerald Publishing Limited
