Constructive Effect of Ranking Optimal Features Using Random Forest, Support Vector Machine and Naïve Bayes for Breast Cancer Diagnosis

Big Data Analytics and Intelligence: A Perspective for Health Care

ISBN: 978-1-83909-100-1, eISBN: 978-1-83909-099-8

Publication date: 30 September 2020

Abstract

Breast cancer (BC) is one of the leading cancers worldwide. It is a malignant tumor, and the risk extends to middle-aged women as well. Identifying BC at an early stage, however, can save many women’s lives. Drawing on advances in technology, this research uses machine learning (ML): the Random Forest algorithm ranks the features, and the supervised classifiers Support Vector Machine (SVM) and Naïve Bayes (NB) are used to select the best optimized features and to predict BC. Prediction accuracy is estimated on the Wisconsin Breast Cancer dataset from the University of California, Irvine (UCI) ML repository. All operations were performed with Anaconda, an open source distribution of Python. The proposed work improved the accuracy of both the NB and SVM classifiers. The performance of the proposed model is evaluated using classification accuracy, confusion matrix, mean, standard deviation, variance, and root mean-squared error.
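The abstract does not show the chapter's code, so the following is only a minimal sketch of the kind of workflow it describes: rank features with a Random Forest, then train SVM and NB on the top-k ranked features and compare accuracy. It assumes scikit-learn and its bundled copy of the Wisconsin Diagnostic Breast Cancer data (`load_breast_cancer`). The 70/30 split, the linear kernel, the feature scaling, the number of trees, the random seed, and the helper names `rank_features` and `best_top_k` are all illustrative choices, not details taken from the chapter.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def rank_features(X_train, y_train, seed=0):
    """Return feature indices ordered from most to least important (hypothetical helper)."""
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)
    return np.argsort(forest.feature_importances_)[::-1]


def best_top_k(model, ranking, X_train, X_test, y_train, y_test):
    """Fit `model` on the top-k ranked features for every k.

    Returns (best_k, best_accuracy, accuracy_with_all_features).
    """
    scores = {}
    for k in range(1, len(ranking) + 1):
        cols = ranking[:k]
        model.fit(X_train[:, cols], y_train)
        scores[k] = model.score(X_test[:, cols], y_test)
    best_k = max(scores, key=scores.get)
    return best_k, scores[best_k], scores[len(ranking)]


# Assumption: scikit-learn's bundled dataset stands in for the UCI download.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

ranking = rank_features(X_train, y_train)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "NB": GaussianNB(),
}
results = {}
for name, model in models.items():
    results[name] = best_top_k(model, ranking, X_train, X_test, y_train, y_test)
    best_k, best_acc, full_acc = results[name]
    print(f"{name}: all features {full_acc:.4f}, best k={best_k} -> {best_acc:.4f}")
```

Note that this sketch chooses k on the same held-out split it reports, which tends to give an optimistic accuracy. A separate validation split or cross-validation would be a more careful way to pick k. The exact numbers depend on the seed and settings and are not expected to reproduce the chapter's figures.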

The experimental results show that a 70-30 data split gives the best accuracy. With SVM, feature optimization selects the 12 best features and yields 97.66% accuracy, an improvement of 1.17% after feature reduction. With NB, it selects the 17 best features and yields 96.49% accuracy, also an improvement of 1.17% after feature reduction.
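As a rough sanity check on these figures, the short calculation below converts the reported percentages into counts of correctly classified test samples. It assumes the 569-sample Wisconsin Diagnostic Breast Cancer dataset and a test set of 30% rounded up, neither of which the abstract states explicitly. It also reads the 1.17% improvement as percentage points. The function names are hypothetical.

```python
import math

N_SAMPLES = 569        # assumption: size of the Wisconsin Diagnostic dataset
TEST_FRACTION = 0.30   # the 30% side of the 70-30 split

# Assumption: the test-set size is rounded up to a whole sample.
n_test = math.ceil(N_SAMPLES * TEST_FRACTION)


def correct_count(accuracy_percent, n):
    """Whole number of correct predictions closest to a reported percentage."""
    return round(accuracy_percent / 100 * n)


def as_percent(correct, n):
    """Accuracy as a percentage rounded to two decimals."""
    return round(100 * correct / n, 2)


svm_correct = correct_count(97.66, n_test)
nb_correct = correct_count(96.49, n_test)
gain_samples = correct_count(1.17, n_test)

print(n_test, svm_correct, nb_correct, gain_samples)
print(as_percent(svm_correct, n_test), as_percent(nb_correct, n_test),
      as_percent(gain_samples, n_test))
```

Under these assumptions the test set has 171 samples, the reported accuracies correspond to 167 and 165 correct predictions, and a gain of 1.17 percentage points is two additional correctly classified samples.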

The study shows that the proposed model works very effectively compared with existing models with respect to accuracy measures.

Citation

Deepa, B.G. and Senthil, S. (2020), "Constructive Effect of Ranking Optimal Features Using Random Forest, Support Vector Machine and Naïve Bayes for Breast Cancer Diagnosis", Tanwar, P., Jain, V., Liu, C.-M. and Goyal, V. (Ed.) Big Data Analytics and Intelligence: A Perspective for Health Care, Emerald Publishing Limited, Leeds, pp. 189-202. https://doi.org/10.1108/978-1-83909-099-820201014

Publisher: Emerald Publishing Limited

Copyright © 2020 Emerald Publishing Limited