A new correlation-based approach for ensemble selection in random forests
International Journal of Intelligent Computing and Cybernetics
ISSN: 1756-378X
Article publication date: 23 March 2021
Issue publication date: 23 April 2021
Abstract
Purpose
Ensemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of hypotheses and that, in most cases, these include redundant classifiers. Several state-of-the-art works attempt to reduce the set of hypotheses without affecting performance.
Design/methodology/approach
In this work, the authors propose a pruning method that takes into consideration both the correlation between each classifier and the classes and the correlation between each classifier and the rest of the set. The authors used the random forest algorithm as the tree-based ensemble classifier, and the pruning was performed with a technique inspired by the CFS (correlation feature selection) algorithm.
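The abstract does not give the selection criterion itself, so the following is only a minimal sketch of how a CFS-style pruning of a forest could look, not the authors' implementation. It assumes the standard CFS merit, k * r_cf / sqrt(k + k * (k - 1) * r_ff), applied to trees instead of features, with Pearson correlation on binary validation predictions and a greedy forward search. The function names (`pearson`, `merit`, `select_trees`) and the toy predictions are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def merit(subset, preds, labels):
    """CFS-style merit of a subset of trees (assumed criterion, see lead-in).

    r_cf: mean |correlation| between each tree's predictions and the labels.
    r_ff: mean |correlation| between the predictions of each pair of trees.
    """
    k = len(subset)
    r_cf = sum(abs(pearson(preds[i], labels)) for i in subset) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = sum(abs(pearson(preds[i], preds[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)

def select_trees(preds, labels):
    """Greedy forward selection: add the tree that most raises the merit,
    and stop as soon as no remaining tree improves it."""
    selected, best = [], 0.0
    remaining = list(range(len(preds)))
    while remaining:
        scored = [(merit(selected + [i], preds, labels), i) for i in remaining]
        score, candidate = max(scored, key=lambda t: t[0])
        if score <= best:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = score
    return selected, best

# Toy validation set: true labels and the predictions of four hypothetical trees.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
preds = [
    [0, 0, 0, 0, 1, 1, 1, 0],  # tree 0: one error
    [0, 0, 0, 0, 1, 1, 1, 0],  # tree 1: exact duplicate of tree 0 (redundant)
    [1, 0, 0, 0, 1, 1, 1, 1],  # tree 2: one error, on a different sample
    [0, 1, 0, 1, 0, 1, 0, 1],  # tree 3: uncorrelated with the labels
]
selected, best = select_trees(preds, labels)
print(selected)  # [0, 2]
```

In this toy run the duplicate tree and the uninformative tree are both left out: the duplicate adds no merit because it is fully correlated with an already selected tree, and the noisy tree lowers the average correlation with the classes.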
Findings
The proposed method, CES (Correlation-based Ensemble Selection), was evaluated on ten datasets from the UCI machine learning repository, and its performance was compared with that of six ensemble pruning techniques. The results showed that the proposed pruning method selects a small ensemble in less time while improving classification rates compared to the state-of-the-art methods.
Originality/value
CES is a new ordering-based method that uses the CFS algorithm. CES selects, in a short time, a small sub-ensemble that outperforms both the whole forest and the other state-of-the-art techniques used in this study.
Keywords
Acknowledgements
The authors would like to thank the Directorate-General of Scientific Research and Technological Development (Direction Générale de la Recherche Scientifique et du Développement Technologique, DGRSDT, URL: www.dgrsdt.dz, Algeria) for the financial assistance towards this research.
Citation
El Habib Daho, M., Settouti, N., Bechar, M.E.A., Boublenza, A. and Chikh, M.A. (2021), "A new correlation-based approach for ensemble selection in random forests", International Journal of Intelligent Computing and Cybernetics, Vol. 14 No. 2, pp. 251-268. https://doi.org/10.1108/IJICC-10-2020-0147
Publisher
Emerald Publishing Limited
Copyright © 2021, Emerald Publishing Limited