Search results

1 – 10 of over 1000
Article
Publication date: 22 November 2010

Yun‐Sheng Chung, D. Frank Hsu, Chun‐Yi Liu and Chun‐Yi Tang

Abstract

Purpose

Multiple classifier systems (MCS) have been used widely in computing, communications, and informatics. Combining multiple classifiers has been shown to outperform a single classifier system. It has been demonstrated that improvement in ensemble performance depends on either the diversity among or the performance of individual systems. A variety of diversity measures and ensemble methods have been proposed and studied. However, it remains a challenging problem to estimate the ensemble performance in terms of the performance of and the diversity among individual systems. The purpose of this paper is to study the general problem of estimating ensemble performance for various combination methods using the concept of a performance distribution pattern (PDP).

Design/methodology/approach

In particular, the paper establishes upper and lower bounds for majority voting ensemble performance with the disagreement diversity measure Dis, for weighted majority voting performance in terms of weighted average performance and weighted disagreement diversity, and for plurality voting ensemble performance with the entropy diversity measure D.
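
As a concrete illustration of the quantities involved, the following minimal Python sketch (not taken from the paper) computes the pairwise disagreement diversity Dis in its usual oracle-output form and the accuracy of a plain majority vote; the binary 0/1 encoding and the tie-breaking rule are assumptions made for the example.

    import numpy as np
    from itertools import combinations

    def disagreement_diversity(correct):
        """Average pairwise disagreement Dis over an ensemble.
        correct: (n_classifiers, n_samples) 0/1 array, 1 where a
        classifier is right on a sample (oracle-output form)."""
        pairs = combinations(range(correct.shape[0]), 2)
        return float(np.mean([np.mean(correct[i] != correct[j]) for i, j in pairs]))

    def majority_vote_accuracy(pred, y_true):
        # strict majority over binary 0/1 label predictions; ties fall to class 0
        votes = (pred.sum(axis=0) * 2 > pred.shape[0]).astype(int)
        return float(np.mean(votes == y_true))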

Findings

Bounds for these three cases are shown to be tight using the PDP for the input set.

Originality/value

As a consequence of the authors' previous results on diversity equivalence, the results on majority voting ensemble performance can be extended to several other diversity measures. Moreover, the paper shows, in the case of majority voting, that when the average performance P of the individual systems is large enough, the ensemble performance Pm resulting from a maximum (information-theoretic) entropy PDP is an increasing function of the disagreement diversity Dis. Eight experiments using data sets from various application domains are conducted to demonstrate the complexity, richness, and diverseness of the problem of estimating ensemble performance.

Details

International Journal of Pervasive Computing and Communications, vol. 6 no. 4
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 28 May 2021

Zhibin Xiong and Jun Huang

Abstract

Purpose

Ensemble models that combine multiple base classifiers have been widely used to improve prediction performance in credit risk evaluation. However, an arbitrary selection of base classifiers is problematic. The purpose of this paper is to develop a framework for selecting base classifiers to improve the overall classification performance of an ensemble model.

Design/methodology/approach

In this study, selecting base classifiers is treated as a feature selection problem, where the output of a base classifier can be considered a feature. The proposed method, correlation-based classifier selection using the maximum information coefficient (MIC-CCS), selects the features (classifiers) using nonlinear optimization programming, which seeks to optimize the relationship between the accuracy and diversity of the base classifiers based on MIC.
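
As an illustration of the selection idea (accuracy traded against redundancy between classifier outputs), the Python sketch below performs a greedy forward selection; it uses scikit-learn's normalized mutual information as a stand-in for MIC, and the weighting parameter alpha and the greedy strategy are assumptions rather than the paper's nonlinear program.

    import numpy as np
    from sklearn.metrics import accuracy_score, normalized_mutual_info_score

    def select_base_classifiers(preds, y_val, k, alpha=0.5):
        """Greedy selection of base classifiers from validation-set outputs.
        preds: dict mapping classifier name -> predicted labels on the validation set.
        Rewards accuracy, penalizes redundancy with already selected members."""
        selected, candidates = [], list(preds)
        while candidates and len(selected) < k:
            def score(name):
                acc = accuracy_score(y_val, preds[name])
                if not selected:
                    return acc
                redundancy = np.mean([normalized_mutual_info_score(preds[name], preds[s])
                                      for s in selected])
                return alpha * acc - (1 - alpha) * redundancy
            best = max(candidates, key=score)
            selected.append(best)
            candidates.remove(best)
        return selected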

Findings

The empirical results show that ensemble models perform better than stand-alone ones, whereas the ensemble model based on MIC-CCS outperforms the ensemble models with unselected base classifiers and other ensemble models based on traditional forward and backward selection methods. Additionally, the classification performance of the ensemble model in which correlation is measured with MIC is better than that measured with the Pearson correlation coefficient.

Research limitations/implications

The study provides an alternative solution for effectively selecting base classifiers that are significantly different, so that they provide complementary information; because the selected classifiers also have good predictive capabilities, the classification performance of the ensemble model is improved.

Originality/value

This paper introduces MIC into the correlation-based selection process to better capture nonlinear and nonfunctional relationships in a complex credit data structure, and constructs a novel nonlinear programming model for base classifier selection that has not been used in other studies.

Article
Publication date: 5 December 2017

Rabeb Faleh, Sami Gomri, Mehdi Othman, Khalifa Aguir and Abdennaceur Kachouri

Abstract

Purpose

In this paper, a novel hybrid approach is proposed to solve the problem of gas cross-selectivity in an electronic nose (E-nose) by combining support vector machine (SVM) and k-nearest neighbors (KNN) classifiers.

Design/methodology/approach

First, an E-nose system with three WO3 sensors was used for data acquisition to detect three gases, namely, ozone, ethanol and acetone. Then, two transient parameters, the derivative and the integral, were extracted from each gas response. Next, principal component analysis (PCA) was applied to extract the most relevant sensor data and reduce dimensionality. The new coordinates calculated by PCA were used as inputs for classification by the SVM method. Finally, classification by the KNN method was carried out using only the support vectors (SVs), not all the data.
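
A minimal scikit-learn sketch of this three-stage pipeline (PCA, then SVM, then KNN restricted to the support vectors) is given below; the RBF kernel, the number of principal components and the number of neighbours are assumptions made for illustration.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.svm import SVC
    from sklearn.neighbors import KNeighborsClassifier

    def pca_svm_knn(X_train, y_train, X_test, n_components=2, k=3):
        # X_*: transient features (derivative, integral) per sensor; y: gas labels
        y_train = np.asarray(y_train)
        pca = PCA(n_components=n_components)
        Z_train = pca.fit_transform(X_train)
        Z_test = pca.transform(X_test)

        svm = SVC(kernel="rbf").fit(Z_train, y_train)
        sv_idx = svm.support_                      # indices of the support vectors

        knn = KNeighborsClassifier(n_neighbors=k)
        knn.fit(Z_train[sv_idx], y_train[sv_idx])  # KNN sees only the SVs
        return knn.predict(Z_test)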

Findings

This work shows that the proposed fusion method achieved the highest classification rate (100 per cent) compared to the individual classifiers KNN, SVM-linear, SVM-RBF and SVM-polynomial, which achieved classification rates of 89, 75.2, 80 and 79.9 per cent, respectively.

Originality/value

The authors propose a fusion classifier approach to improve the classification rate. In this method, the extracted features are projected into the PCA subspace to reduce dimensionality. The obtained principal components are then fed to the SVM classifier, and the resulting SVs are used in the KNN method.

Details

Sensor Review, vol. 38 no. 1
Type: Research Article
ISSN: 0260-2288

Article
Publication date: 28 October 2014

Kyle Dillon Feuz and Diane J. Cook

Abstract

Purpose

The purpose of this paper is to study heterogeneous transfer learning for activity recognition using heuristic search techniques. Many pervasive computing applications require information about the activities currently being performed, but activity recognition algorithms typically require substantial amounts of labeled training data for each setting. One solution to this problem is to leverage transfer learning techniques to reuse available labeled data in new situations.

Design/methodology/approach

This paper introduces three novel heterogeneous transfer learning techniques that reverse the typical transfer model by mapping the target feature space to the source feature space, and applies them to activity recognition in a smart apartment. The techniques are evaluated on data from 18 different smart apartments located in an assisted-care facility, and the results are compared against several baselines.
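
The paper's specific mapping heuristics are not reproduced here, but the following Python sketch conveys the general idea of re-expressing target data in the source feature space by matching each target feature to its most similar source feature; the similarity profile (per-feature summary statistics) is an assumption chosen only for illustration.

    import numpy as np
    from scipy.spatial.distance import cdist

    def map_target_to_source(X_source, X_target):
        """Map each target feature onto the most similar source feature."""
        def profile(X):
            # simple per-feature signature; the paper searches mappings heuristically
            return np.stack([X.mean(0), X.std(0), X.min(0), X.max(0)], axis=1)

        dist = cdist(profile(X_target), profile(X_source))
        mapping = dist.argmin(axis=1)              # best source feature per target feature

        X_mapped = np.zeros((X_target.shape[0], X_source.shape[1]))
        for t_feat, s_feat in enumerate(mapping):  # later assignments overwrite earlier ones
            X_mapped[:, s_feat] = X_target[:, t_feat]
        return X_mapped                            # now usable with a source-trained model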

Findings

The three transfer learning techniques are all able to outperform the baseline comparisons in several situations. Furthermore, the techniques are successfully used in an ensemble approach to achieve even higher levels of accuracy.

Originality/value

The techniques in this paper represent a considerable step forward in heterogeneous transfer learning by removing the need to rely on instance-instance or feature-feature co-occurrence data.

Details

International Journal of Pervasive Computing and Communications, vol. 10 no. 4
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 31 July 2019

Zhe Zhang and Yue Dai

Abstract

Purpose

For classification problems in customer relationship management (CRM), the purpose of this paper is to propose a method that combines multiple decision trees based on a genetic algorithm while retaining interpretability of the classification results.

Design/methodology/approach

In the proposed method, multiple decision trees are combined in parallel. Subsequently, a genetic algorithm is used to optimize the weight matrix in the combination algorithm.
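
A compact Python sketch of this kind of weighted combination, with a small hand-rolled genetic algorithm evolving the combination weights over the trees' probability outputs, is shown below; the population size, operators and the use of a weight vector rather than a full weight matrix are simplifying assumptions.

    import numpy as np

    def ga_optimize_weights(proba_list, y_true, pop_size=30, gens=50, mut=0.1, seed=0):
        """Evolve weights for combining decision-tree probability outputs.
        proba_list: list of (n_samples, n_classes) arrays, one per tree."""
        rng = np.random.default_rng(seed)
        stacked = np.stack(proba_list)                      # (n_trees, n_samples, n_classes)
        n_trees = len(proba_list)

        def fitness(w):
            combined = np.tensordot(w, stacked, axes=1)     # weighted sum over trees
            return np.mean(combined.argmax(axis=1) == y_true)

        pop = rng.random((pop_size, n_trees))
        for _ in range(gens):
            fit = np.array([fitness(w) for w in pop])
            parents = pop[np.argsort(fit)[-pop_size // 2:]]  # keep the fitter half
            children = []
            while len(parents) + len(children) < pop_size:
                a, b = parents[rng.integers(len(parents), size=2)]
                child = np.where(rng.random(n_trees) < 0.5, a, b)   # uniform crossover
                child = child + mut * rng.normal(size=n_trees)      # Gaussian mutation
                children.append(np.clip(child, 0, None))
            pop = np.vstack([parents, np.array(children)])
        return max(pop, key=fitness)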

Findings

The method is applied to customer credit rating assessment and customer response behavior pattern recognition. The results demonstrate that compared to a single decision tree, the proposed combination method improves the predictive accuracy and optimizes the classification rules, while maintaining interpretability of the classification results.

Originality/value

This study contributes to research methodologies in CRM, focusing specifically on a new interpretable method that combines multiple decision trees based on genetic algorithms for customer classification.

Details

Asia Pacific Journal of Marketing and Logistics, vol. 32 no. 5
Type: Research Article
ISSN: 1355-5855

Article
Publication date: 23 March 2021

Mostafa El Habib Daho, Nesma Settouti, Mohammed El Amine Bechar, Amina Boublenza and Mohammed Amine Chikh

Abstract

Purpose

Ensemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of hypotheses that, in most cases, contain redundant classifiers. Several works in the state of the art attempt to prune these hypotheses without affecting performance.

Design/methodology/approach

In this work, the authors propose a pruning method that takes into consideration the correlation between classifiers and classes, and between each classifier and the rest of the set. The random forest algorithm is used as the tree-based ensemble classifier, and the pruning is performed by a technique inspired by the CFS (correlation feature selection) algorithm.
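
The following Python sketch illustrates a CFS-style forward selection over the trees of a scikit-learn random forest; the use of simple agreement rates as the "correlations" and the validation-set protocol are assumptions made to keep the example short, not the paper's exact formulation.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def cfs_merit(sel, tree_class, tree_tree):
        # CFS-style merit: favour trees correlated with the class, not with each other
        k = len(sel)
        rcf = np.mean([tree_class[i] for i in sel])
        rff = 1.0 if k == 1 else np.mean([tree_tree[i][j] for i in sel for j in sel if i != j])
        return k * rcf / np.sqrt(k + k * (k - 1) * rff)

    def prune_forest(forest, X_val, y_val, max_size=10):
        # per-tree predictions mapped back to the original class labels
        preds = np.array([forest.classes_[t.predict(X_val).astype(int)]
                          for t in forest.estimators_])
        tree_class = [np.mean(p == y_val) for p in preds]          # agreement with the labels
        tree_tree = [[np.mean(p == q) for q in preds] for p in preds]

        selected = [int(np.argmax(tree_class))]
        remaining = set(range(len(preds))) - set(selected)
        while remaining and len(selected) < max_size:
            best = max(remaining, key=lambda i: cfs_merit(selected + [i], tree_class, tree_tree))
            if cfs_merit(selected + [best], tree_class, tree_tree) <= cfs_merit(selected, tree_class, tree_tree):
                break
            selected.append(best)
            remaining.remove(best)
        return [forest.estimators_[i] for i in selected]           # the pruned sub-ensemble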

Findings

The proposed method, CES (correlation-based ensemble selection), was evaluated on ten datasets from the UCI machine learning repository, and its performance was compared to six ensemble pruning techniques. The results show that the proposed pruning method selects a small ensemble in less time while improving classification rates compared to the state-of-the-art methods.

Originality/value

CES is a new ordering-based method that uses the CFS algorithm. CES selects, in a short time, a small sub-ensemble that outperforms results obtained from the whole forest and the other state-of-the-art techniques used in this study.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 14 no. 2
Type: Research Article
ISSN: 1756-378X

Book part
Publication date: 30 September 2020

Hera Khan, Ayush Srivastav and Amit Kumar Mishra

Abstract

A detailed description will be provided of the classification algorithms that have been widely used in the domain of medical science. The foundation will be laid with a comprehensive overview of the background and history of classification algorithms, followed by an extensive discussion of the various classification techniques in machine learning (ML) and their applications to data analysis in medical science and health care. The chapter will begin with the fundamentals required for a sound understanding of classification in ML, covering the differences between unsupervised and supervised learning, the basic terminology of classification and its history. It will then cover the main types of classification algorithms, ranging from linear classifiers such as logistic regression and Naïve Bayes to nearest neighbour, support vector machine, tree-based classifiers and neural networks, together with their respective mathematics. Ensemble algorithms such as majority voting, boosting, bagging and stacking will also be discussed at length along with their relevant applications. Furthermore, the chapter will elucidate the areas of application of these classification algorithms in the field of biomedicine and health care and their contribution to decision-making systems and predictive analysis. To conclude, it will offer a research and development perspective, providing thorough insight into the classification algorithms and their applications in the healthcare sector.
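
As a self-contained taste of the algorithms surveyed in the chapter, the short scikit-learn script below fits several of the named classifiers and a majority-voting ensemble on a public breast-cancer dataset; the dataset, hyperparameters and train/test split are illustrative choices, not the chapter's.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import VotingClassifier, BaggingClassifier, AdaBoostClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    models = {
        "logistic regression": LogisticRegression(max_iter=5000),
        "naive Bayes": GaussianNB(),
        "nearest neighbour": KNeighborsClassifier(),
        "support vector machine": SVC(),
        "decision tree": DecisionTreeClassifier(random_state=0),
        "bagging": BaggingClassifier(random_state=0),
        "boosting": AdaBoostClassifier(random_state=0),
        "majority voting": VotingClassifier([("lr", LogisticRegression(max_iter=5000)),
                                             ("nb", GaussianNB()),
                                             ("dt", DecisionTreeClassifier(random_state=0))]),
    }
    for name, model in models.items():
        print(name, round(model.fit(X_tr, y_tr).score(X_te, y_te), 3))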

Details

Big Data Analytics and Intelligence: A Perspective for Health Care
Type: Book
ISBN: 978-1-83909-099-8

Article
Publication date: 6 May 2020

Rajeshwari S. Patil and Nagashettappa Biradar

Abstract

Purpose

Breast cancer is one of the most common malignant tumors in women; it badly affects women's physical and psychological health and can even endanger life. Nowadays, mammography is considered a fundamental criterion for medical practitioners to recognize breast cancer. However, due to the intricate structure of mammogram images, it is reasonably hard for practitioners to spot breast cancer features.

Findings

The performance analysis was done for both segmentation and classification. From the analysis, the accuracy of the proposed IAP-CSA-based fuzzy classifier was 41.9% better than the fuzzy classifier, 2.80% better than the PSO-, WOA- and CSA-based fuzzy classifiers, and 2.32% better than the GWO-based fuzzy classifier. Additionally, the accuracy of the developed IAP-CSA-fuzzy model was 9.54% better than NN, 35.8% better than SVM, and 41.9% better than the existing fuzzy classifier. Hence, it is concluded that the implemented breast cancer detection model was efficient in classifying normal, benign and malignant images.

Originality/value

This paper adopts the recent Improved Awareness Probability-based Crow Search Algorithm (IAP-CSA) for region growing and fuzzy classification to enhance breast cancer detection in mammogram images; this is the first work to utilize this method.
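
The IAP-CSA optimization itself is not reproduced here, but the region-growing step it drives is standard; a minimal Python sketch of 4-connected region growing from a seed pixel follows, with the intensity tolerance as the parameter such an optimizer would tune.

    import numpy as np
    from collections import deque

    def region_grow(image, seed, tol=10):
        """Grow a region from a seed pixel, adding 4-connected neighbours whose
        intensity lies within `tol` of the running region mean."""
        h, w = image.shape
        mask = np.zeros((h, w), dtype=bool)
        mask[seed] = True
        queue = deque([seed])
        total, count = float(image[seed]), 1
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                    if abs(float(image[nr, nc]) - total / count) <= tol:
                        mask[nr, nc] = True
                        total += float(image[nr, nc])
                        count += 1
                        queue.append((nr, nc))
        return mask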

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 2
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 7 November 2016

Mohammadali Abedini, Farzaneh Ahmadzadeh and Rassoul Noorossana

Abstract

Purpose

A crucial decision in financial services is how to classify credit or loan applicants into good and bad applicants. The purpose of this paper is to propose a four-stage hybrid data mining approach to support the decision-making process.

Design/methodology/approach

The approach is inspired by the bagging ensemble learning method and proposes a new voting method, namely two-level majority voting, in the last stage. First, several training subsets are generated. Then, different base classifiers are tuned, and ensemble methods are applied to strengthen the tuned classifiers. Finally, two-level majority voting schemes help the approach achieve higher accuracy.
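
A small Python sketch of the two-level voting idea is given below: classifiers are grouped, each group casts a level-one majority verdict, and the verdicts are combined by a second majority vote; the grouping itself and the plain-majority rule are assumptions for illustration.

    import numpy as np

    def _majority(column):
        values, counts = np.unique(column, return_counts=True)
        return values[counts.argmax()]

    def majority_vote(pred_matrix):
        # column-wise majority label over an (n_classifiers, n_samples) array
        return np.apply_along_axis(_majority, 0, pred_matrix)

    def two_level_majority_vote(pred_matrix, groups):
        """Level one: each group of classifiers votes; level two: the group verdicts vote."""
        level_one = np.vstack([majority_vote(pred_matrix[g]) for g in groups])
        return majority_vote(level_one)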

Findings

A comparison of results shows that the proposed model outperforms powerful single classifiers such as the multilayer perceptron (MLP), support vector machine and logistic regression (LR), and is more accurate than ensemble learning methods such as bagging-LR and rotation forest (RF)-MLP. In terms of type I and II errors, the model outperforms the single classifiers and is close to ensemble approaches such as bagging-LR and RF-MLP, but does not outperform them. Moreover, majority voting in the final stage provides more reliable results.

Practical implications

The study concludes the approach would be beneficial for banks, credit card companies and other credit provider organisations.

Originality/value

A novel four-stage hybrid approach inspired by the bagging ensemble method is proposed. Moreover, the two-level majority voting, applied in two different schemes in the last stage, provides higher accuracy. An integrated evaluation criterion for classification errors provides enhanced insight for error comparisons.

Details

Kybernetes, vol. 45 no. 10
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 6 February 2017

Aytug Onan

Abstract

Purpose

The immense quantity of available unstructured text documents serves as one of the largest sources of information. Text classification can be an essential task for many purposes in information retrieval, such as document organization, text filtering and sentiment analysis. Ensemble learning has been extensively studied to construct efficient text classification schemes with higher predictive performance and generalization ability. The purpose of this paper is to provide diversity among the classification algorithms of an ensemble, which is a key issue in ensemble design.

Design/methodology/approach

An ensemble scheme based on hybrid supervised clustering is presented for text classification. In the presented scheme, supervised hybrid clustering, based on the cuckoo search algorithm and k-means, is introduced to partition the data samples of each class into clusters so that training subsets with higher diversity can be provided. Each classifier is trained on the diversified training subsets, and the predictions of individual classifiers are combined by the majority voting rule. The predictive performance of the proposed classifier ensemble is compared to conventional classification algorithms (such as Naïve Bayes, logistic regression, support vector machines and the C4.5 algorithm) and ensemble learning methods (such as AdaBoost, bagging and random subspace) using 11 text benchmarks.
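
To make the subset-construction step concrete, the Python sketch below partitions each class with plain k-means (standing in for the cuckoo-search/k-means hybrid), builds one diversified training subset per cluster index, trains a base classifier on each, and combines predictions by a plurality vote; integer class labels and the Naïve Bayes base learner are assumptions.

    import numpy as np
    from sklearn.base import clone
    from sklearn.cluster import KMeans
    from sklearn.naive_bayes import MultinomialNB

    def clustered_subsets(X, y, n_clusters=3):
        """Partition each class into clusters and build one subset per cluster index."""
        per_class = {}
        for c in np.unique(y):
            idx = np.where(y == c)[0]
            labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X[idx])
            per_class[c] = [idx[labels == k] for k in range(n_clusters)]
        return [np.concatenate([per_class[c][k] for c in per_class]) for k in range(n_clusters)]

    def train_ensemble(X, y, base=MultinomialNB()):
        return [clone(base).fit(X[s], y[s]) for s in clustered_subsets(X, y)]

    def majority_predict(models, X):
        preds = np.vstack([m.predict(X) for m in models])           # assumes integer labels
        return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)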

Findings

The experimental results indicate that the presented classifier ensemble outperforms the conventional classification algorithms and ensemble learning methods for text classification.

Originality/value

The presented ensemble scheme is the first to use supervised clustering to obtain a diverse ensemble for text classification.

Details

Kybernetes, vol. 46 no. 2
Type: Research Article
ISSN: 0368-492X
