Search results

11 – 20 of over 4000
Article
Publication date: 23 March 2021

Mostafa El Habib Daho, Nesma Settouti, Mohammed El Amine Bechar, Amina Boublenza and Mohammed Amine Chikh

Abstract

Purpose

Ensemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of hypotheses and that these contain redundant classifiers in most cases. Several works proposed in the state of the art attempt to reduce the set of hypotheses without affecting performance.

Design/methodology/approach

In this work, the authors propose a pruning method that takes into consideration the correlation between classifiers and classes, and between each classifier and the rest of the set. The authors used the random forest algorithm as the tree-based ensemble classifier, and the pruning was performed by a technique inspired by the CFS (correlation-based feature selection) algorithm.
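
The abstract does not give the exact CES procedure, so the following is only a minimal sketch of the general idea: each classifier's prediction vector is treated as a "feature" and scored with the standard CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), inside a greedy forward search. The use of Pearson correlation, the greedy stopping rule and all function names (`pearson`, `cfs_merit`, `greedy_select`) are assumptions made for illustration, not the authors' implementation.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences (0.0 if constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(subset, preds, labels):
    """CFS merit of a subset of classifiers (given as indices into preds)."""
    k = len(subset)
    # Mean classifier-class correlation.
    r_cf = mean(abs(pearson(preds[i], labels)) for i in subset)
    if k == 1:
        return r_cf
    # Mean classifier-classifier correlation over all pairs in the subset.
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = mean(abs(pearson(preds[i], preds[j])) for i, j in pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_select(preds, labels):
    """Add the classifier that most improves the merit; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(preds)))
    while remaining:
        scored = [(cfs_merit(selected + [i], preds, labels), i) for i in remaining]
        merit, choice = max(scored, key=lambda s: s[0])
        if merit <= best:
            break
        selected.append(choice)
        remaining.remove(choice)
        best = merit
    return selected, best


# Toy data (hypothetical): trees 0 and 1 make identical predictions,
# tree 2 is equally accurate but errs on a different sample.
labels = [0, 1, 0, 1, 1, 0, 1, 0]
preds = [
    [0, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 0, 1, 0],
]
selected, merit = greedy_select(preds, labels)
print(selected, round(merit, 3))
```

In this toy case the search keeps two trees, because adding the redundant copy raises the inter-classifier correlation and lowers the merit.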

Findings

The proposed method, CES (correlation-based ensemble selection), was evaluated on ten datasets from the UCI machine learning repository, and its performance was compared with six ensemble pruning techniques. The results showed that the proposed pruning method selects a small ensemble in less time while improving classification rates compared with the state-of-the-art methods.

Originality/value

CES is a new ordering-based method that uses the CFS algorithm. CES selects, in a short time, a small sub-ensemble that outperforms results obtained from the whole forest and the other state-of-the-art techniques used in this study.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 14 no. 2
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 31 May 2022

Osamah M. Al-Qershi, Junbum Kwon, Shuning Zhao and Zhaokun Li

Abstract

Purpose

For the case of many content features, this paper aims to investigate which content features in video and text ads contribute more to accurately predicting the success of crowdfunding by comparing prediction models.

Design/methodology/approach

With 1,368 features extracted from 15,195 Kickstarter campaigns in the USA, the authors compare base models such as logistic regression (LR) with tree-based homogeneous ensembles such as eXtreme gradient boosting (XGBoost) and heterogeneous ensembles such as XGBoost + LR.

Findings

XGBoost shows higher prediction accuracy than LR (82% vs 69%), in contrast to the findings of a previous relevant study. Regarding important content features, humans (e.g. founders) are more important than visual objects (e.g. products). In both spoken and written language, words related to experience (e.g. eat) or perception (e.g. hear) are more important than cognitive (e.g. causation) words. In addition, a focus on the future is more important than a present or past time orientation. Speech aids (see and compare) that complement visual content are also effective, and positive tone matters in speech.

Research limitations/implications

This research makes theoretical contributions by identifying the more important visual (human) and language features (experience, perception and future time). Also, in a multimodal context, complementary cues (e.g. speech aids) across different modalities help. Furthermore, the noncontent parts of speech such as positive “tone” or pace of speech are important.

Practical implications

Founders are encouraged to assess and revise the content of their video or text ads as well as their basic campaign features (e.g. goal, duration and reward) before they launch their campaigns. Next, overly complex ensembles may suffer from overfitting problems. In practice, model validation using unseen data is recommended.

Originality/value

Rather than reducing the number of content feature dimensions (Kaminski and Hopp, 2020), this study enables advanced prediction models to accommodate many content features, which raises prediction accuracy substantially.

Article
Publication date: 7 March 2016

Stephan Körner and Frank Holzäpfel

Abstract

Purpose

Wake vortices that are generated by an aircraft as a consequence of lift constitute a potential danger to the following aircraft. To predict and avoid dangerous situations, wake vortex transport and decay models have been developed. Being based on different model physics, they can complement each other with their individual strengths. This paper investigates the skill of a Multi-Model Ensemble (MME) approach to improve prediction performance. To this end, the paper uses wake vortex models developed by NASA (APA3.2, APA3.4, TDP2.1) and by DLR (P2P). Furthermore, this paper analyzes the possibility of using the ensemble spread to compute uncertainty envelopes.

Design/methodology/approach

An MME approach called Reliability Ensemble Averaging (REA) is adapted and applied to the wake vortex predictions. To train the ensemble, a set of wake vortex measurements conducted at the airports of Frankfurt (WakeFRA) and Munich (WakeMUC) and at the special airport Oberpfaffenhofen was used.
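
The abstract does not spell out the REA weighting, so the sketch below uses a simpler stand-in, inverse-RMSE weighting, only to illustrate the general shape of a trained multi-model ensemble: members that fit the training measurements better receive larger weights, and the spread of the member forecasts gives a rough uncertainty envelope. The function names, the one-standard-deviation envelope and the toy numbers are all assumptions, not the authors' method.

```python
from math import sqrt
from statistics import mean, pstdev


def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return sqrt(mean((p - o) ** 2 for p, o in zip(pred, obs)))


def inverse_rmse_weights(train_preds, train_obs):
    """Normalized weights proportional to 1 / RMSE on the training data."""
    # The floor avoids division by zero for a member with a perfect fit.
    inv = [1.0 / max(rmse(p, train_obs), 1e-12) for p in train_preds]
    total = sum(inv)
    return [w / total for w in inv]


def ensemble_forecast(member_values, weights):
    """Weighted mean plus an envelope of +/- one standard deviation of the members."""
    center = sum(w * v for w, v in zip(weights, member_values))
    spread = pstdev(member_values)
    return center, (center - spread, center + spread)


# Hypothetical training data: two models compared against three measurements.
train_obs = [1.0, 2.0, 3.0]
train_preds = [
    [1.1, 2.1, 3.1],  # model A, small error
    [1.5, 2.5, 3.5],  # model B, larger error
]
weights = inverse_rmse_weights(train_preds, train_obs)
center, (lower, upper) = ensemble_forecast([4.0, 5.0], weights)
print(weights, center, lower, upper)
```

Here model A receives the larger weight, so the ensemble forecast lies closer to its value than to model B's.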

Findings

The REA approach can outperform the best member of the ensemble, on average, regarding the root-mean-square error. Moreover, the ensemble delivers reasonable uncertainty envelopes.

Practical implications

Reliable wake vortex predictions may be applicable for both tactical optimization of aircraft separation at airports and airborne wake vortex prediction and avoidance.

Originality/value

Ensemble approaches are widely used in weather forecasting, but they have never been applied to wake vortex predictions. Until today, the uncertainty envelopes for wake vortex forecasts have been computed from, among others, perturbed initial conditions or perturbed physics, uncertainties in environmental conditions or safety margins, but not from the spread of structurally independent model forecasts.

Details

Aircraft Engineering and Aerospace Technology: An International Journal, vol. 88 no. 2
Type: Research Article
ISSN: 1748-8842

Article
Publication date: 28 May 2021

Zhibin Xiong and Jun Huang

Abstract

Purpose

Ensemble models that combine multiple base classifiers have been widely used to improve prediction performance in credit risk evaluation. However, an arbitrary selection of base classifiers is problematic. The purpose of this paper is to develop a framework for selecting base classifiers to improve the overall classification performance of an ensemble model.

Design/methodology/approach

In this study, selecting base classifiers is treated as a feature selection problem, where the output from a base classifier can be considered a feature. The proposed method, correlation-based classifier selection using the maximum information coefficient (MIC-CCS), selects the features (classifiers) using nonlinear optimization programming, which seeks to optimize the relationship between the accuracy and diversity of base classifiers based on MIC.

Findings

The empirical results show that ensemble models perform better than stand-alone ones, whereas the ensemble model based on MIC-CCS outperforms the ensemble models with unselected base classifiers and other ensemble models based on traditional forward and backward selection methods. Additionally, the classification performance of the ensemble model in which correlation is measured with MIC is better than that measured with the Pearson correlation coefficient.

Research limitations/implications

The study provides an alternate solution to effectively select base classifiers that are significantly different, so that they can provide complementary information and, as these selected classifiers have good predictive capabilities, the classification performance of the ensemble model is improved.

Originality/value

This paper introduces MIC to the correlation-based selection process to better capture nonlinear and nonfunctional relationships in a complex credit data structure and constructs a novel nonlinear programming model for base classifier selection that has not been used in other studies.

Article
Publication date: 5 October 2015

Zheng Jiang, Haobo Qiu, Ming Zhao, Shizhan Zhang and Liang Gao

Abstract

Purpose

In multidisciplinary design optimization (MDO), if the relationships between design variables and some output parameters, which are important performance constraints, are complex implicit problems, plenty of time should be spent on computationally expensive simulations to identify whether the implicit constraints are satisfied with the given design variables during the optimization iteration process. The purpose of this paper is to propose an ensemble of surrogates-based analytical target cascading (ESATC) method to tackle such MDO engineering design problems with reduced computational cost and high optimization accuracy.

Design/methodology/approach

Different surrogate models are constructed based on the sample point sets obtained by the Latin hypercube sampling (LHS) method. Then, according to the error metric of each surrogate model, the repeated ensemble of surrogates is constructed to approximate the implicit objective functions and constraints. Under the framework of analytical target cascading (ATC), the MDO problem is decomposed into several optimization subproblems, and the function of the analysis module of each subproblem is simulated by the repeated ensemble of surrogates, working together to find the optimum solution.
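
As an illustration of the sampling step only, the sketch below draws a basic Latin hypercube sample on the unit hypercube: each dimension is split into as many equal strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently per dimension. The function name `latin_hypercube` and its parameters are hypothetical; the abstract does not say which LHS variant or software the authors used.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum per dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniformly drawn value inside each of the n_samples equal-width strata.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


points = latin_hypercube(n_samples=5, n_dims=3, seed=0)
for point in points:
    print(point)
```

Each printed coordinate falls in a different fifth of the unit interval for every dimension, which is the property that distinguishes LHS from plain random sampling. The points would then be scaled to the actual design variable ranges before running the expensive simulations.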

Findings

The proposed method shows better modeling accuracy and robustness than ATC methods based on individual surrogate models. A numerical benchmark problem and an industrial case study of the structural design of a super heavy vertical lathe machine tool are utilized to demonstrate the accuracy and efficiency of the proposed method.

Originality/value

This paper integrates a repeated ensemble method with ATC strategy to construct the ESATC framework which is an effective method to solve MDO problems with implicit constraints and black-box objectives.

Article
Publication date: 18 October 2018

Kalyan Nagaraj, Biplab Bhattacharjee, Amulyashree Sridhar and Sharvani GS

Abstract

Purpose

Phishing is one of the major threats affecting businesses worldwide in current times. Organizations and customers face the hazards arising out of phishing attacks because of anonymous access to vulnerable details. Such attacks often result in substantial financial losses. Thus, there is a need for effective intrusion detection techniques to identify and possibly nullify the effects of phishing. Classifying phishing and non-phishing web content is a critical task in information security protocols, and foolproof mechanisms have yet to be implemented in practice. The purpose of the current study is to present an ensemble machine learning model for classifying phishing websites.

Design/methodology/approach

A publicly available data set comprising 10,068 instances of phishing and legitimate websites was used to build the classifier model. Feature extraction was performed by deploying a group of methods, and the relevant features extracted were used for building the model. A twofold ensemble learner was developed by feeding the results of a random forest (RF) classifier into a feedforward neural network (NN). Performance of the ensemble classifier was validated using k-fold cross-validation. The twofold ensemble learner was implemented as a user-friendly, interactive decision support system for classifying websites as phishing or legitimate ones.

Findings

Experimental simulations were performed to assess and compare the performance of the ensemble classifiers. The statistical tests indicated that the RF_NN model gave superior performance, with an accuracy of 93.41 per cent and a minimal mean squared error of 0.000026.

Research limitations/implications

The research data set used in this study is publicly available and easy to analyze. Comparative analysis with other real-time data sets of recent origin must be performed to ensure generalization of the model against various security breaches. Different variants of phishing threats must be detected rather than focusing only on phishing website detection.

Originality/value

To the authors' knowledge, the twofold ensemble model has not been applied to the classification of phishing websites in any previous study.

Details

Journal of Systems and Information Technology, vol. 20 no. 3
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 5 October 2018

Xiaofang Guo, Hui Shi, Chenglong Wei and Xiao Dong Chen

Abstract

Purpose

The purpose of this paper is to reveal how the thermal properties of Mongolian clothing differ from those of current western clothing and to explain its environmental adaptation to the climate of the Mongolian plateau in China.

Design/methodology/approach

Thermal insulation and the temperature rating (TR) of eight Mongolian robe ensembles and two western clothing ensembles were investigated by manikin testing and wearing trials, respectively. The clothing area factor (fcl) of the Mongolian clothing was measured by the photographic method and estimated using the equation from ISO 15831. Finally, the TR prediction model for Mongolian clothing was built and compared with current models for western clothing in ISO 7730 and for Tibetan clothing in a previous article.

Findings

The results demonstrated that the total thermal insulation of Mongolian robe ensembles was much greater than that of western clothing ensembles and ranged from 1.81 clo to 3.11 clo over the whole year. The fcl of the Mongolian clothing should be determined by the photographic method because the differences between the two methods were large, ranging from 0.6 to 13.9 percent. The TR prediction model for Mongolian robe ensembles is TR=25.57−7.13Icl, which revealed that the environmental adaptation of Mongolian clothing was much better than that of western clothing and similar to that of Tibetan clothing.
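
As a worked illustration of the reported linear model, the sketch below evaluates TR=25.57−7.13Icl at two insulation values. It assumes that Icl is expressed in clo and that TR is a temperature in °C; the abstract states neither unit explicitly. Plugging in the endpoints of the reported insulation range is purely illustrative, since the abstract does not confirm that Icl in the model is the same quantity as the total thermal insulation. The function name is hypothetical.

```python
def temperature_rating(icl_clo):
    """Evaluate the reported model TR = 25.57 - 7.13 * Icl (Icl assumed to be in clo)."""
    return 25.57 - 7.13 * icl_clo


# Illustrative inputs: the endpoints of the insulation range quoted in the abstract.
for icl in (1.81, 3.11):
    print(f"Icl = {icl:.2f} clo -> TR = {temperature_rating(icl):.2f}")
```

Under these assumptions, each additional clo of insulation lowers the predicted temperature rating by 7.13 units.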

Originality/value

The research findings give detailed information about the thermal properties of Mongolian clothing in China, and explain the environmental adaptation of Mongolian clothing to the cold and changing climate.

Details

International Journal of Clothing Science and Technology, vol. 30 no. 6
Type: Research Article
ISSN: 0955-6222

Article
Publication date: 6 February 2017

Aytug Onan

Abstract

Purpose

The immense quantity of available unstructured text documents serves as one of the largest sources of information. Text classification can be an essential task for many purposes in information retrieval, such as document organization, text filtering and sentiment analysis. Ensemble learning has been extensively studied to construct efficient text classification schemes with higher predictive performance and generalization ability. The purpose of this paper is to provide diversity among the classification algorithms of the ensemble, which is a key issue in ensemble design.

Design/methodology/approach

An ensemble scheme based on hybrid supervised clustering is presented for text classification. In the presented scheme, supervised hybrid clustering, which is based on cuckoo search algorithm and k-means, is introduced to partition the data samples of each class into clusters so that training subsets with higher diversities can be provided. Each classifier is trained on the diversified training subsets and the predictions of individual classifiers are combined by the majority voting rule. The predictive performance of the proposed classifier ensemble is compared to conventional classification algorithms (such as Naïve Bayes, logistic regression, support vector machines and C4.5 algorithm) and ensemble learning methods (such as AdaBoost, bagging and random subspace) using 11 text benchmarks.
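
The combination step named above, the majority voting rule, can be sketched as follows. Only the voting is shown; the cuckoo search and k-means clustering that produce the diversified training subsets are not reproduced here. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per document.

    predictions[c][d] is the label that classifier c assigns to document d.
    Ties go to the label that appears first among the tied votes.
    """
    combined = []
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Three hypothetical classifiers labelling three documents.
predictions = [
    ["spam", "ham", "spam"],
    ["spam", "spam", "ham"],
    ["ham", "ham", "ham"],
]
result = majority_vote(predictions)
print(result)
```

With an odd number of classifiers and two classes, as in this toy case, ties cannot occur; with more classes or an even number of voters, the tie-breaking rule becomes a design choice.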

Findings

The experimental results indicate that the presented classifier ensemble outperforms the conventional classification algorithms and ensemble learning methods for text classification.

Originality/value

The presented ensemble scheme is the first to use supervised clustering to obtain a diverse ensemble for text classification.

Details

Kybernetes, vol. 46 no. 2
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 5 June 2009

Víctor M. González, Bonnie Nardi and Gloria Mark

Abstract

Purpose

An ensemble is an intermediate unit of work between action and activity in the hierarchical framework proposed by classical activity theory. Ensembles are the mid‐level of activity, offering more flexibility than objects, but more purposeful structure than actions. The paper aims to introduce the notion of ensembles to understand the way object‐related activities are instantiated in practice.

Design/methodology/approach

The paper presents an analysis of the practices of professional information workers in two different companies using direct and systematic observation of human behavior. It also provides an analysis and discussion of the activity theory literature and how it has been applied in areas such as human‐computer interaction and computer‐supported collaborative work.

Findings

The authors illustrate the relevance of the notion of ensembles for activity theory and suggest some benefits of this conceptualization for analyzing human work in areas such as human‐computer interaction and computer‐supported collaborative work.

Research limitations/implications

The notion of ensembles can be useful for the development of a computing infrastructure oriented to more effectively supporting work activities.

Originality/value

The paper shows that the value of the notion of ensembles is to close a conceptual gulf not adequately addressed in activity theory, and to understand the practical aspects of the instantiation of objects over time.

Details

Information Technology & People, vol. 22 no. 2
Type: Research Article
ISSN: 0959-3845

Article
Publication date: 8 February 2021

Wiah Wardiningsih and Olga Troynikov

Abstract

Purpose

This paper aims to examine the influence of hip protective clothing on ensemble performance attributes related to thermal comfort. It also explores the effect on protective pads of various materials and the arrangements of material. The thermal comfort characteristics are thermal insulation and moisture vapour resistance.

Design/methodology/approach

For this research, four ensembles of clothing were used: one ensemble without hip protective clothing and three ensembles with hip protective clothing. A thermal manikin was used to test the thermal insulation and moisture vapour resistance of the ensembles.

Findings

The findings revealed that incorporating hip protective clothing into the clothing ensembles influenced the thermal resistance and moisture vapour resistance of the ensemble. In the “all zones group,” the influence of the hip protective clothing depended on clothing style, with hipster-style clothing producing insignificant changes. In the “hip zones group” and “stomach and hip zones group,” hip protective clothing strongly influenced the thermal comfort attributes of ensembles. Pad material and volume play important roles in these changes in thermal comfort attributes.

Originality/value

These outcomes are useful for the design and engineering of hip protective clothing, where maximizing protection while minimizing thermal and moisture vapour resistance is critical for wear comfort and adherence in warm or hot conditions. The designer should consider that the material, volume and thickness of the protective pad affect the overall thermal comfort attributes of the hip protective clothing.
