Search results

1 – 10 of over 4000
Article
Publication date: 7 November 2023

Christian Nnaemeka Egwim, Hafiz Alaka, Youlu Pan, Habeeb Balogun, Saheed Ajayi, Abdul Hye and Oluwapelumi Oluwaseun Egunjobi

Abstract

Purpose

The study aims to develop a multilayer, highly effective ensemble of ensembles predictive model (stacking ensemble) using several hyperparameter-optimized ensemble machine learning (ML) methods (bagging and boosting ensembles) trained with high-volume data points retrieved from Internet of Things (IoT) emission sensors and time-corresponding meteorology and traffic data.

Design/methodology/approach

First, the study tested the big data hypothesis by developing sample ensemble predictive models on data samples of different sizes and comparing their results. Second, it developed a standalone model and several bagging and boosting ensemble models and compared their results. Finally, it used the best-performing bagging and boosting predictive models as input estimators to develop a novel multilayer, highly effective stacking ensemble predictive model.
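The stacking step described above can be illustrated with a minimal sketch. It is not the authors' model: the two base learners (a mean predictor and a one-variable least-squares line) are hypothetical stand-ins for the bagging and boosting ensembles, the meta-learner is assumed to be a no-intercept least-squares combiner, and the data are toy values chosen for illustration.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Stand-in base learner 1: always predicts the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Stand-in base learner 2: one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def out_of_fold(fit, xs, ys, k):
    """Predict each point with a model trained on the other k-1 folds."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds

def fit_meta(p1, p2, ys):
    """Meta-learner: least-squares weights (no intercept) for two inputs,
    solved from the 2x2 normal equations with Cramer's rule."""
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * y for a, y in zip(p1, ys))
    b2 = sum(b * y for b, y in zip(p2, ys))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Toy data (illustrative only): y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]

# Level 1: out-of-fold predictions become the meta-learner's inputs.
w_mean, w_line = fit_meta(out_of_fold(fit_mean, xs, ys, 3),
                          out_of_fold(fit_line, xs, ys, 3), ys)

# Level 0 models are refit on all data for final predictions.
base_mean, base_line = fit_mean(xs, ys), fit_line(xs, ys)

def stacked(x):
    return w_mean * base_mean(x) + w_line * base_line(x)
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base learners have already seen, is the usual way to keep the second layer from simply rewarding the base model that overfits most.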

Findings

The results showed data size to be one of the main determinants of ensemble ML predictive power. Second, they showed that, compared with using a single algorithm, the cumulative result from ensemble ML algorithms is usually better in terms of prediction accuracy. Finally, they showed the stacking ensemble to be a better model for predicting PM2.5 concentration levels than the bagging and boosting ensemble models.

Research limitations/implications

A limitation of this study is the trade-off between the performance of this novel model and the computational time required to train it. Whether this gap can be closed remains an open research question, and future research should attempt to close it. Future studies can also integrate this novel model into a personal air quality messaging system to inform the public of pollution levels and improve public access to air quality forecasts.

Practical implications

The outcome of this study will help the public to proactively identify highly polluted areas, thus potentially reducing pollution-associated or pollution-triggered deaths, complications and transmission of COVID-19 (and other lung diseases) by encouraging avoidance behavior. When integrated into an air pollution monitoring system, it can also support informed lockdown decisions by government bodies.

Originality/value

This study fills a gap in the literature by providing a justification for selecting appropriate ensemble ML algorithms for PM2.5 concentration level predictive modeling. Second, it contributes to the big data hypothesis, which suggests that data size is one of the most important factors in ML predictive capability. Third, it supports the premise that, when using ensemble ML algorithms, the cumulative output is usually better in terms of prediction accuracy than that of a single algorithm. Finally, it develops a novel multilayer, high-performing, hyperparameter-optimized ensemble of ensembles predictive model that can accurately predict PM2.5 concentration levels with improved model interpretability and enhanced generalizability, and it provides a novel databank of historic pollution data from IoT emission sensors that can be purchased for research, consultancy and policymaking.

Details

Journal of Engineering, Design and Technology, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1726-0531

Article
Publication date: 15 August 2023

Chunping Zhou, Zheng Wei, Huajin Lei, Fangyun Ma and Wei Li

Abstract

Purpose

Surrogate models are extensively used in time-dependent reliability analysis to substitute for real models that are expensive to evaluate. Normally, different surrogate models have different scopes of application. However, analysts often lack the information needed to select the most appropriate surrogate model for a specific application. Thus, the result predicted by an individual surrogate model tends to be suboptimal or even inaccurate. An ensemble model can effectively address this concern. This work aims to study the application of ensemble models to the reliability analysis of time-dependent problems.

Design/methodology/approach

In this work, a method of reliability analysis for time-dependent problems based on ensemble learning of surrogate models is developed. The ensemble of surrogate models includes Kriging, radial basis function, and support vector machine. The prediction is approximated by the weighted average model. The ensemble learning of surrogate models is updated by finding and adding the sample points with large prediction errors throughout the entire procedure.
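The weighted average model mentioned above can be sketched as follows. The abstract does not state how the weights are chosen, so this sketch assumes weights inversely proportional to each surrogate's validation error; the three lambda functions are hypothetical stand-ins for the Kriging, radial basis function and support vector machine surrogates, and the error values are illustrative.

```python
from typing import Callable, List, Sequence

Model = Callable[[float], float]

def inverse_error_weights(errors: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Assumed weighting rule: weight_i is proportional to 1 / error_i,
    normalized so the weights sum to 1. eps guards against division by zero."""
    inverse = [1.0 / (e + eps) for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]

def ensemble_predict(models: Sequence[Model], weights: Sequence[float], x: float) -> float:
    """Weighted average of the individual surrogate predictions at x."""
    return sum(w * m(x) for m, w in zip(models, weights))

# Hypothetical stand-ins for the three surrogates (not real fitted models).
surrogates = [lambda x: 2.0 * x, lambda x: 2.0 * x + 1.0, lambda x: 2.0 * x - 1.0]
validation_errors = [0.5, 1.0, 1.0]  # illustrative values

weights = inverse_error_weights(validation_errors)
prediction = ensemble_predict(surrogates, weights, 3.0)
```

In an adaptive scheme such as the one the abstract describes, the errors (and therefore the weights) would be recomputed each time new sample points are added.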

Findings

The effectiveness of the proposed method is verified by several examples. The results show that the ensemble of surrogate models can effectively propagate the uncertainty of time-varying problems, and evaluate the reliability with high prediction accuracy and computational efficiency.

Originality/value

This work proposes an adaptive learning framework for the uncertainty propagation of time-dependent problems based on the ensemble of surrogate models. Compared with individual surrogate models, the ensemble model not only saves the effort of selecting an appropriate surrogate model, especially when knowledge of the problem is lacking, but also improves prediction accuracy and computational efficiency.

Details

Multidiscipline Modeling in Materials and Structures, vol. 19 no. 6
Type: Research Article
ISSN: 1573-6105

Article
Publication date: 21 December 2022

Agya Preet, Arunangshu Mukhopadhyay and Vinay Kumar Midha

Abstract

Purpose

Sweating is a thermo-regulatory behaviour that occurs when a person performs vigorous activity, even in cold climatic conditions. One important component of sweat is lactate, whose concentration inevitably changes with climatic conditions, age, gender, maturity and activity level. The purpose of this paper is to investigate the impact of changing lactate concentration on the moisture vapour transmission behaviour through multi-layered clothing ensembles.

Design/methodology/approach

For the investigation, sweat solutions representing male and female sweat were used. Two different multi-layered ensembles, with either spacer or fleece as the middle layer, were considered. Water vapour permeability and drying rate tests were carried out at standard atmospheric conditions. After testing, ANOVA was performed to determine the most significant parameters.

Findings

The constituent fabric layers behaved differently when tested individually and as layered components with different sweat solutions. Water vapour permeability was lower for the sweat solution with the higher lactate concentration than for the solution with the lower lactate concentration. Individual layers showed a higher rate of vapour permeability than multi-layered ensembles with the sweat solution containing the lower lactate concentration. The role of the PU-coated nylon fabric was predominant in the multi-layered ensembles. The difference in transmission between sweat solutions was higher for uni-directional stitched multi-layer spacer ensembles, whereas only a marginal difference was observed for the bi-directional seamed multi-layer spacer ensemble. The drying rate of the sweat solution containing the lower lactate concentration was higher than that of the other sweat solution for all the selected fabrics. The density of the liquid and the amount of water available for drying influenced the drying behaviour and thus accounted for the difference in drying rate between sweat solutions of differing lactate concentration. For individual layers, the contribution percentage of the type of layer (i.e. type of structure) was higher (nearly 93–96%) than that of the solution type (3.3–4.9%). For the multi-layer ensembles, the type of seam had the largest contribution percentage (71–77%), followed by solution type (10–15%), and the type of layers had the smallest (nearly 7–9%).

Practical implications

The findings of the study are expected to be realistic and important for the design and development of cold-weather garment ensembles for different genders according to their activity level, especially for military personnel and those performing combat activities.

Originality/value

This experimental work will provide insight into the behaviour of actual sweat transmission through layered fabric ensembles and into ways of preventing the accumulation of moisture near the skin surface by manufacturing suitable design structures (in terms of layering composition and seam patterns) according to the morphology and requirements of specific consumers.

Details

International Journal of Clothing Science and Technology, vol. 35 no. 2
Type: Research Article
ISSN: 0955-6222

Article
Publication date: 18 October 2022

Hasnae Zerouaoui, Ali Idri and Omar El Alaoui

Abstract

Purpose

Hundreds of thousands of deaths each year in the world are caused by breast cancer (BC). An early-stage diagnosis of this disease can positively reduce the morbidity and mortality rate by helping to select the most appropriate treatment options, especially by using histological BC images for the diagnosis.

Design/methodology/approach

The present study proposes and evaluates a novel approach consisting of 24 deep hybrid heterogeneous ensembles that combine the strengths of seven deep learning techniques (DenseNet 201, Inception V3, VGG16, VGG19, Inception-ResNet-V3, MobileNet V2 and ResNet 50) for feature extraction and four well-known classifiers (multi-layer perceptron, support vector machines, K-nearest neighbors and decision tree) by means of hard and weighted voting combination methods for the histological classification of BC medical images. Furthermore, the best deep hybrid heterogeneous ensembles were compared with the deep stacked ensembles to determine the best strategy for designing deep ensemble methods. The empirical evaluations used four classification performance criteria (accuracy, precision, recall [sensitivity] and F1-score) with fivefold cross-validation over the public BreakHis histological dataset at four magnification factors (40×, 100×, 200× and 400×). The Scott–Knott (SK) statistical test was used to cluster the designed techniques, and the Borda count voting method was used to rank the techniques belonging to the best SK cluster.
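The difference between the hard and weighted voting combination methods named above can be shown with a short sketch. The labels and weights here are illustrative assumptions (for example, the weights might be each classifier's validation accuracy); the abstract does not say how the weights were derived.

```python
from collections import Counter, defaultdict

def hard_vote(labels):
    """Hard voting: every classifier counts equally; the most frequent label wins."""
    return Counter(labels).most_common(1)[0][0]

def weighted_vote(labels, weights):
    """Weighted voting: each classifier adds its weight to the label it predicts;
    the label with the largest total weight wins."""
    scores = defaultdict(float)
    for label, weight in zip(labels, weights):
        scores[label] += weight
    return max(scores, key=scores.get)

# Hypothetical predictions from three classifiers for one image.
votes = ["malignant", "benign", "benign"]
# Hypothetical weights, e.g. validation accuracies (an assumption).
weights = [0.9, 0.4, 0.4]

hard_result = hard_vote(votes)                   # two of three say "benign"
weighted_result = weighted_vote(votes, weights)  # 0.9 for "malignant" vs 0.8 for "benign"
```

The example is constructed so that the two rules disagree: one strongly weighted classifier outvotes two weakly weighted ones.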

Findings

The results showed that the deep hybrid heterogeneous ensembles outperformed both the single techniques and the deep stacked ensembles, reaching accuracy values of 96.3, 95.6, 96.3 and 94 per cent for the four magnification factors 40×, 100×, 200× and 400×, respectively.

Originality/value

The proposed deep hybrid heterogeneous ensembles can be applied to BC diagnosis to assist pathologists in reducing missed diagnoses and proposing adequate treatments for patients.

Details

Data Technologies and Applications, vol. 57 no. 2
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 27 February 2023

Fatima-Zahrae Nakach, Hasnae Zerouaoui and Ali Idri

Abstract

Purpose

Histopathology biopsy imaging is currently the gold standard for the diagnosis of breast cancer in clinical practice. Pathologists examine the images at various magnifications to identify the type of tumor because if only one magnification is taken into account, the decision may not be accurate. This study explores the performance of transfer learning and late fusion to construct multi-scale ensembles that fuse different magnification-specific deep learning models for the binary classification of breast tumor slides.

Design/methodology/approach

Three pretrained deep learning techniques (DenseNet 201, MobileNet v2 and Inception v3) were used to classify breast tumor images over the four magnification factors of the Breast Cancer Histopathological Image Classification dataset (40×, 100×, 200× and 400×). To fuse the predictions of the models trained on different magnification factors, different aggregators were used, including weighted voting and seven meta-classifiers trained on slide predictions using class labels and the probabilities assigned to each class. The best cluster of the outperforming models was chosen using the Scott–Knott statistical test, and the top models were ranked using the Borda count voting system.

Findings

This study recommends the use of transfer learning and late fusion for histopathological breast cancer image classification by constructing multi-magnification ensembles because they perform better than models trained on each magnification separately.

Originality/value

The best multi-scale ensembles outperformed state-of-the-art integrated models and achieved a mean accuracy of 98.82 per cent, precision of 98.46 per cent, recall of 100 per cent and F1-score of 99.20 per cent.

Details

Data Technologies and Applications, vol. 57 no. 5
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 23 November 2022

Ibrahim Karatas and Abdulkadir Budak

Abstract

Purpose

The study aims to compare the prediction success of basic machine learning and ensemble machine learning models and, accordingly, to create novel prediction models by combining machine learning models to increase prediction success in construction labor productivity prediction.

Design/methodology/approach

Categorical and numerical data that many studies in the literature have used to predict construction labor productivity were preprocessed for analysis. The Python programming language was used to develop the machine learning models. After many trial variations, the models were combined to form the proposed novel voting and stacking meta-ensemble machine learning models. Finally, the models were compared using target and Taylor diagrams.

Findings

Meta-ensemble models were developed for labor productivity prediction by combining machine learning models. A voting ensemble was created by combining the et, gbm, xgboost, lightgbm, catboost and mlp models, and a stacking ensemble was created by combining the et, gbm, xgboost, catboost and mlp models, with the et model selected as the meta-learner. The voting and stacking meta-ensemble algorithms achieved higher prediction success than the other machine learning algorithms. Prediction success was measured with the model evaluation metrics MAE, MSE, RMSE and R2. For the voting meta-ensemble algorithm, the values of MAE, MSE, RMSE and R2 are 0.0499, 0.0045, 0.0671 and 0.7886, respectively. For the stacking meta-ensemble algorithm, the values of MAE, MSE, RMSE and R2 are 0.0469, 0.0043, 0.0658 and 0.7967, respectively.
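For readers unfamiliar with the four evaluation metrics, the following sketch computes them from their standard definitions. The input values are toy numbers chosen for illustration and have no connection to the study's data.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R2 using their standard definitions.
    Assumes y_true is not constant (otherwise R2 is undefined)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)     # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot              # coefficient of determination
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Toy values (illustrative only).
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

Because RMSE is the square root of MSE, the reported voting-ensemble values are mutually consistent: the square root of 0.0045 is approximately 0.0671.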

Research limitations/implications

The study compares machine learning algorithms and creates novel meta-ensemble machine learning algorithms to predict the labor productivity of construction formwork activity. Practitioners and project planners can use this model as a reliable and accurate tool for predicting the labor productivity of construction formwork activity prior to construction planning.

Originality/value

The study provides insight into the application of ensemble machine learning algorithms in predicting construction labor productivity. Additionally, novel meta-ensemble algorithms have been used and proposed. Therefore, it is hoped that predicting the labor productivity of construction formwork activity with high accuracy will make a great contribution to construction project management.

Details

Engineering, Construction and Architectural Management, vol. 31 no. 3
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 7 April 2015

Jie Sun, Hui Li, Pei-Chann Chang and Qing-Hua Huang

Abstract

Purpose

Previous research on credit scoring has mainly focused on static modeling of panel sample data sets over a certain period of time and has paid too little attention to dynamic incremental modeling. The purpose of this paper is to integrate the branch and bound algorithm with an incremental support vector machine (SVM) ensemble for dynamic modeling of credit scoring.

Design/methodology/approach

This new model hybridizes support vectors of old data with incremental corporate financial data in the process of dynamic ensemble modeling based on bagged SVM. In the incremental stage, multiple base SVM models are dynamically adjusted according to bagged, newly updated information for credit scoring. These updated base models are then combined to generate a dynamic credit score. In the empirical experiment, the new method was compared with the traditional non-incremental SVM ensemble model for credit scoring.

Findings

The results show that the new model is able to continuously and dynamically adjust credit scoring according to corporate incremental information, which helps produce better evaluation ability than the traditional model.

Originality/value

This research pioneers dynamic modeling for credit scoring with an incremental SVM ensemble. As time passes, new incremental samples are combined with support vectors of old samples to construct the SVM ensemble credit scoring model. The incremental model continuously adjusts itself to maintain good evaluation performance.

Details

Kybernetes, vol. 44 no. 4
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 29 July 2014

Chih-Fong Tsai and Chihli Hung

Abstract

Purpose

Credit scoring is important for financial institutions in order to accurately predict the likelihood of business failure. Related studies have shown that machine learning techniques, such as neural networks, outperform many statistical approaches to solving this type of problem, and advanced machine learning techniques, such as classifier ensembles and hybrid classifiers, provide better prediction performance than single machine learning based classification techniques. However, it is not known which type of advanced classification technique performs better in terms of financial distress prediction. The paper aims to discuss these issues.

Design/methodology/approach

This paper compares neural network ensembles and hybrid neural networks over three benchmarking credit scoring related data sets, which are Australian, German, and Japanese data sets.

Findings

The experimental results show that hybrid neural networks and neural network ensembles outperform the single neural network. Although hybrid neural networks perform slightly better than neural network ensembles in terms of prediction accuracy and errors on two of the data sets, there is no significant difference between the two types of prediction models.

Originality/value

The originality of this paper is in comparing two types of advanced classification techniques, i.e. hybrid and ensemble learning techniques, in terms of financial distress prediction.

Details

Kybernetes, vol. 43 no. 7
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 5 August 2014

Theodoros Anagnostopoulos and Christos Skourlas

Abstract

Purpose

The purpose of this paper is to understand the emotional state of a human being by capturing the speech utterances used during common conversation. Besides being thinking creatures, human beings are also sentimental and emotional organisms. There are six universal basic emotions plus a neutral emotion: happiness, surprise, fear, sadness, anger, disgust and neutral.

Design/methodology/approach

It is shown that, given enough acoustic evidence, the emotional state of a person can be classified by an ensemble majority voting classifier. The proposed ensemble classifier is constructed over three base classifiers: k nearest neighbors, C4.5 and a support vector machine (SVM) with a polynomial kernel.

Findings

The proposed ensemble classifier achieves better performance than each base classifier. It is compared with two other ensemble classifiers: one-against-all (OAA) multiclass SVM with radial basis function kernels and OAA multiclass SVM with hybrid kernels. The proposed ensemble classifier achieves better performance than the other two ensemble classifiers.

Originality/value

This paper performs emotion classification with an ensemble majority voting classifier that combines three specific types of base classifiers of low computational complexity. The base classifiers stem from different theoretical backgrounds to avoid bias and redundancy, which gives the proposed ensemble classifier the ability to generalize in the emotion domain space.

Details

Journal of Systems and Information Technology, vol. 16 no. 3
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 7 July 2020

Jiaming Liu, Liuan Wang, Linan Zhang, Zeming Zhang and Sicheng Zhang

Abstract

Purpose

The primary objective of this study was to recognize critical indicators in predicting blood glucose (BG) through data-driven methods and to compare the prediction performance of four tree-based ensemble models, i.e. bagging with tree regressors (bagging-decision tree [Bagging-DT]), AdaBoost with tree regressors (Adaboost-DT), random forest (RF) and gradient boosting decision tree (GBDT).

Design/methodology/approach

This study proposed a majority voting feature selection method that combines lasso regression with the Akaike information criterion (AIC) (LR-AIC), lasso regression with the Bayesian information criterion (BIC) (LR-BIC) and RF to select indicators with excellent predictive performance from an initial 38 indicators in 5,642 samples. The selected features were used to build the tree-based ensemble models. The 10-fold cross-validation (CV) method was used to evaluate the performance of each ensemble model.
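A majority voting feature selection step of the kind described above might look like the following sketch. The voting threshold (a strict majority of selectors) is an assumption, since the abstract does not state it, and the indicator names and the sets attributed to each selector are hypothetical placeholders rather than the study's results.

```python
from collections import Counter

def majority_vote_features(selections, min_votes=None):
    """Keep the features chosen by at least min_votes selectors.
    By default (an assumption) a strict majority of selectors is required."""
    if min_votes is None:
        min_votes = len(selections) // 2 + 1
    # Count, for each feature, how many selectors picked it.
    counts = Counter(f for selected in selections for f in set(selected))
    return sorted(f for f, c in counts.items() if c >= min_votes)

# Hypothetical outputs of the three selectors (placeholder indicator names).
lr_aic_selected = {"indicator_a", "indicator_b", "indicator_c"}
lr_bic_selected = {"indicator_a", "indicator_b"}
rf_selected = {"indicator_a", "indicator_c", "indicator_d"}

selected = majority_vote_features([lr_aic_selected, lr_bic_selected, rf_selected])
```

With three selectors, the default threshold keeps any indicator chosen by at least two of them, so `indicator_d`, chosen only once, is dropped.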

Findings

The results of feature selection indicated that age, corpuscular hemoglobin concentration (CHC), red blood cell volume distribution width (RBCVDW), red blood cell volume and leucocyte count are the five most important clinical/physical indicators in BG prediction. Furthermore, this study found that the GBDT ensemble model combined with the proposed majority voting feature selection method is better than the other three models with respect to prediction performance and stability.

Practical implications

This study proposed a novel BG prediction framework for better predictive analytics in health care.

Social implications

This study incorporated medical background and machine learning technology to reduce diabetes morbidity and formulate precise medical schemes.

Originality/value

The majority voting feature selection method combined with the GBDT ensemble model provides an effective decision-making tool for predicting BG and detecting diabetes risk in advance.
