Search results

1 – 10 of over 1000
Article
Publication date: 7 November 2023

Christian Nnaemeka Egwim, Hafiz Alaka, Youlu Pan, Habeeb Balogun, Saheed Ajayi, Abdul Hye and Oluwapelumi Oluwaseun Egunjobi

Abstract

Purpose

The study aims to develop a multilayer high-effective ensemble of ensembles predictive model (stacking ensemble) using several hyperparameter optimized ensemble machine learning (ML) methods (bagging and boosting ensembles) trained with high-volume data points retrieved from Internet of Things (IoT) emission sensors, time-corresponding meteorology and traffic data.

Design/methodology/approach

First, the study tested the big data hypothesis by developing sample ensemble predictive models on different data sample sizes and comparing their results. Second, it developed a standalone model and several bagging and boosting ensemble models and compared their results. Finally, it used the best-performing bagging and boosting predictive models as input estimators to develop a novel multilayer high-effective stacking ensemble predictive model.

Findings

Results showed data size to be one of the main determinants of ensemble ML predictive power. Second, they showed that, compared with using a single algorithm, the cumulative result from ensemble ML algorithms is usually better in terms of predictive accuracy. Finally, they showed the stacking ensemble to be a better model for predicting PM2.5 concentration level than the bagging and boosting ensemble models.

Research limitations/implications

A limitation of this study is the trade-off between the performance of this novel model and the computational time required to train it. Whether this gap can be closed remains an open research question that future research should address. Future studies can also integrate this novel model into a personal air quality messaging system to inform the public of pollution levels and improve public access to air quality forecasts.

Practical implications

The outcome of this study will help the public to proactively identify highly polluted areas, thus potentially reducing pollution-associated or pollution-triggered deaths, complications and transmission of COVID-19 (and other lung diseases) by encouraging avoidance behavior, and will support informed lockdown decisions by government bodies when integrated into an air pollution monitoring system.

Originality/value

This study fills a gap in the literature by providing a justification for selecting appropriate ensemble ML algorithms for PM2.5 concentration level predictive modeling. Second, it contributes to the big data hypothesis theory, which suggests that data size is one of the most important factors of ML predictive capability. Third, it supports the premise that, when using ensemble ML algorithms, the cumulative output is usually better in terms of predictive accuracy than using a single algorithm. Finally, it develops a novel multilayer high-performant hyperparameter-optimized ensemble of ensembles predictive model that can accurately predict PM2.5 concentration levels with improved model interpretability and enhanced generalizability, and it provides a novel databank of historic pollution data from IoT emission sensors that can be purchased for research, consultancy and policymaking.

Details

Journal of Engineering, Design and Technology, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1726-0531

Article
Publication date: 11 July 2023

Abhinandan Chatterjee, Pradip Bala, Shruti Gedam, Sanchita Paul and Nishant Goyal

Abstract

Purpose

Depression is a mental health problem characterized by a persistent sense of sadness and loss of interest. EEG signals are regarded as the most appropriate instruments for diagnosing depression because they reflect the operating status of the human brain. The purpose of this study is the early detection of depression among people using EEG signals.

Design/methodology/approach

(i) Artifacts are removed by filtering, and linear and non-linear features are extracted; (ii) feature scaling is done using a standard scaler, while principal component analysis (PCA) is used for feature reduction; (iii) the linear features, the non-linear features and a combination of both (only for those whose accuracy is highest) are taken for further analysis, in which ML and DL classifiers are applied for the classification of depression; and (iv) in total, 15 distinct ML and DL methods that have been effectively utilized as classifiers to handle a variety of real-world issues are used: KNN, SVM, bagging SVM, RF, GB, Extreme Gradient Boosting, MNB, Adaboost, Bagging RF, BootAgg, Gaussian NB, RNN, 1DCNN, RBFNN and LSTM.
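
The feature scaling in step (ii) can be illustrated with a minimal sketch. It assumes that "standard" scaling means the usual z-score (zero mean, unit variance per feature, using the population standard deviation); the function name `standardize` and the sample feature values are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Scale each feature column to zero mean and unit variance (z-score)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid a ZeroDivisionError.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Hypothetical feature matrix: one row per EEG epoch, one column per feature.
features = [[10.0, 0.2], [12.0, 0.4], [14.0, 0.6]]
scaled = standardize(features)
```

After scaling, features measured on very different ranges contribute on a comparable scale, which matters for variance-based reduction such as PCA.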

Findings

1. Among the linear features, alpha, alpha asymmetry, gamma and gamma asymmetry give the best results, while among the non-linear features, RWE, DFA, CD and AE give the best results.
2. For linear features, gamma and alpha asymmetry gave 99.98% accuracy with Bagging RF, while gamma asymmetry gave 99.98% accuracy with BootAgg.
3. For non-linear features, RWE and DFA reached 99.84% accuracy with RF, DFA reached 99.97% accuracy with XGBoost and RWE reached 99.94% accuracy with BootAgg.
4. With DL, for linear features, gamma asymmetry gave more than 96% accuracy with RNN and 91% accuracy with LSTM; for non-linear features, CD and AE achieved 89% accuracy with LSTM.
5. When linear and non-linear features were combined, the highest accuracy was achieved by Bagging RF (98.50%) with gamma asymmetry + RWE. In DL, alpha + RWE, gamma asymmetry + CD and gamma asymmetry + RWE achieved 98% accuracy with LSTM.

Originality/value

A novel dataset was collected from the Central Institute of Psychiatry (CIP), Ranchi, recorded using 128 channels, whereas major previous studies used fewer channels; the details of the study participants are summarized and a model is developed for statistical analysis using N-way ANOVA; artifacts are removed by high- and low-pass filtering of epoch data followed by re-referencing and independent component analysis for noise removal; linear features (band power and interhemispheric asymmetry) and non-linear features (relative wavelet energy, wavelet entropy, approximate entropy, sample entropy, detrended fluctuation analysis and correlation dimension) are extracted; this model utilizes Epoch (213,072) for 5 s EEG data, which allows the model to train for longer, thereby increasing the efficiency of classifiers. Feature scaling is done using a standard scaler rather than normalization because it helps increase the accuracy of the models (especially for deep learning algorithms), while PCA is used for feature reduction; the linear features, the non-linear features and a combination of both are taken for extensive analysis in conjunction with ML and DL classifiers for the classification of depression. The combination of linear and non-linear features (only for those whose accuracy is highest) is used for the best detection results.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 29 September 2023

Wen-Qian Lou, Bin Wu and Bo-Wen Zhu

Abstract

Purpose

This study aims to clarify influencing factors of overcapacity of new energy enterprises in China and accurately predict whether these enterprises have overcapacity.

Design/methodology/approach

Based on relevant data including the experience and evidence from the capital market in China, the research establishes a generic univariate selection-comparative machine learning model to study relevant factors that affect overcapacity of new energy enterprises from five dimensions. These include the governmental intervention, market demand, corporate finance, corporate governance and corporate decision. Moreover, the bridging approach is used to strengthen findings from quantitative studies via the results from qualitative studies.

Findings

The authors' results show that the overcapacity of new energy enterprises in China is brought about by the combined effect of governmental intervention, corporate governance and corporate decision. Governmental interventions increase the overcapacity risk of new energy enterprises mainly by distorting investment behaviors of enterprises. Corporate decision and corporate governance factors affect the overcapacity mainly by regulating the degree of overconfidence of the management team and the agency cost. Among the eight comparable integrated models, generic univariate selection-bagging exhibits the optimal comprehensive generalization performance; its area under the receiver operating characteristic curve (AUC), accuracy, precision and recall are 0.719, 0.960, 0.975 and 0.983, respectively.
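
The four scores reported above follow standard definitions, sketched here for reference. The function names, the confusion-matrix counts and the scores are hypothetical and do not come from the study; AUC is computed in its pairwise form, as the fraction of (positive, negative) pairs that the model ranks correctly.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

def auc(scores, labels):
    """Pairwise AUC: share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical counts and scores, for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
ranking_quality = auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

Because AUC depends only on how examples are ranked while accuracy depends on a decision threshold and the class balance, the two can differ noticeably for the same model.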

Originality/value

The proposed integrated model analyzes causes and predicts presence of overcapacity of new energy enterprises to help governments to formulate appropriate strategies to deal with overcapacity and new energy enterprises to optimize resource allocation. Ten main features which affect the overcapacity of new energy enterprises in China are identified through generic univariate selection model. Through the bridging approach, the impact of the main features on the overcapacity of new energy enterprises and the mechanism of the influence are analyzed.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 10 October 2023

Fatma Bakal Gumus and Ahmet Yapici

Abstract

Purpose

The purpose of this paper is to investigate the effect of the doping element on the structural and thermal properties, mechanical performance and failure mechanism of hexagonal nano boron nitride (h-BN)-reinforced basalt fabric (BF)/epoxy composites produced by the hand lay-up and vacuum bagging technique. h-BN particles doped into the composite materials increased the tensile, bending and impact strength of the composite at certain rates, while the 1 Wt.% h-BN addition shows the highest tensile and flexural strength.

Design/methodology/approach

The epoxy resin was doped with h-BN nanopowder at certain rates (0, 1, 2 and 4 Wt.%), and the epoxy:hardener ratio used in the study was 80%:20% by weight. Then, using the hand lay-up method with the aid of a roller, epoxy + hardener mixtures with and without nanoparticles were fed onto BFs, 12 layers each measuring 30 cm × 30 cm. The surplus epoxy resin was removed from the composite sheets using the vacuum bagging process, and the sheets were left to cure at room temperature for 24 h. Mechanical tests were performed according to ASTM D3039 for tensile, D7264 for three-point bending and D256 for Izod impact testing. After the tensile test, the morphologies of the fracture surfaces were examined with a stereomicroscope and various failure mechanisms were highlighted.

Findings

In this study, a series of basalt/epoxy composites with h-BN nanopowders were prepared to identify the effect of filler ratio on mechanical properties. The results of the mechanical experiments show that the addition of h-BN improves the mechanical performance of the materials at a certain rate. The tensile and flexural strengths of h-BN doped composites increase at a concentration of 1 Wt.% h-BN but decrease as the h-BN content increases further. The basalt/epoxy resin composite with higher mechanical properties could be a potential material in the automotive and aerospace industries.

Originality/value

The aim of this study is to contribute to the literature on this new combination of composites, their mechanical properties and their failure mechanisms. It presents a detailed characterization of each composite using X-ray diffraction (XRD), differential scanning calorimetry (DSC), Fourier transform infrared spectroscopy (FT-IR) and scanning electron microscopy.

Details

Aircraft Engineering and Aerospace Technology, vol. 95 no. 10
Type: Research Article
ISSN: 1748-8842

Article
Publication date: 3 November 2023

Vimala Balakrishnan, Aainaa Nadia Mohammed Hashim, Voon Chung Lee, Voon Hee Lee and Ying Qiu Lee

Abstract

Purpose

This study aims to develop a machine learning model to detect structure fire fatalities using a dataset comprising 11,341 cases from 2011 to 2019.

Design/methodology/approach

Exploratory data analysis (EDA) was conducted prior to modelling, in which ten machine learning models were experimented with.

Findings

The main fatal structure fire risk factors were fires originating from bedrooms, living areas and the cooking/dining areas. The highest fatality rate (20.69%) was reported for fires ignited due to bedding (23.43%), despite a low fire incident rate (3.50%). Using 21 structure fire features, Random Forest (RF) yielded the best detection performance with 86% accuracy, followed by Decision Tree (DT) with bagging (accuracy = 84.7%).

Research limitations/practical implications

Limitations of the study pertain to data quality and the grouping of categories in the data pre-processing stage, which could affect the performance of the models.

Originality/value

The study is the first of its kind to manipulate risk factors to detect fatal structure classification, particularly focussing on structure fire fatalities. Most of the previous studies examined the importance of fire risk factors and their relationship to the fire risk level.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 15 March 2023

Indranil Ghosh, Rabin K. Jana and Mohammad Zoynul Abedin

Abstract

Purpose

The prediction of Airbnb listing prices predominantly uses a set of amenity-driven features. Choosing an appropriate set of features from thousands of available amenity-driven features makes the prediction task difficult. This paper aims to propose a scalable, robust framework to predict listing prices of Airbnb units without using amenity-driven features.

Design/methodology/approach

The authors propose an artificial intelligence (AI)-based framework to predict Airbnb listing prices. The authors consider 75 thousand Airbnb listings from five US cities with more than 1.9 million observations. The proposed framework integrates (i) feature screening, (ii) stacking that combines gradient boosting, bagging and random forest, (iii) particle swarm optimization and (iv) explainable AI to accomplish the research objective.

Findings

The key findings have three aspects: prediction accuracy, homogeneity and identification of the most and least predictable cities. The proposed framework yields highly precise predictions. The predictability of listing prices varies significantly across cities. Listing prices are most predictable for Boston and least predictable for Chicago.

Practical implications

The framework and findings of the research can be leveraged by hosts to determine rental prices and to augment their service offerings by emphasizing key features.

Originality/value

Although individual components are known, the way they have been integrated into the proposed framework to derive a high-quality forecast of Airbnb listing prices is unique. It is scalable. The Airbnb listing price modeling literature rarely witnesses such a framework.

Details

International Journal of Contemporary Hospitality Management, vol. 35 no. 10
Type: Research Article
ISSN: 0959-6119

Article
Publication date: 5 May 2023

Nguyen Thi Dinh, Nguyen Thi Uyen Nhi, Thanh Manh Le and Thanh The Van

Abstract

Purpose

The problem of image retrieval and image description exists in various fields. In this paper, a model of content-based image retrieval and image content extraction based on the KD-Tree structure was proposed.

Design/methodology/approach

A Random Forest structure was built to classify the objects in each image on the basis of the balanced multibranch KD-Tree structure. For that purpose, a KD-Tree structure was generated by the Random Forest to retrieve a set of similar images for an input image. A KD-Tree structure is applied to determine a relationship word at the leaves to extract the relationship between objects in an input image. The content of an input image is described based on class names and relationships between objects.
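
To illustrate how a KD-Tree supports similarity retrieval, the sketch below uses the classic binary KD-Tree with nearest-neighbour search. This is a swapped-in textbook structure, not the balanced multibranch KD-Tree or the KD-Tree Random Forest proposed in the paper, and the two-dimensional feature vectors are hypothetical stand-ins for image descriptors.

```python
from math import dist

def build_kdtree(points, depth=0):
    """Recursively split the points on alternating axes at the median."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to the query (Euclidean distance)."""
    if node is None:
        return best
    if best is None or dist(query, node["point"]) < dist(query, best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if abs(diff) < dist(query, best):
        best = nearest(far, query, best)
    return best

# Hypothetical 2-D feature vectors, one per indexed image.
vectors = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(vectors)
match = nearest(tree, (9, 2))
```

The pruning test is what makes the tree faster than a linear scan on average: whole subtrees are skipped when they cannot contain a closer point.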

Findings

A model of image retrieval and image content extraction was proposed based on the proposed theoretical basis; simultaneously, the experiment was built on multi-object image datasets including Microsoft COCO and Flickr with an average image retrieval precision of 0.9028 and 0.9163, respectively. The experimental results were compared with those of other works on the same image dataset to demonstrate the effectiveness of the proposed method.

Originality/value

A balanced multibranch KD-Tree structure was built to apply to relationship classification on the basis of the original KD-Tree structure. Then, KD-Tree Random Forest was built to improve the classifier performance and retrieve a set of similar images for an input image. Concurrently, the image content was described in the process of combining class names and relationships between objects.

Details

Data Technologies and Applications, vol. 57 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 27 February 2023

Fatima-Zahrae Nakach, Hasnae Zerouaoui and Ali Idri

Abstract

Purpose

Histopathology biopsy imaging is currently the gold standard for the diagnosis of breast cancer in clinical practice. Pathologists examine the images at various magnifications to identify the type of tumor because if only one magnification is taken into account, the decision may not be accurate. This study explores the performance of transfer learning and late fusion to construct multi-scale ensembles that fuse different magnification-specific deep learning models for the binary classification of breast tumor slides.

Design/methodology/approach

Three pretrained deep learning techniques (DenseNet 201, MobileNet v2 and Inception v3) were used to classify breast tumor images over the four magnification factors of the Breast Cancer Histopathological Image Classification dataset (40×, 100×, 200× and 400×). To fuse the predictions of the models trained on different magnification factors, different aggregators were used, including weighted voting and seven meta-classifiers trained on slide predictions using class labels and the probabilities assigned to each class. The best cluster of the outperforming models was chosen using the Scott–Knott statistical test, and the top models were ranked using the Borda count voting system.
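
Two of the aggregation steps named above, weighted voting and the Borda count, can be sketched as follows. The function names, probabilities, weights and model rankings are hypothetical; the study's actual weights, meta-classifiers and Scott–Knott clustering are not reproduced here.

```python
def weighted_vote(probabilities, weights):
    """Late fusion: weighted average of per-model class probabilities."""
    n_classes = len(probabilities[0])
    total = sum(weights)
    fused = [
        sum(w * p[c] for p, w in zip(probabilities, weights)) / total
        for c in range(n_classes)
    ]
    return fused.index(max(fused)), fused

def borda_count(rankings):
    """Each ranking awards n-1 points to its first entry down to 0 for its last."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, name in enumerate(ranking):
            scores[name] = scores.get(name, 0) + (n - 1 - position)
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda name: (-scores[name], name))

# Hypothetical [benign, malignant] probabilities from four magnification-specific models.
per_magnification = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8], [0.55, 0.45]]
label, fused = weighted_vote(per_magnification, weights=[1, 2, 2, 1])

# Hypothetical rankings of three ensembles under three different metrics.
order = borda_count([["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]])
```

In this example the fused prediction is class 1 (malignant) even though two of the four models lean towards class 0, because the more heavily weighted models disagree with them.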

Findings

This study recommends the use of transfer learning and late fusion for histopathological breast cancer image classification by constructing multi-magnification ensembles because they perform better than models trained on each magnification separately.

Originality/value

The best multi-scale ensembles outperformed state-of-the-art integrated models and achieved an accuracy mean value of 98.82 per cent, precision of 98.46 per cent, recall of 100 per cent and F1-score of 99.20 per cent.

Details

Data Technologies and Applications, vol. 57 no. 5
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 20 March 2024

Ziming Zhou, Fengnian Zhao and David Hung

Abstract

Purpose

Higher energy conversion efficiency of internal combustion engines can be achieved with optimal control of unsteady in-cylinder flow fields inside a direct-injection (DI) engine. However, it remains a daunting task to predict the nonlinear and transient in-cylinder flow motion because it is highly complex and changes in both space and time. Recently, machine learning methods have demonstrated great promise in inferring relatively simple temporal flow field development. This paper aims to feature a physics-guided machine learning approach to realize high-accuracy and generalizable prediction for complex swirl-induced flow field motions.

Design/methodology/approach

To achieve high-fidelity time-series prediction of unsteady engine flow fields, this work features an automated machine learning framework with the following objectives: (1) The spatiotemporal physical constraint of the flow field structure is transferred to the machine learning structure. (2) The ML inputs and targets are designed efficiently to ensure high model convergence with limited sets of experiments. (3) The prediction results are optimized by an ensemble learning mechanism within the automated machine learning framework.

Findings

The proposed data-driven framework is proven effective across different time periods and different extents of unsteadiness of the flow dynamics, and the predicted flow fields are highly similar to the target field under various complex flow patterns. Among the described framework designs, the utilization of the spatial flow field structure is the featured improvement to the time-series flow field prediction process.

Originality/value

The proposed flow field prediction framework could be generalized to different crank angle periods, cycles and swirl ratio conditions, which could greatly promote real-time flow control and reduce experiments on in-cylinder flow field measurement and diagnostics.

Details

International Journal of Numerical Methods for Heat & Fluid Flow, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0961-5539

Article
Publication date: 2 January 2024

Xinyang Liu, Anyu Liu, Xiaoying Jiao and Zhen Liu

Abstract

Purpose

The purpose of the study is to investigate the impact of implementing anti-dumping duties on imported Australian wine to China in the short- and long-run, respectively.

Design/methodology/approach

First, the Difference-in-Differences (DID) method is used in this study to evaluate the short-run causal effect of implementing anti-dumping duties on imported Australian wine to China. Second, a Bayesian ensemble method is used to predict 2023–2025 wine exports from Australia to China. The disparity between the forecasts and the counterfactual prediction, which assumes no anti-dumping duties, represents the accumulated impact of the anti-dumping duties in the long run.
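
The core DID calculation can be sketched in its simplest two-group, two-period form: the change in the treated group minus the change in a comparison group over the same period. The function name and all export values are hypothetical, and the study's actual specification (control group, covariates, regression form) is not stated in the abstract.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two Difference-in-Differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical export volumes before and after the policy change.
effect = did_estimate(
    treated_pre=[100.0, 110.0],   # exports to the market that imposed duties
    treated_post=[20.0, 10.0],
    control_pre=[50.0, 54.0],     # exports to a comparison market without duties
    control_post=[55.0, 57.0],
)
```

Subtracting the comparison group's change removes trends common to both groups, under the assumption that the two groups would otherwise have moved in parallel.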

Findings

The anti-dumping duties resulted in a significant decline in red and rosé, white and sparkling wine exports to China by 92.59%, 99.06% and 90.06%, respectively, in 2021. In the long run, wine exports to China are projected to continue this downward trend, with an average annual growth rate of −21.92%, −38.90% and −9.54% for the three types of wine, respectively. In contrast, the counterfactual prediction indicates an increase of 3.20%, 20.37% and 4.55% for the respective categories. Consequently, the policy intervention is expected to result in a decrease of 96.11%, 93.15% and 84.11% in red and rosé, white and sparkling wine exports to China from 2021 to 2025.

Originality/value

The originality of this study lies in the creation of an economic paradigm for assessing policy impacts within the realm of wine economics. Methodologically, it also represents the pioneering application of the DID and Bayesian ensemble forecasting methods within the field of wine economics.

Details

International Journal of Contemporary Hospitality Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0959-6119
