Search results

1 – 10 of over 2000
Article
Publication date: 29 July 2014

Chih-Fong Tsai and Chihli Hung

Abstract

Purpose

Credit scoring is important for financial institutions in order to accurately predict the likelihood of business failure. Related studies have shown that machine learning techniques, such as neural networks, outperform many statistical approaches to solving this type of problem, and advanced machine learning techniques, such as classifier ensembles and hybrid classifiers, provide better prediction performance than single machine learning based classification techniques. However, it is not known which type of advanced classification technique performs better in terms of financial distress prediction. The paper aims to discuss these issues.

Design/methodology/approach

This paper compares neural network ensembles and hybrid neural networks over three benchmarking credit scoring related data sets, which are Australian, German, and Japanese data sets.

Findings

The experimental results show that hybrid neural networks and neural network ensembles outperform the single neural network. Although hybrid neural networks perform slightly better than neural network ensembles in terms of prediction accuracy and errors with two of the data sets, there is no significant difference between the two types of prediction models.

Originality/value

The originality of this paper is in comparing two types of advanced classification techniques, i.e. hybrid and ensemble learning techniques, in terms of financial distress prediction.

Details

Kybernetes, vol. 43 no. 7
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 18 October 2022

Hasnae Zerouaoui, Ali Idri and Omar El Alaoui

Abstract

Purpose

Hundreds of thousands of deaths each year in the world are caused by breast cancer (BC). An early-stage diagnosis of this disease can positively reduce the morbidity and mortality rate by helping to select the most appropriate treatment options, especially by using histological BC images for the diagnosis.

Design/methodology/approach

The present study proposes and evaluates a novel approach which consists of 24 deep hybrid heterogenous ensembles that combine the strengths of seven deep learning techniques (DenseNet 201, Inception V3, VGG16, VGG19, Inception-ResNet-V3, MobileNet V2 and ResNet 50) for feature extraction with four well-known classifiers (multi-layer perceptron, support vector machines, K-nearest neighbors and decision tree) by means of hard and weighted voting combination methods for the histological classification of BC medical images. Furthermore, the best deep hybrid heterogenous ensembles were compared to the deep stacked ensembles to determine the best strategy for designing deep ensemble methods. The empirical evaluations used four classification performance criteria (accuracy, precision, recall (sensitivity) and F1-score) and fivefold cross-validation over the public BreakHis histological dataset with four magnification factors (40×, 100×, 200× and 400×). The Scott–Knott (SK) statistical test was used to cluster the designed techniques, and the Borda count voting method to rank the techniques belonging to the best SK cluster.
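The two combination rules named above, hard voting and weighted voting, can be illustrated with a minimal sketch. This is not the authors' implementation: the labels, the classifier outputs and the weights (here imagined as validation accuracies) are hypothetical values chosen only to show how the two rules can disagree.

```python
from collections import Counter, defaultdict


def hard_vote(predictions):
    """Majority vote over one label per classifier (ties go to the label seen first)."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Add each classifier's weight to its predicted label; the heaviest label wins."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical outputs of four classifiers for a single image.
preds = ["malignant", "benign", "benign", "malignant"]
# Hypothetical weights, e.g. each classifier's validation accuracy.
weights = [0.95, 0.80, 0.82, 0.90]

# A fifth voter breaks the 2-2 tie in favour of "benign".
hard = hard_vote(preds + ["benign"])
# With weights, "malignant" collects 1.85 against 1.62 for "benign".
soft = weighted_vote(preds, weights)
```

Hard voting treats every classifier equally, whereas weighted voting lets more reliable classifiers outvote a larger number of weaker ones.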

Findings

Results showed that the deep hybrid heterogenous ensembles outperformed both their single constituent models and the deep stacked ensembles, reaching accuracy values of 96.3, 95.6, 96.3 and 94 per cent for the magnification factors 40×, 100×, 200× and 400×, respectively.

Originality/value

The proposed deep hybrid heterogenous ensembles can be applied for the BC diagnosis to assist pathologists in reducing the missed diagnoses and proposing adequate treatments for the patients.

Details

Data Technologies and Applications, vol. 57 no. 2
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 27 May 2021

Sara Tavassoli and Hamidreza Koosha

Abstract

Purpose

Customer churn prediction is one of the most well-known approaches to manage and improve customer retention. Machine learning techniques, especially classification algorithms, are very popular tools to predict the churners. In this paper, three ensemble classifiers are proposed based on bagging and boosting for customer churn prediction.

Design/methodology/approach

In this paper, three ensemble classifiers are proposed based on bagging and boosting for customer churn prediction. The first classifier, called boosted bagging, applies boosting to each bagging sample: before the bagging algorithm aggregates its final result, the authors try to improve the prediction by running a boosting algorithm on each bootstrap sample. The second proposed ensemble classifier, called bagged bagging, combines bagging with itself; in other words, the authors apply bagging to each bootstrap sample of the bagging algorithm. Finally, the third approach uses bagging of neural networks whose learning is based on a genetic algorithm.
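The nesting idea behind bagged bagging can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' code: the base learner (`fit_centroid`, a one-dimensional nearest-centroid rule), the data and the ensemble sizes are all hypothetical, chosen so that the example stays self-contained.

```python
import random
from collections import Counter


def bootstrap(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_centroid(data):
    """Toy base learner (an assumption): assign x to the class with the nearer mean."""
    xs0 = [x for x, y in data if y == 0]
    xs1 = [x for x, y in data if y == 1]
    if not xs0 or not xs1:  # the sample happens to contain a single class
        only = 1 if xs1 else 0
        return lambda x: only
    m0, m1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    return lambda x: 1 if abs(x - m1) < abs(x - m0) else 0


def majority(models):
    """Combine fitted models into one predictor by majority vote."""
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]


def bagging(data, fit, n_models, rng):
    """Plain bagging: fit one model per bootstrap sample, then vote."""
    return majority([fit(bootstrap(data, rng)) for _ in range(n_models)])


def bagged_bagging(data, fit, n_outer, n_inner, rng):
    """Bagging whose base learner is itself a bagging ensemble."""
    def inner(sample):
        return bagging(sample, fit, n_inner, rng)
    return bagging(data, inner, n_outer, rng)


# Hypothetical one-feature data set: (feature, label) pairs.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (1.6, 1), (1.7, 1), (1.8, 1), (1.9, 1)]
model = bagged_bagging(data, fit_centroid, n_outer=5, n_inner=5,
                       rng=random.Random(0))
```

Under the same assumptions, boosted bagging would follow the same shape, with `inner` replaced by a function that fits a boosting ensemble on each bootstrap sample.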

Findings

To examine the performance of all proposed ensemble classifiers, they are applied to two datasets. Numerical simulations illustrate that the proposed hybrid approaches outperform the simple bagging and boosting algorithms as well as base classifiers. Especially, bagged bagging provides high accuracy and precision results.

Originality/value

In this paper, three novel ensemble classifiers are proposed based on bagging and boosting for customer churn prediction. The proposed approaches can be applied not only to customer churn prediction but also to any other binary classification problem.

Details

Kybernetes, vol. 51 no. 3
Type: Research Article
ISSN: 0368-492X

Book part
Publication date: 24 March 2006

Valeriy V. Gavrishchaka

Abstract

Increasing availability of the financial data has opened new opportunities for quantitative modeling. It has also exposed limitations of the existing frameworks, such as low accuracy of the simplified analytical models and insufficient interpretability and stability of the adaptive data-driven algorithms. I make the case that boosting (a novel, ensemble learning technique) can serve as a simple and robust framework for combining the best features of the analytical and data-driven models. Boosting-based frameworks for typical financial and econometric applications are outlined. The implementation of a standard boosting procedure is illustrated in the context of the problem of symbolic volatility forecasting for IBM stock time series. It is shown that the boosted collection of the generalized autoregressive conditional heteroskedastic (GARCH)-type models is systematically more accurate than both the best single model in the collection and the widely used GARCH(1,1) model.

Details

Econometric Analysis of Financial and Economic Time Series
Type: Book
ISBN: 978-1-84950-388-4

Article
Publication date: 10 January 2020

Waqar Ahmed Khan, S.H. Chung, Muhammad Usman Awan and Xin Wen

Abstract

Purpose

The purpose of this paper is three-fold: to review the categories that mainly cover optimization algorithms (techniques) needed to improve the generalization performance and learning speed of the Feedforward Neural Network (FNN); to discover the change in research trends by analyzing all six categories (i.e. gradient learning algorithms for network training, gradient free learning algorithms, optimization algorithms for learning rate, bias and variance (underfitting and overfitting) minimization algorithms, constructive topology neural networks, metaheuristic search algorithms) collectively; and to recommend new research directions for researchers and help users understand the algorithms' real-world applications in solving complex management, engineering and health sciences problems.

Design/methodology/approach

The FNN has gained much attention from researchers to make a more informed decision in the last few decades. The literature survey is focused on the learning algorithms and the optimization techniques proposed in the last three decades. This paper (Part II) is an extension of Part I. For the sake of simplicity, the paper entitled “Machine learning facilitated business intelligence (Part I): Neural networks learning algorithms and applications” is referred to as Part I. To make the study consistent with Part I, the approach and survey methodology in this paper are kept similar to those in Part I.

Findings

Combining the work performed in Part I, the authors studied a total of 80 articles found by searching popular keywords. The FNN learning algorithms and optimization techniques identified in the selected literature are classified into six categories based on their problem identification, mathematical model, technical reasoning and proposed solution. Previously, in Part I, the two categories focusing on the learning algorithms (i.e. gradient learning algorithms for network training, gradient free learning algorithms) were reviewed with their real-world applications in management, engineering, and health sciences. Therefore, in the current paper, Part II, the remaining four categories, exploring optimization techniques (i.e. optimization algorithms for learning rate, bias and variance (underfitting and overfitting) minimization algorithms, constructive topology neural networks, metaheuristic search algorithms), are studied in detail. The explanation of each algorithm is enriched by discussing its technical merits, limitations, and applications in its respective category. Finally, the authors recommend new research directions which can contribute to strengthening the literature.

Research limitations/implications

The FNN contributions are rapidly increasing because of its ability to make reliably informed decisions. As with the learning algorithms reviewed in Part I, the focus is to enrich the comprehensive study by reviewing the remaining categories, which focus on the optimization techniques. However, future efforts may be needed to incorporate other algorithms into the six identified categories or to suggest a new category in order to continuously monitor the shift in research trends.

Practical implications

The authors studied the shift in research trends over three decades by collectively analyzing the learning algorithms and optimization techniques with their applications. This may help researchers to identify future research gaps to improve the generalization performance and learning speed, and users to understand the application areas of the FNN. For instance, research contributions in FNN over the last three decades have shifted from complex gradient-based algorithms to gradient free algorithms, from a trial-and-error fixed topology of hidden units to cascade topology, from initial guesses of hyperparameters to analytical calculation, and toward algorithms that converge at a global minimum rather than a local minimum.

Originality/value

The existing literature surveys include comparative studies of algorithms, identify the algorithms' application areas and focus on specific techniques; as a result, they may not be able to identify algorithm categories, shifts in research trends over time, frequently analyzed application areas, common research gaps and collective future directions. Parts I and II attempt to overcome these limitations by classifying articles into six categories covering a wide range of algorithms proposed to improve the FNN generalization performance and convergence rate. The classification of algorithms into six categories helps to analyze the shift in research trends, which makes the classification scheme significant and innovative.

Details

Industrial Management & Data Systems, vol. 120 no. 1
Type: Research Article
ISSN: 0263-5577

Open Access
Article
Publication date: 13 August 2020

Mariam AlKandari and Imtiaz Ahmad

Abstract

Solar power forecasting will have a significant impact on the future of large-scale renewable energy plants. Predicting photovoltaic power generation depends heavily on climate conditions, which fluctuate over time. In this research, we propose a hybrid model that combines machine-learning methods with the Theta statistical method for more accurate prediction of future solar power generation from renewable energy plants. The machine learning models include long short-term memory (LSTM), gated recurrent unit (GRU), AutoEncoder LSTM (Auto-LSTM) and a newly proposed Auto-GRU. To enhance the accuracy of the proposed Machine learning and Statistical Hybrid Model (MLSHM), we employ two diversity techniques, i.e. structural diversity and data diversity. To combine the predictions of the ensemble members in the proposed MLSHM, we exploit four combining methods: simple averaging, weighted averaging using a linear approach, weighted averaging using a non-linear approach, and combination through variance using an inverse approach. The proposed MLSHM scheme was validated on two real time-series datasets, namely Shagaya in Kuwait and Cocoa in the USA. The experiments show that the proposed MLSHM, using all the combination methods, achieved higher accuracy compared to the prediction of the traditional individual models. Results demonstrate that a hybrid model combining machine-learning methods with a statistical method outperformed a hybrid model that only combines machine-learning models without a statistical method.
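Two of the combining methods listed above, simple averaging and inverse-variance weighting, can be sketched as follows. The abstract does not give the exact formulas, so this assumes the standard form in which each member's weight is proportional to the reciprocal of its error variance; the forecast values and variances are hypothetical.

```python
def simple_average(forecasts):
    """Every ensemble member contributes equally."""
    return sum(forecasts) / len(forecasts)


def inverse_variance_weights(error_variances):
    """Weights proportional to 1 / variance, normalised to sum to 1 (assumed form)."""
    inverse = [1.0 / v for v in error_variances]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_average(forecasts, weights):
    """Weighted sum of the member forecasts; weights are expected to sum to 1."""
    return sum(f * w for f, w in zip(forecasts, weights))


# Hypothetical forecasts of three ensemble members for one time step.
forecasts = [10.0, 12.0, 14.0]
# Hypothetical error variances of the same members on held-out data.
error_variances = [1.0, 2.0, 4.0]

plain = simple_average(forecasts)
weights = inverse_variance_weights(error_variances)
combined = weighted_average(forecasts, weights)
```

In this example the first member has the smallest error variance, so the inverse-variance combination is pulled toward its forecast and falls below the simple average.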

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 2 January 2024

Raunaque Mujeeb Quaiser and Praveen Ranjan Srivastava

Abstract

Purpose

This research aims to identify the key factors affecting Outbound Open Innovation between Startups and Big organizations using the multiple criteria decision-making analysis (MCDM) approach. The MCDM technique ranks the four key factors identified from the literature study that can help to improve collaboration opportunities with Startups.

Design/methodology/approach

Key factors affecting Outbound Open Innovation between Startups and big organizations are identified from the extant literature. A questionnaire based on these four key factors is prepared to gather the views of startup employees, from the designer level up to the founder. MCDM techniques are used to evaluate the questionnaire responses. The ensemble technique is used to rank the key factors coming from three different MCDM methods.

Findings

The findings from the MCDM approach and Ensemble techniques give insight to the big organizations to facilitate outbound Open Innovation effectively. It also provides insight into the requirements of the startups and the kind of support they seek from the big organizations. The ranking can help the big organization close the gaps and make an informed decision to increase the effectiveness of the collaborations and boost innovation.

Originality/value

This is a unique research work where the MCDM approach is used to identify the ranking of key factors affecting outbound open innovation between startups and big organizations. The MCDM technique is followed by the ensemble method to rationalize the findings. Technology Relevance ranks highest, followed by Innovation Ecosystem, Organization commitment and Knowledge Sharing.

Details

Management Decision, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0025-1747

Article
Publication date: 29 July 2014

Hsu-Che Wu, Ya-Han Hu and Yen-Hao Huang

Abstract

Purpose

Credit ratings have become one of the primary references for financial institutions to assess credit risk. Conventional credit rating approaches mainly concentrated on two-class classification (i.e. good or bad credit), which lacks adequate precision to perform credit risk evaluations in practice. In addition, most previous research focused directly on employing various data mining techniques, and few studies discussed the influence of data preprocessing before classifier construction. The paper aims to discuss these issues.

Design/methodology/approach

This study applies nine-class classification (i.e. nine credit risk levels) to credit rating prediction. For the development of more accurate classifiers, the paper adopts a two-stage analysis, which integrates multiple data preprocessing and supervised learning techniques. Specifically, the first stage applies feature selection, data clustering, and data resampling methods to preprocess the data, and then the second stage utilizes several classification techniques and classifier ensembles to construct prediction models.

Findings

The results show that Bagging-DT with data resampling method achieves excellent accuracy (82.96 percent), indicating that the proposed two-stage prediction model is better than conventional one-stage models.

Originality/value

Practical implication of this study can lower credit rating expenses and also allow corporations to gain credit rating information instantly.

Details

Kybernetes, vol. 43 no. 7
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 5 July 2023

Maan Habib, Bashar Bashir, Abdullah Alsalman and Hussein Bachir

Abstract

Purpose

Slope stability analysis is essential for ensuring the safe design of road embankments. While various conventional methods, such as the finite element approach, are used to determine the safety factor of road embankments, there is ongoing interest in exploring the potential of machine learning techniques for this purpose.

Design/methodology/approach

Within the study context, the outcomes of the ensemble machine learning models will be compared and benchmarked against the conventional techniques used to predict this parameter.

Findings

Generally, the study results have shown that the proposed machine learning models provide rapid and accurate estimates of the safety factor of road embankments and are, therefore, promising alternatives to traditional methods.

Originality/value

Although machine learning algorithms hold promise for rapidly and accurately estimating the safety factor of road embankments, few studies have systematically compared their performance with traditional methods. To address this gap, this study introduces a novel approach using advanced ensemble machine learning techniques for efficient and precise estimation of the road embankment safety factor. Besides, the study comprehensively assesses the performance of these ensemble techniques, in contrast with established methods such as the finite element approach and empirical models, demonstrating their potential as robust and reliable alternatives in the realm of slope stability assessment.

Details

Multidiscipline Modeling in Materials and Structures, vol. 19 no. 5
Type: Research Article
ISSN: 1573-6105

Article
Publication date: 23 October 2020

Brunno e Souza Rodrigues, Carla Martins Floriano, Valdecy Pereira and Marcos Costa Roboredo

Abstract

Purpose

This paper presents an algorithm that can elicit all or any combination of parameters for the ELECTRE II, III or IV methods. The algorithm borrows some steps from a machine learning ensemble technique, the random forest, which is why the authors named the approach the Ranking Trees Algorithm.

Design/methodology/approach

First, for a given method, the authors generate a set of ELECTRE models, where each model solves a random sample of criteria and actions (alternatives). Second, for each generated model, all actions are projected in a 1D space; in general, the best actions have higher values in a 1D space than the worst ones; therefore, they can be used to guide the genetic algorithm in the final step, the optimization phase. Finally, in the optimization phase, each model has its parameters optimized.

Findings

The results can be used in two different ways: the authors can merge all models to find the elicited parameters, or they can ensemble the models, in which case the median of all ranks represents the final rank. The numerical examples achieved a Kendall tau rank correlation above 0.85, and these results could perform as well as the results obtained by a group of specialists.
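The ensemble option described above (taking the median of the ranks produced by several models) and the Kendall tau comparison can be sketched as follows. This is an illustrative sketch, not the Ranking Trees Algorithm itself: the three model rankings and the reference ranking are hypothetical, and the tau computed is the simple tie-free variant.

```python
from itertools import combinations
from statistics import median


def median_rank(rankings):
    """Aggregate rankings (dicts of action -> rank, 1 = best) by their median rank."""
    actions = list(rankings[0])
    med = {a: median(r[a] for r in rankings) for a in actions}
    # Equal medians keep the order of the first ranking (sorted is stable).
    ordered = sorted(actions, key=lambda a: med[a])
    return {a: position + 1 for position, a in enumerate(ordered)}


def kendall_tau(rank_a, rank_b):
    """Kendall tau between two tie-free rankings over the same actions."""
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) / 2
    return (concordant - discordant) / n_pairs


# Hypothetical rankings of four actions produced by three models.
model_rankings = [
    {"a": 1, "b": 2, "c": 3, "d": 4},
    {"a": 2, "b": 1, "c": 3, "d": 4},
    {"a": 1, "b": 3, "c": 2, "d": 4},
]
final = median_rank(model_rankings)

# Hypothetical reference ranking, e.g. one agreed by a group of specialists.
reference = {"a": 1, "b": 2, "c": 4, "d": 3}
tau = kendall_tau(final, reference)
```

The median makes the final rank robust to a single model that places an action unusually high or low, and the tau value summarises how many pairs of actions the final ranking orders the same way as the reference.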

Originality/value

For the first time, the elicitation of ELECTRE parameters is made by an ensemble technique composed of a set of uncorrelated multicriteria models that can generate robust solutions.

Details

Data Technologies and Applications, vol. 55 no. 1
Type: Research Article
ISSN: 2514-9288
