Search results

1 – 10 of over 2000
Book part
Publication date: 4 August 2014

M. Laura Frigotto, Graziano Coller and Paolo Collini

Abstract

Exploration and exploitation comprise one of the most well-known constructs in management and organization studies. However, there are three gaps in the extant literature on this topic. First, these studies focus mainly on large organizations and neglect small and medium-sized enterprises (SMEs) and new ventures. Second, when adopting a longitudinal perspective, the research typically consists of cross-sectional studies that fail to capture evolution. Third, the research focuses more on the role of antecedents and mediators of strategies that pursue exploration and exploitation than on the practices that embody such goals. In this chapter, we address these three gaps and complement the previous literature with a study of the growth of an SME from start-up to sale over a 19-year period (1993–2011). We depict the evolution of exploration and exploitation over time through an analysis of management system practices that employs a longitudinal perspective. We analyze the different roles that management systems have played in various stages of the growth paths of the organization. We show that the role of management systems in shaping exploration and exploitation only loosely depends on the design of these systems. The same management systems can fulfill an explorative function in one stage and an exploitative function in another, depending on how such systems are used. Conversely, across stages, the role of management systems typically changes from exploration to exploitation.

Details

Exploration and Exploitation in Early Stage Ventures and SMEs
Type: Book
ISBN: 978-1-78350-655-2

Article
Publication date: 15 March 2021

Putta Hemalatha and Geetha Mary Amalanathan

Abstract

Purpose

Adequate resources for learning and training on the data are an important constraint in developing an efficient classifier with outstanding performance. The data usually follow a biased distribution, in which classes are unequally represented within a dataset. This issue is known as the imbalance problem, and it is one of the most common issues occurring in real-time applications. Learning from imbalanced datasets is a ubiquitous challenge in the field of data mining. Imbalanced data degrade the performance of the classifier by producing inaccurate results.

Design/methodology/approach

This work proposes a novel fuzzy-based Gaussian synthetic minority oversampling (FG-SMOTE) algorithm to process imbalanced data. The Gaussian SMOTE technique relies on the nearest-neighbour concept to balance the ratio between the minority and majority classes. The ratio of the data belonging to the minority and majority classes is balanced using a fuzzy-based Levenshtein distance measure.
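As a rough illustration of the nearest-neighbour idea behind SMOTE-style oversampling, the sketch below interpolates between a minority point and one of its nearest minority neighbours. It shows plain SMOTE only, not the FG-SMOTE algorithm of the paper: the Gaussian and fuzzy Levenshtein components are not reproduced, and the function name `smote_sample`, its parameters and the toy data are hypothetical.

```python
import random

def smote_sample(minority, k=2, n_new=4, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (plain SMOTE idea)."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the base point among the other minority points
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical two-dimensional minority-class points
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sample(minority)
```

Because each synthetic point lies on a segment between two existing minority points, the new samples stay inside the region already occupied by the minority class.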

Findings

The performance and accuracy of the proposed algorithm are evaluated using a deep belief network classifier. The results show that the fuzzy-based Gaussian SMOTE technique achieved an AUC of 93.7%, an F1 score of 94.2% and a geometric mean score of 93.6%, derived from the confusion matrix.
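For readers unfamiliar with these measures, the following sketch shows how the F1 score and the geometric mean can be computed from the four cells of a binary confusion matrix. The counts are hypothetical and are not the paper's data, and the G-mean shown is one common definition (the square root of sensitivity times specificity). AUC is omitted because it requires ranked scores rather than a single confusion matrix.

```python
import math

def imbalance_metrics(tp, fn, fp, tn):
    """Compute common imbalance-aware metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)        # sensitivity, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(recall * specificity)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"recall": recall, "specificity": specificity,
            "precision": precision, "f1": f1,
            "g_mean": g_mean, "accuracy": accuracy}

# Hypothetical counts: 50 minority (positive) and 950 majority (negative) cases
scores = imbalance_metrics(tp=40, fn=10, fp=20, tn=930)
```

With these counts the accuracy is high because the majority class dominates, while the F1 score is noticeably lower, which is why imbalance studies tend to report F1, G-mean and AUC alongside or instead of accuracy.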

Research limitations/implications

The proposed research still leaves some challenges to be addressed, such as applying FG-SMOTE to multiclass imbalanced datasets and evaluating the dataset imbalance problem in a distributed environment.

Originality/value

The proposed algorithm fundamentally solves the data imbalance issues and challenges involved in handling the imbalanced data. FG-SMOTE has aided in balancing minority and majority class datasets.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 14 no. 2
Type: Research Article
ISSN: 1756-378X

Book part
Publication date: 26 October 2017

Son Nguyen, John Quinn and Alan Olinsky

Abstract

We propose an oversampling technique to increase the true positive rate (sensitivity) in classifying imbalanced datasets (i.e., those with a value for the target variable that occurs with a small frequency) and hence boost the overall performance measurements such as balanced accuracy, G-mean and area under the receiver operating characteristic (ROC) curve, AUC. This oversampling method is based on the idea of applying the Synthetic Minority Oversampling Technique (SMOTE) on only a selective portion of the dataset instead of the entire dataset. We demonstrate the effectiveness of our oversampling method with four real and simulated datasets generated from three models.

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78743-069-3

Article
Publication date: 23 June 2022

Kerim Koc, Ömer Ekmekcioğlu and Asli Pelin Gurgun

Abstract

Purpose

Central to the entire discipline of construction safety management is the concept of construction accidents. Although distinctive progress has been made in safety management applications over the last decades, the construction industry still accounts for a considerable percentage of all workplace fatalities across the world. This study aims to predict occupational accident outcomes based on national data using machine learning (ML) methods coupled with several resampling strategies.

Design/methodology/approach

An occupational accident dataset recorded in Turkey was collected. To deal with the class imbalance between the number of nonfatal and fatal accidents, the dataset was pre-processed with random under-sampling (RUS), random over-sampling (ROS) and the synthetic minority over-sampling technique (SMOTE). In addition, random forest (RF), Naïve Bayes (NB), K-nearest neighbor (KNN) and artificial neural networks (ANNs) were employed as ML methods to predict accident outcomes.
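The two random resampling strategies named above can be sketched in a few lines. This is a minimal illustration under assumed conventions, not the authors' pipeline: the function `random_resample`, its `mode` argument and the toy labels are hypothetical. Under-sampling discards majority rows without replacement, while over-sampling duplicates minority rows with replacement.

```python
import random
from collections import Counter

def random_resample(rows, labels, mode, seed=0):
    """Balance classes by random under-sampling ("under") or over-sampling ("over")."""
    if mode not in ("under", "over"):
        raise ValueError("mode must be 'under' or 'over'")
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    sizes = [len(members) for members in by_class.values()]
    target = min(sizes) if mode == "under" else max(sizes)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) > target:
            # RUS: keep a random subset of the larger class
            chosen = rng.sample(members, target)
        else:
            # ROS: pad the smaller class with random duplicates (none if already at target)
            chosen = members + rng.choices(members, k=target - len(members))
        out_rows.extend(chosen)
        out_labels.extend([label] * len(chosen))
    return out_rows, out_labels

# Hypothetical toy data: 8 nonfatal and 2 fatal records
rows = list(range(10))
labels = ["nonfatal"] * 8 + ["fatal"] * 2
_, under_labels = random_resample(rows, labels, "under")
_, over_labels = random_resample(rows, labels, "over")
under_counts = Counter(under_labels)
over_counts = Counter(over_labels)
```

Resampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class distribution.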

Findings

The results highlighted that RF outperformed the other methods when the dataset was preprocessed with RUS. The permutation importance results obtained through RF showed that the number of past accidents in the company, worker's age, material used, number of workers in the company, accident year and time of the accident were the most significant attributes.
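Permutation importance, mentioned above, measures how much a model's score drops when the values of one feature are shuffled. The sketch below illustrates the idea with a hypothetical rule-based `toy_model` standing in for the random forest and with made-up data; it is not the study's model or dataset.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild each row with only the chosen column replaced
            X_perm = [row[:col] + (value,) + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_model(row):
    # Hypothetical stand-in classifier that looks at feature 0 only
    return int(row[0] > 0.5)

# Hypothetical data: feature 0 drives the label, feature 1 is irrelevant
X = [(i / 10, ((i * 7) % 10) / 10) for i in range(10)]
y = [toy_model(row) for row in X]
importances = permutation_importance(toy_model, X, y)
```

Here the irrelevant feature receives an importance of zero because shuffling it never changes a prediction, while shuffling the feature the model depends on reduces accuracy.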

Practical implications

The proposed framework can be used on construction sites on a monthly basis to detect workers who have a high probability of experiencing fatal accidents, which can be a valuable decision-making input for safety professionals seeking to reduce the number of fatal accidents.

Social implications

Practitioners and occupational health and safety (OHS) departments of construction firms can focus on the most important attributes identified by analysis results to enhance the workers' quality of life and well-being.

Originality/value

The literature on accident outcome prediction is limited in terms of dealing with imbalanced datasets through integrated resampling techniques and ML methods in the construction safety domain. A novel utilization plan was proposed and enhanced by the analysis results.

Details

Engineering, Construction and Architectural Management, vol. 30 no. 9
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 1 October 2003

Details

Disaster Prevention and Management: An International Journal, vol. 12 no. 4
Type: Research Article
ISSN: 0965-3562

Article
Publication date: 9 April 2024

Lu Wang, Jiahao Zheng, Jianrong Yao and Yuangao Chen

Abstract

Purpose

With the rapid growth of the domestic lending industry, assessing whether the borrower of each loan is at risk of default is a pressing issue for financial institutions. Although there are some models that can handle such problems well, there are still some shortcomings in some aspects. The purpose of this paper is to improve the accuracy of credit assessment models.

Design/methodology/approach

In this paper, three stages are used to improve the classification performance of LSTM, so that financial institutions can more accurately identify borrowers at risk of default. In the first stage, the K-Means-SMOTE algorithm is used to eliminate the imbalance within the class. In the second stage, ResNet is used for feature extraction, and a two-layer LSTM is then used for learning to strengthen the ability of the neural network to mine and utilize deep information. Finally, the model performance is improved by using the IDWPSO algorithm for optimization when tuning the neural network.

Findings

On two unbalanced datasets (category ratios of 700:1 and 3:1 respectively), the multi-stage improved model was compared with ten other models using accuracy, precision, specificity, recall, G-measure, F-measure and the nonparametric Wilcoxon test. It was demonstrated that the multi-stage improved model showed a more significant advantage in evaluating the imbalanced credit dataset.

Originality/value

In this paper, the parameters of the ResNet-LSTM hybrid neural network, which can fully mine and utilize the deep information, are tuned by an innovative intelligent optimization algorithm to strengthen the classification performance of the model.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 28 May 2024

Kuo-Yi Lin and Thitipong Jamrus

Abstract

Purpose

Motivated by recent research indicating the significant challenges posed by imbalanced datasets in industrial settings, this paper presents a novel framework for Industrial Data-driven Modeling for Imbalanced Fault Diagnosis, aiming to improve fault detection accuracy and reliability.

Design/methodology/approach

This study addressing the challenge of imbalanced datasets in predicting hard drive failures is both innovative and comprehensive. By integrating data enhancement techniques with cost-sensitive methods, the research pioneers a solution that directly targets the intrinsic issues posed by imbalanced data, a common obstacle in predictive maintenance and reliability analysis.
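The abstract does not specify which cost-sensitive method is used. As a generic illustration only, the sketch below uses inverse-frequency class weights, a common heuristic in which each class receives weight n / (k × count), so that errors on a rare failure class cost more than errors on the healthy class. The function names and the toy labels are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}

def weighted_error(y_true, y_pred, weights):
    """Misclassification rate in which each sample counts by its class weight."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total

# Hypothetical drive-health labels: 95 healthy, 5 failed
labels = ["healthy"] * 95 + ["failed"] * 5
weights = balanced_class_weights(labels)

# A trivial predictor that always answers "healthy"
always_healthy = ["healthy"] * len(labels)
err = weighted_error(labels, always_healthy, weights)
```

Unweighted, the trivial predictor would be wrong on only 5% of samples; under these weights its error rises to one half, which shows how a cost-sensitive objective penalizes ignoring the rare class.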

Findings

In real industrial environments, there is a critical demand for addressing the issue of imbalanced datasets. When faced with limited data for rare events or a heavily skewed distribution of categories, it becomes essential for models to effectively mine insights from the original imbalanced dataset. This involves employing techniques like data augmentation to generate new insights and rules, enhancing the model’s ability to accurately identify and predict failures.

Originality/value

Previous research has highlighted the complexity of diagnosing faults within imbalanced industrial datasets, often leading to suboptimal predictive accuracy. This paper bridges this gap by introducing a robust framework for Industrial Data-driven Modeling for Imbalanced Fault Diagnosis. It combines data enhancement and cost-sensitive methods to effectively manage the challenges posed by imbalanced datasets, further innovating with a bagging method to refine model optimization. The validation of the proposed approach demonstrates superior accuracy compared to existing methods, showcasing its potential to significantly improve fault diagnosis in industrial applications.

Details

Industrial Management & Data Systems, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0263-5577

Book part
Publication date: 1 September 2021

Son Nguyen, Phyllis Schumacher, Alan Olinsky and John Quinn

Abstract

We study the performances of various predictive models including decision trees, random forests, neural networks, and linear discriminant analysis on an imbalanced data set of home loan applications. During the process, we propose our undersampling algorithm to cope with the issues created by the imbalance of the data. Our technique is shown to work competitively against popular resampling techniques such as random oversampling, undersampling, synthetic minority oversampling technique (SMOTE), and random oversampling examples (ROSE). We also investigate the relation between the true positive rate, true negative rate, and the imbalance of the data.

Article
Publication date: 4 December 2018

Zhongyi Hu, Raymond Chiong, Ilung Pranata, Yukun Bao and Yuqing Lin

Abstract

Purpose

Malicious web domain identification is of significant importance to the security protection of internet users. With online credibility and performance data, the purpose of this paper is to investigate the use of machine learning techniques for malicious web domain identification by considering the class imbalance issue (i.e. there are more benign web domains than malicious ones).

Design/methodology/approach

The authors propose an integrated resampling approach to handle class imbalance by combining the synthetic minority oversampling technique (SMOTE) and particle swarm optimisation (PSO), a population-based meta-heuristic algorithm. The authors use the SMOTE for oversampling and PSO for undersampling.

Findings

By applying eight well-known machine learning classifiers, the proposed integrated resampling approach is comprehensively examined using several imbalanced web domain data sets with different imbalance ratios. Compared to five other well-known resampling approaches, experimental results confirm that the proposed approach is highly effective.

Practical implications

This study not only inspires the practical use of online credibility and performance data for identifying malicious web domains but also provides an effective resampling approach for handling the class imbalance issue in the area of malicious web domain identification.

Originality/value

Online credibility and performance data are applied to build malicious web domain identification models using machine learning techniques. An integrated resampling approach is proposed to address the class imbalance issue. The performance of the proposed approach is confirmed based on real-world data sets with different imbalance ratios.

Article
Publication date: 22 October 2018

Sihem Khemakhem, Fatma Ben Said and Younes Boujelbene

Abstract

Purpose

Credit scoring datasets are generally unbalanced. The number of repaid loans is higher than that of defaulted ones. Therefore, the classification of these data is biased toward the majority class, which practically means that it tends to attribute a mistaken “good borrower” status even to “very risky borrowers”. In addition to the use of statistics and machine learning classifiers, this paper aims to explore the relevance and performance of sampling models combined with statistical prediction and artificial intelligence techniques to predict and quantify the default probability based on real-world credit data.

Design/methodology/approach

A real database from a Tunisian commercial bank was used and unbalanced data issues were addressed by the random over-sampling (ROS) and synthetic minority over-sampling technique (SMOTE). Performance was evaluated in terms of the confusion matrix and the receiver operating characteristic curve.

Findings

The results indicated that the combination of intelligent and statistical techniques with re-sampling approaches is promising for default rate management and provides accurate credit risk estimates.

Originality/value

This paper empirically investigates the effectiveness of ROS and SMOTE in combination with logistic regression, artificial neural networks and support vector machines. The authors address the role of sampling strategies in the Tunisian credit market and its impact on credit risk. These sampling strategies may help financial institutions to reduce the erroneous classification costs in comparison with the unbalanced original data and may serve as a means for improving the bank’s performance and competitiveness.

Details

Journal of Modelling in Management, vol. 13 no. 4
Type: Research Article
ISSN: 1746-5664
