Search results

1 – 10 of over 3000
Article
Publication date: 26 August 2014

Mourad Mroua and Fathi Abid

Since equity markets have a dynamic nature, the purpose of this paper is to investigate the performance of a revision procedure for domestic and international portfolios, and…


Abstract

Purpose

Since equity markets have a dynamic nature, the purpose of this paper is to investigate the performance of a revision procedure for domestic and international portfolios, and to provide an empirical selection strategy for optimal diversification from an American investor's point of view. This paper considers the impact of estimation errors on the optimization processes in financial portfolios.

Design/methodology/approach

This paper introduces the concept of portfolio resampling using the Monte Carlo method. A statistical inference methodology is applied to construct the sample acceptance regions and confidence regions for the resampled portfolios needing revision. The tracking error variance minimization (TEVM) problem is used to define the tracking error efficient frontiers (TEEF), following Roll (1992). This paper employs a method to compute the periodic after-revision return performance of the dynamic diversification strategies, taking transaction costs into account.
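The resampling idea can be sketched with a tiny bootstrap: repeatedly redrawing a return series with replacement and re-estimating its mean shows how sampling noise propagates into the optimiser's inputs. This is an illustrative sketch only, not the authors' procedure; the return numbers and function name are invented.

```python
import random
import statistics

random.seed(42)

# Hypothetical monthly returns for one asset (invented numbers).
returns_a = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04, 0.00, 0.02]

def resample_mean(series, n_draws=1000):
    """Bootstrap the sample mean: redraw with replacement, re-estimate."""
    means = []
    for _ in range(n_draws):
        sample = random.choices(series, k=len(series))
        means.append(statistics.mean(sample))
    return means

boot_a = resample_mean(returns_a)
# The spread of the resampled means quantifies the estimation error that
# feeds into any mean-variance optimisation of the portfolio weights.
print(round(statistics.mean(boot_a), 4), round(statistics.stdev(boot_a), 4))
```

A portfolio built on any single resampled estimate could look very different from one built on another, which is why acceptance regions over the resampled set are needed before triggering a revision.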

Findings

The main finding is that global portfolio diversification benefits exist for domestic investors in both the mean-variance and tracking error analyses. Through the TEEF, the dynamic analysis indicates that the domestic dynamic diversification strategy outperforms the international major and emerging diversification strategies. Portfolio revision appears to be of no systematic benefit: depending on the revision of the asset weights and the transaction costs, the revision policy can negatively affect the performance of an investment strategy. When the transaction costs of portfolio revision are considered, the return performance results suggest that the global and international emerging markets diversification strategies dominate all others. Finally, an assessment of the trade-off between the return and the cost of the portfolio revision strategy is necessary.

Originality/value

The innovation of this paper is to introduce a new concept of dynamic portfolio management that takes transaction costs into account. This paper investigates the performance of a revision procedure for domestic and international portfolios and provides an empirical selection strategy for optimal diversification. The originality of the idea lies in applying a new statistical inference methodology to identify portfolios needing revision and in using the TEVM algorithm to define the dynamic tracking error efficient frontiers.

Details

International Journal of Managerial Finance, vol. 10 no. 4
Type: Research Article
ISSN: 1743-9132


Article
Publication date: 23 June 2022

Kerim Koc, Ömer Ekmekcioğlu and Asli Pelin Gurgun

Central to the entire discipline of construction safety management is the concept of construction accidents. Although distinctive progress has been made in safety management…

Abstract

Purpose

Central to the entire discipline of construction safety management is the concept of construction accidents. Although distinctive progress has been made in safety management applications over the last decades, the construction industry still accounts for a considerable percentage of all workplace fatalities across the world. This study aims to predict occupational accident outcomes based on national data using machine learning (ML) methods coupled with several resampling strategies.

Design/methodology/approach

An occupational accident dataset recorded in Turkey was collected. To deal with the class imbalance between the numbers of nonfatal and fatal accidents, the dataset was pre-processed with random under-sampling (RUS), random over-sampling (ROS) and the synthetic minority over-sampling technique (SMOTE). In addition, random forest (RF), Naïve Bayes (NB), k-nearest neighbor (KNN) and artificial neural networks (ANNs) were employed as ML methods to predict accident outcomes.
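As a minimal illustration of one of these strategies, random under-sampling keeps every minority (fatal) record and discards majority records at random until the classes match. The toy records below are invented, not the Turkish dataset.

```python
import random

random.seed(0)

# Toy accident records (id, outcome): 1 = fatal (minority), 0 = nonfatal.
records = [(i, 0) for i in range(95)] + [(i, 1) for i in range(95, 100)]

def random_under_sample(data):
    """Keep every minority record; randomly drop majority records to match."""
    minority = [r for r in data if r[1] == 1]
    majority = [r for r in data if r[1] == 0]
    return minority + random.sample(majority, len(minority))

balanced = random_under_sample(records)
print(len(balanced))  # 10 records: 5 fatal + 5 nonfatal
```

The cost of RUS is discarding majority-class information, which is why the study also compares it against oversampling alternatives.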

Findings

The results highlighted that the RF outperformed other methods when the dataset was preprocessed with RUS. The permutation importance results obtained through the RF exhibited that the number of past accidents in the company, worker's age, material used, number of workers in the company, accident year, and time of the accident were the most significant attributes.

Practical implications

The proposed framework can be used on construction sites on a monthly basis to detect workers who have a high probability of experiencing fatal accidents, which can be a valuable decision-making input for safety professionals seeking to reduce the number of fatal accidents.

Social implications

Practitioners and occupational health and safety (OHS) departments of construction firms can focus on the most important attributes identified by analysis results to enhance the workers' quality of life and well-being.

Originality/value

The literature on accident outcome predictions is limited in terms of dealing with imbalanced dataset through integrated resampling techniques and ML methods in the construction safety domain. A novel utilization plan was proposed and enhanced by the analysis results.

Details

Engineering, Construction and Architectural Management, vol. 30 no. 9
Type: Research Article
ISSN: 0969-9988


Article
Publication date: 4 December 2018

Zhongyi Hu, Raymond Chiong, Ilung Pranata, Yukun Bao and Yuqing Lin

Malicious web domain identification is of significant importance to the security protection of internet users. With online credibility and performance data, the purpose of this…

Abstract

Purpose

Malicious web domain identification is of significant importance to the security protection of internet users. With online credibility and performance data, the purpose of this paper is to investigate the use of machine learning techniques for malicious web domain identification while considering the class imbalance issue (i.e. there are more benign web domains than malicious ones).

Design/methodology/approach

The authors propose an integrated resampling approach that handles class imbalance by combining the synthetic minority oversampling technique (SMOTE) with particle swarm optimisation (PSO), a population-based meta-heuristic algorithm. SMOTE is used for oversampling and PSO for undersampling.
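A SMOTE-style oversampler can be sketched in a few lines: each synthetic point is a random interpolation between one minority sample and another. This is a simplified stand-in for illustration (no k-nearest-neighbour step and no PSO undersampling, unlike the authors' integrated approach); the feature values are invented.

```python
import random

random.seed(1)

# Toy minority-class feature vectors (invented malicious-domain features).
minority = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]

def smote_like(points, n_new):
    """Each synthetic point interpolates between two minority samples."""
    synthetic = []
    for _ in range(n_new):
        a = random.choice(points)
        b = random.choice([p for p in points if p is not a])
        gap = random.random()  # position along the segment from a to b
        synthetic.append([a[i] + gap * (b[i] - a[i]) for i in range(len(a))])
    return synthetic

new_points = smote_like(minority, 4)
print(len(minority) + len(new_points))  # 7 minority samples after oversampling
```

Because every synthetic point is a convex combination of real minority samples, the oversampled class stays inside the region the minority data already occupies.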

Findings

By applying eight well-known machine learning classifiers, the proposed integrated resampling approach is comprehensively examined using several imbalanced web domain data sets with different imbalance ratios. Compared to five other well-known resampling approaches, experimental results confirm that the proposed approach is highly effective.

Practical implications

This study not only inspires the practical use of online credibility and performance data for identifying malicious web domains but also provides an effective resampling approach for handling the class imbalance issue in the area of malicious web domain identification.

Originality/value

Online credibility and performance data are applied to build malicious web domain identification models using machine learning techniques. An integrated resampling approach is proposed to address the class imbalance issue. The performance of the proposed approach is confirmed based on real-world data sets with different imbalance ratios.

Article
Publication date: 8 November 2018

Amos H.C. Ng, Florian Siegmund and Kalyanmoy Deb

Stochastic simulation is a popular tool among practitioners and researchers alike for quantitative analysis of systems. Recent advancement in research on formulating production…

Abstract

Purpose

Stochastic simulation is a popular tool among practitioners and researchers alike for the quantitative analysis of systems. Recent advances in formulating production systems improvement problems as multi-objective optimizations have made it possible to predict the optimal trade-offs between improvement costs and system performance before making the final implementation decision. However, the fact that stochastic simulations rely on running a large number of replications to cope with randomness and obtain accurate statistical estimates of the system outputs poses a serious issue for using this kind of multi-objective optimization in practice, especially with complex models. Therefore, the purpose of this study is to investigate the performance enhancements of a reference point based evolutionary multi-objective optimization algorithm in practical production systems improvement problems, when combined with various dynamic resampling mechanisms.

Design/methodology/approach

Many algorithms consider the preferences of decision makers in order to converge to optimal trade-off solutions faster. Advanced dynamic resampling procedures also exist to avoid wasting a multitude of simulation replications on non-optimal solutions. However, very few attempts have been made to study the advantages of combining these two approaches to further enhance the performance of computationally expensive optimizations for complex production systems. Therefore, this paper combines preference-based guided search with dynamic resampling mechanisms in an evolutionary multi-objective optimization algorithm to lower both the computational cost of resampling and the total number of simulation evaluations.

Findings

This paper shows the performance enhancements of the reference-point based algorithm, R-NSGA-II, when augmented with three dynamic resampling mechanisms of increasing statistical sophistication, namely time-based, distance-rank and optimal computing budget allocation, applied to two real-world production system improvement studies. The results show that the more stochastic the simulation models, the more the statistically advanced dynamic resampling mechanisms enhance the performance of the optimization process.
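The least sophisticated of these mechanisms, time-based dynamic resampling, can be sketched as a schedule that spends few replications per solution early in the run and many near the end, when accurate estimates matter most. The function and budget parameters below are hypothetical, not the paper's implementation.

```python
def time_based_budget(progress, b_min=1, b_max=20):
    """Replication budget that grows with optimisation progress in [0, 1]:
    few replications early (rough estimates suffice), many near the end."""
    return b_min + round((b_max - b_min) * progress)

# Budget schedule across five checkpoints of the run.
print([time_based_budget(p / 4) for p in range(5)])
```

Distance-rank and optimal computing budget allocation refine this idea by also conditioning the budget on how promising each individual solution looks, rather than on elapsed time alone.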

Originality/value

Contributions of this paper include combining decision makers' preferences with dynamic resampling procedures, evaluating performance on two real-world production system improvement studies and illustrating that statistically advanced dynamic resampling mechanisms are needed for noisy models.

Details

Journal of Systems and Information Technology, vol. 20 no. 4
Type: Research Article
ISSN: 1328-7265


Book part
Publication date: 6 September 2019

Son Nguyen, Gao Niu, John Quinn, Alan Olinsky, Jonathan Ormsbee, Richard M. Smith and James Bishop

In recent years, the problem of classification with imbalanced data has been growing in popularity in the data-mining and machine-learning communities due to the emergence of an…

Abstract

In recent years, the problem of classification with imbalanced data has been growing in popularity in the data-mining and machine-learning communities due to the emergence of an abundance of imbalanced data in many fields. In this chapter, we compare the performance of six classification methods on an imbalanced dataset under the influence of four resampling techniques. These classification methods are the random forest, the support vector machine, logistic regression, k-nearest neighbor (KNN), the decision tree and AdaBoost. Our study shows that all of the classification methods have difficulty with the imbalanced data, with KNN performing the worst, detecting only 27.4% of the minority class. However, with the help of resampling techniques, all of the classification methods improve their overall performance. In particular, the random forest, in combination with the random over-sampling technique, performs the best, achieving 82.8% balanced accuracy (the average of the true-positive rate and true-negative rate).

We then propose a new resampling procedure based on the idea of eliminating "easy" majority observations before under-sampling them. It further improves the balanced accuracy of the random forest to 83.7%, making it the best approach for the imbalanced data.
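The balanced-accuracy metric used in this chapter is easy to state in code, and a toy case shows why it is preferred on imbalanced data: a classifier that predicts only the majority class scores 0.5 rather than a misleading 0.9. The example below is a generic sketch of the definition, not the chapter's experiment.

```python
def balanced_accuracy(y_true, y_pred):
    """Average of the true-positive rate and the true-negative rate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

# Predicting the majority class everywhere looks 90% "accurate" here,
# but balanced accuracy exposes it as no better than chance.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
print(balanced_accuracy(y_true, y_pred))  # 0.5
```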

Details

Advances in Business and Management Forecasting
Type: Book
ISBN: 978-1-78754-290-7


Open Access
Article
Publication date: 6 October 2023

Xiaomei Jiang, Shuo Wang, Wenjian Liu and Yun Yang

Traditional Chinese medicine (TCM) prescriptions have always relied on the experience of TCM doctors, and machine learning (ML) provides a technical means for learning these…

Abstract

Purpose

Traditional Chinese medicine (TCM) prescriptions have always relied on the experience of TCM doctors, and machine learning (ML) provides a technical means for learning these experiences and intelligently assisting in prescribing. However, a TCM prescription combines a main (Jun) herb with auxiliary (Chen, Zuo and Shi) herbs. In a prescription, there are often more types of auxiliary herbs than main herbs, and the auxiliary herbs often appear in other prescriptions as well. This leads to different frequencies of different herbs across prescriptions, namely imbalanced labels (herbs). As a result, existing ML algorithms are biased: the less frequent main herbs are difficult to predict, and performance suffers. To address this problem, this paper proposes a framework for multi-label traditional Chinese medicine (ML-TCM) based on multi-label resampling.

Design/methodology/approach

In this work, a multi-label learning framework is proposed that adopts and compares three multi-label oversampling techniques to rebalance the TCM data: multi-label random resampling (MLROS), multi-label synthesized resampling (MLSMOTE) and multi-label synthesized resampling based on local label imbalance (MLSOL).

Findings

The experimental results show that after resampling, the less frequent but important herbs can be predicted more accurately. The MLSOL method proves to be the best, with over 10% improvement on average, because it balances the data by considering both features and labels when resampling.

Originality/value

The authors are the first to systematically analyze the label imbalance problem of different sampling methods in the field of TCM and to provide a solution. Through analysis of the experimental results, the authors demonstrate the feasibility of the method, which can improve performance by 10%-30% compared with state-of-the-art methods.

Details

Journal of Electronic Business & Digital Economics, vol. 2 no. 2
Type: Research Article
ISSN: 2754-4214


Book part
Publication date: 1 September 2021

Son Nguyen, Phyllis Schumacher, Alan Olinsky and John Quinn

We study the performances of various predictive models including decision trees, random forests, neural networks, and linear discriminant analysis on an imbalanced data set of…

Abstract

We study the performance of various predictive models, including decision trees, random forests, neural networks and linear discriminant analysis, on an imbalanced data set of home loan applications. During the process, we propose our own undersampling algorithm to cope with the issues created by the imbalance of the data. Our technique is shown to work competitively against popular resampling techniques such as random oversampling, undersampling, the synthetic minority oversampling technique (SMOTE) and random oversampling examples (ROSE). We also investigate the relation between the true-positive rate, the true-negative rate and the imbalance of the data.

Article
Publication date: 29 November 2021

Ziming Zeng, Tingting Li, Shouqiang Sun, Jingjing Sun and Jie Yin

Twitter fake accounts refer to bot accounts created by third-party organizations to influence public opinion, commercial propaganda or impersonate others. The effective…

Abstract

Purpose

Twitter fake accounts refer to bot accounts created by third-party organizations to influence public opinion, conduct commercial propaganda or impersonate others. Effective identification of bot accounts helps the public accurately judge the disseminated information. However, in actual fake account identification, it is expensive and inefficient to manually label Twitter accounts, and the labeled data are usually imbalanced across classes. To this end, the authors propose a novel framework to solve these problems.

Design/methodology/approach

In the proposed framework, the authors introduce the concept of semi-supervised self-training learning and apply it to a real Twitter account dataset from Kaggle. Specifically, the authors first train the classifier on an initial small amount of labeled account data, then use the trained classifier to automatically label large-scale unlabeled account data. Next, high-confidence instances are iteratively selected from the unlabeled data to expand the labeled data. Finally, an expanded Twitter account training set is obtained. Notably, the resampling technique is integrated into the self-training process, and the data classes are balanced at the initial stage of the self-training iteration.
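The self-training loop described above can be sketched with a deliberately trivial one-dimensional threshold classifier. The data, threshold rule and confidence margin are invented for illustration, and the sketch omits the resampling step of the authors' framework.

```python
# Labeled points are (feature, label); the "classifier" is a midpoint
# threshold between the two class means. All values are invented.
labeled = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1)]
unlabeled = [0.05, 0.15, 0.5, 0.85, 0.95]

def fit_threshold(data):
    """Midpoint between the class means of the labeled data."""
    m0 = sum(x for x, y in data if y == 0) / sum(1 for _, y in data if y == 0)
    m1 = sum(x for x, y in data if y == 1) / sum(1 for _, y in data if y == 1)
    return (m0 + m1) / 2

for _ in range(3):  # a few self-training rounds
    thr = fit_threshold(labeled)
    # "High confidence" here means far from the decision threshold.
    confident = [x for x in unlabeled if abs(x - thr) > 0.25]
    labeled += [(x, int(x > thr)) for x in confident]
    unlabeled = [x for x in unlabeled if x not in confident]

print(len(labeled), unlabeled)  # the ambiguous point 0.5 stays unlabeled
```

Each round re-fits the classifier on the expanded labeled set, so confident pseudo-labels accumulate while genuinely ambiguous points are never forced into either class.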

Findings

The proposed framework effectively improves labeling efficiency and reduces the influence of class imbalance. It shows excellent identification results on six different base classifiers, especially with the initial small-scale labeled Twitter accounts.

Originality/value

This paper provides novel insights in identifying Twitter fake accounts. First, the authors take the lead in introducing a self-training method to automatically label Twitter accounts from the semi-supervised background. Second, the resampling technique is integrated into the self-training process to effectively reduce the influence of class imbalance on the identification effect.

Details

Data Technologies and Applications, vol. 56 no. 3
Type: Research Article
ISSN: 2514-9288


Article
Publication date: 10 March 2022

Aziz Kaba and Ahmet Ermeydan

The purpose of this paper is to present an improved particle filter-based attitude estimator for a quadrotor unmanned aerial vehicle (UAV) that addresses the degeneracy issues.

Abstract

Purpose

The purpose of this paper is to present an improved particle filter-based attitude estimator for a quadrotor unmanned aerial vehicle (UAV) that addresses the degeneracy issues.

Design/methodology/approach

Control of a quadrotor is not sufficient without an estimator to eliminate the noise from low-cost sensors. In this work, a particle filter-based attitude estimator is proposed and used for nonlinear quadrotor dynamics. However, since recursive Bayesian estimation steps may raise degeneracy issues, the proposed scheme is improved with four different, widely used resampling algorithms.
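The abstract does not name the four resampling algorithms, but one widely used scheme, systematic resampling, can be sketched as follows. This is a generic textbook version for illustration, not necessarily one of the four the authors test.

```python
import random

random.seed(3)

def systematic_resample(particles, weights):
    """Systematic resampling: one uniform draw, then evenly spaced picks.
    Heavy particles are duplicated and negligible ones dropped, which
    counters the degeneracy of the particle set."""
    n = len(particles)
    u0 = random.random() / n
    positions = [u0 + i / n for i in range(n)]
    cumulative, total = [], 0.0
    for w in weights:
        total += w
        cumulative.append(total)
    resampled, j = [], 0
    for u in positions:
        while j < n - 1 and cumulative[j] < u:
            j += 1
        resampled.append(particles[j])
    return resampled

particles = ["p0", "p1", "p2", "p3"]
weights = [0.7, 0.1, 0.1, 0.1]  # normalised importance weights
print(systematic_resample(particles, weights))
```

With a single uniform draw and evenly spaced positions, a particle carrying weight w is guaranteed to appear either floor(n*w) or ceil(n*w) times, giving lower variance than multinomial resampling.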

Findings

The robustness of the proposed schemes is tested under various scenarios that include different levels of uncertainty and different particle sizes. Statistical analyses are conducted to assess the error performance of the schemes. According to the statistical analysis, the proposed estimators are capable of reducing sensor noise by up to 5x, increasing the signal-to-noise ratio by up to 2.5x and reducing the uncertainty bounds by up to 36x, with a root mean square error as low as 0.0024 and a mean absolute error as low as 0.036.

Originality/value

To the best of the authors’ knowledge, the originality of this paper is to propose a robust particle filter-based attitude estimator to eliminate the low-cost sensor errors of quadrotor UAVs.

Details

Aircraft Engineering and Aerospace Technology, vol. 94 no. 7
Type: Research Article
ISSN: 1748-8842


Article
Publication date: 9 January 2024

Kaizheng Zhang, Jian Di, Jiulong Wang, Xinghu Wang and Haibo Ji

Many existing trajectory optimization algorithms use parameters like maximum velocity or acceleration to formulate constraints. Because the quadrotor's actual…

Abstract

Purpose

Many existing trajectory optimization algorithms use parameters like maximum velocity or acceleration to formulate constraints. Because the quadrotor's actual tracking capability is ignored, the generated trajectories may not be suitable for tracking control. The purpose of this paper is to design an online adjustment algorithm to improve the overall quadrotor trajectory tracking performance.

Design/methodology/approach

The authors propose a reference trajectory resampling layer (RTRL) to dynamically adjust the reference signals according to the current tracking status and future tracking risks. First, the authors design a risk-aware tracking monitor that uses the Frenet tracking errors and the curvature and torsion of the reference trajectory to evaluate tracking risks. Then, the authors propose an online adjustment algorithm based on the time scaling method.
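Time scaling can be illustrated in miniature: re-parameterising a reference trajectory r(t) as r(t/k) with k > 1 slows it down and shrinks the commanded velocity, giving the tracker a feasible target. The function and scaling factor below are a hypothetical sketch, not the authors' algorithm.

```python
def scaled_reference(r, k):
    """Return a re-timed copy of trajectory r; k > 1 slows it down
    (k is a hypothetical scaling factor chosen by a tracking monitor)."""
    return lambda t: r(t / k)

r = lambda t: 2.0 * t            # straight-line reference, velocity 2.0
slow = scaled_reference(r, 2.0)  # commanded velocity halves to 1.0
print(slow(1.0))  # 1.0 (at t = 1 the slowed reference is at r(0.5))
```

Because the slowed reference traverses the same path, only the timing changes: position constraints are preserved while velocity and acceleration demands drop by factors of k and k squared, respectively.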

Findings

Both simulation and experimental results show that the proposed RTRL is effective in improving quadrotor trajectory tracking accuracy.

Originality/value

Infeasible reference trajectories may cause serious accidents for autonomous quadrotors. The results of this paper can improve the safety of autonomous quadrotors in application.

Details

Robotic Intelligence and Automation, vol. 44 no. 1
Type: Research Article
ISSN: 2754-6969

