Search results

1 – 10 of 66
Article
Publication date: 26 September 2023

Mohammed Ayoub Ledhem and Warda Moussaoui

This paper aims to apply several data mining techniques for predicting the daily precision improvement of Jakarta Islamic Index (JKII) prices based on big data of symmetric…

Abstract

Purpose

This paper aims to apply several data mining techniques for predicting the daily precision improvement of Jakarta Islamic Index (JKII) prices based on big data of symmetric volatility in Indonesia’s Islamic stock market.

Design/methodology/approach

This research uses big data mining techniques to predict the daily precision improvement of JKII prices by applying AdaBoost, K-nearest neighbor, random forest and artificial neural network models. It uses big data with symmetric volatility as inputs to the prediction model, while the closing prices of JKII serve as the target outputs of daily precision improvement. To choose the model with the optimal prediction performance, defined as the lowest prediction errors, this research uses four metrics: mean absolute error, mean squared error, root mean squared error and R-squared.
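
The abstract does not include code; the following is a minimal sketch, under stated assumptions, of how such a four-model comparison could be set up with scikit-learn. The synthetic stand-in data and the specific estimator settings are illustrative only, since the JKII volatility inputs are not reproduced here.

```python
# Hypothetical sketch: compare the four regressors named above on synthetic
# stand-in data (the real JKII symmetric-volatility inputs are not available here).
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))                                   # stand-in volatility features
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=1000)    # stand-in JKII closing price

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "AdaBoost": AdaBoostRegressor(random_state=0),
    "KNN": KNeighborsRegressor(),
    "RandomForest": RandomForestRegressor(random_state=0),
    "ANN": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    mse = mean_squared_error(y_te, pred)
    print(name,
          round(mean_absolute_error(y_te, pred), 4),   # MAE
          round(mse, 4),                               # MSE
          round(mse ** 0.5, 4),                        # RMSE
          round(r2_score(y_te, pred), 4))              # R-squared
```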

Findings

The experimental results determine that the optimal technique for predicting the daily precision improvement of the JKII prices in Indonesia’s Islamic stock market is the AdaBoost technique, which generates the optimal predicting performance with the lowest prediction errors, and provides the optimum knowledge from the big data of symmetric volatility in Indonesia’s Islamic stock market. In addition, the random forest technique is also considered another robust technique in predicting the daily precision improvement of the JKII prices as it delivers closer values to the optimal performance of the AdaBoost technique.

Practical implications

This research fills a gap in the literature, namely the absence of big data mining techniques in the prediction of Islamic stock markets, by delivering new operational techniques for predicting daily stock precision improvement. It also helps investors manage optimal portfolios and reduce trading risk in global Islamic stock markets through big data mining of symmetric volatility.

Originality/value

This research is a pioneer in using big data mining of symmetric volatility in the prediction of an Islamic stock market index.

Details

Journal of Modelling in Management, vol. 19 no. 3
Type: Research Article
ISSN: 1746-5664

Article
Publication date: 19 January 2024

Ping Huang, Haitao Ding, Hong Chen, Jianwei Zhang and Zhenjia Sun

The growing availability of naturalistic driving datasets (NDDs) presents a valuable opportunity to develop various models for autonomous driving. However, while current NDDs…

Abstract

Purpose

The growing availability of naturalistic driving datasets (NDDs) presents a valuable opportunity to develop various models for autonomous driving. However, while current NDDs include data on vehicles with and without intended driving behavior changes, they do not explicitly capture data on vehicles that intend to change their driving behavior but do not execute the behaviors because of safety, efficiency or other factors. Such missing data are essential for autonomous driving decisions. This study aims to extract the driving data with implicit intentions to support the development of decision-making models.

Design/methodology/approach

According to Bayesian inference, drivers who have the same intended changes are likely to share similar influencing factors and states. Building on this principle, this study proposes an approach to extract data on vehicles that intended to execute specific behaviors but failed to do so. This is achieved by computing driving similarities between candidate vehicles and benchmark vehicles using standard similarity metrics that take into account the surrounding vehicles' location topology and the individual vehicles' motion states. By doing so, the method enables a more comprehensive analysis of driving behavior and intention.
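
The abstract describes the similarity computation only at a high level; below is a minimal sketch of one plausible reading, cosine similarity over feature vectors that concatenate surrounding-vehicle topology with ego motion states. The feature layout, the example values and the threshold are illustrative assumptions, not the paper's actual metric.

```python
# Illustrative sketch only: score how similar a candidate vehicle's situation is
# to a benchmark vehicle that did execute the behaviour. Feature layout
# (relative neighbour positions + ego speed/acceleration) and the 0.98 threshold
# are assumptions for demonstration.
import numpy as np

def situation_vector(neighbour_offsets, ego_speed, ego_accel):
    """Concatenate surrounding-vehicle location topology with ego motion states."""
    return np.concatenate([np.asarray(neighbour_offsets, dtype=float).ravel(),
                           [ego_speed, ego_accel]])

def driving_similarity(candidate, benchmark):
    """Cosine similarity between two situation vectors (1.0 = identical)."""
    num = float(np.dot(candidate, benchmark))
    den = float(np.linalg.norm(candidate) * np.linalg.norm(benchmark)) or 1.0
    return num / den

# Benchmark: a vehicle that actually performed a lane change.
bench = situation_vector([(-8.0, 3.5), (12.0, 0.0)], ego_speed=27.0, ego_accel=0.4)
# Candidate: similar surroundings and state, but the lane change was not executed.
cand = situation_vector([(-7.5, 3.5), (11.0, 0.0)], ego_speed=26.5, ego_accel=0.3)

if driving_similarity(cand, bench) > 0.98:   # illustrative threshold
    print("candidate flagged as having an implicit (unexecuted) intention")
```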

Findings

The proposed method is verified on the Next Generation SIMulation (NGSIM) dataset, which confirms its ability to reveal similarities between vehicles executing similar behaviors during naturalistic decision-making. The approach is also validated using simulated data, achieving an accuracy of 96.3 per cent in recognizing vehicles with specific driving behavior intentions that are not executed.

Originality/value

This study provides an innovative approach to extract driving data with implicit intentions and offers strong support to develop data-driven decision-making models for autonomous driving. With the support of this approach, the development of autonomous vehicles can capture more real driving experience from human drivers moving towards a safer and more efficient future.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 11 July 2023

Abhinandan Chatterjee, Pradip Bala, Shruti Gedam, Sanchita Paul and Nishant Goyal

Depression is a mental health problem characterized by a persistent sense of sadness and loss of interest. EEG signals are regarded as the most appropriate instruments for…

Abstract

Purpose

Depression is a mental health problem characterized by a persistent sense of sadness and loss of interest. EEG signals are regarded as the most appropriate instruments for diagnosing depression because they reflect the operating status of the human brain. The purpose of this study is the early detection of depression among people using EEG signals.

Design/methodology/approach

(i) Artifacts are removed by filtering, and linear and non-linear features are extracted; (ii) feature scaling is done using a standard scaler, while principal component analysis (PCA) is used for feature reduction; (iii) the linear features, the non-linear features and the combination of both (only for those whose accuracy is highest) are taken for further analysis, where ML and DL classifiers are applied for the classification of depression; and (iv) a total of 15 distinct ML and DL methods, including KNN, SVM, bagging SVM, RF, GB, Extreme Gradient Boosting, MNB, AdaBoost, Bagging RF, BootAgg, Gaussian NB, RNN, 1DCNN, RBFNN and LSTM, are effectively utilized as classifiers.
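
As a rough illustration of steps (ii) and (iii), the sketch below chains standard scaling, PCA and one of the classifiers named above in scikit-learn. The stand-in feature matrix and the chosen component count are assumptions; the actual EEG features and the full set of 15 classifiers are not reproduced.

```python
# Minimal sketch of the scaling -> PCA -> classification stage described above,
# on synthetic stand-in features (real EEG features are not reproduced here).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 60))       # stand-in for linear + non-linear EEG features
y = rng.integers(0, 2, size=400)     # stand-in depression / control labels

clf = make_pipeline(
    StandardScaler(),                        # feature scaling with a standard scaler
    PCA(n_components=20),                    # feature reduction
    RandomForestClassifier(random_state=0),  # one of the 15 classifiers named above
)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```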

Findings

1. Among the linear features, alpha, alpha asymmetry, gamma and gamma asymmetry give the best results, while among the non-linear features, RWE, DFA, CD and AE give the best results. 2. For the linear features, gamma and alpha asymmetry give 99.98% accuracy with Bagging RF, while gamma asymmetry gives 99.98% accuracy with BootAgg. 3. For the non-linear features, accuracies of 99.84% are achieved for RWE and DFA with RF, 99.97% for DFA with XGBoost and 99.94% for RWE with BootAgg. 4. Using DL on the linear features, gamma asymmetry gives more than 96% accuracy with RNN and 91% accuracy with LSTM; for the non-linear features, 89% accuracy is achieved for CD and AE with LSTM. 5. By combining linear and non-linear features, the highest accuracy is achieved with Bagging RF (98.50%) for gamma asymmetry + RWE. In DL, alpha + RWE, gamma asymmetry + CD and gamma asymmetry + RWE achieve 98% accuracy with LSTM.

Originality/value

A novel dataset was collected from the Central Institute of Psychiatry (CIP), Ranchi, recorded using 128 channels, whereas major previous studies used fewer channels; the details of the study participants are summarized and a model is developed for statistical analysis using N-way ANOVA; artifacts are removed by high- and low-pass filtering of epoch data, followed by re-referencing and independent component analysis for noise removal; linear features, namely, band power and interhemispheric asymmetry, and non-linear features, namely, relative wavelet energy, wavelet entropy, approximate entropy, sample entropy, detrended fluctuation analysis and correlation dimension, are extracted; this model utilizes 213,072 epochs of 5 s EEG data, which allows the model to train for longer, thereby increasing the efficiency of the classifiers. Feature scaling is done using a standard scaler rather than normalization because it helps increase the accuracy of the models (especially for deep learning algorithms), while PCA is used for feature reduction; the linear features, the non-linear features and the combination of both are taken for extensive analysis in conjunction with ML and DL classifiers for the classification of depression. The combination of linear and non-linear features (only for those whose accuracy is highest) is used for the best detection results.

Details

Aslib Journal of Information Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 17 March 2023

Stewart Jones

This study updates the literature review of Jones (1987) published in this journal. The study pays particular attention to two important themes that have shaped the field over the…

Abstract

Purpose

This study updates the literature review of Jones (1987) published in this journal. The study pays particular attention to two important themes that have shaped the field over the past 35 years: (1) the development of a range of innovative new statistical learning methods, particularly advanced machine learning methods such as stochastic gradient boosting, adaptive boosting, random forests and deep learning, and (2) the emergence of a wide variety of bankruptcy predictor variables extending beyond traditional financial ratios, including market-based variables, earnings management proxies, auditor going concern opinions (GCOs) and corporate governance attributes. Several directions for future research are discussed.

Design/methodology/approach

This study provides a systematic review of the corporate failure literature over the past 35 years with a particular focus on the emergence of new statistical learning methodologies and predictor variables. This synthesis of the literature evaluates the strengths and limitations of different modelling approaches under different circumstances and provides an overall evaluation of the relative contribution of alternative predictor variables. The study aims to provide a transparent, reproducible and interpretable review of the literature. The literature review also takes a theme-centric rather than author-centric approach and focuses on structured themes that have dominated the literature since 1987.

Findings

There are several major findings of this study. First, advanced machine learning methods appear to have the most promise for future firm failure research. Not only do these methods predict significantly better than conventional models, but they also possess many appealing statistical properties. Second, there is now a much wider range of variables being used to model and predict firm failure. However, the literature needs to be interpreted with some caution given the many mixed findings. Finally, a number of unresolved methodological issues arising from the Jones (1987) study still require research attention.

Originality/value

The study explains the connections and derivations between a wide range of firm failure models, from simpler linear models to advanced machine learning methods such as gradient boosting, random forests, adaptive boosting and deep learning. The paper highlights the most promising models for future research, particularly in terms of their predictive power, underlying statistical properties and issues of practical implementation. The study also draws together an extensive literature on alternative predictor variables and provides insights into the role and behaviour of alternative predictor variables in firm failure research.

Details

Journal of Accounting Literature, vol. 45 no. 2
Type: Research Article
ISSN: 0737-4607

Article
Publication date: 28 September 2023

Moh. Riskiyadi

This study aims to compare machine learning models, datasets and training-testing splits using data mining methods to detect financial statement fraud.

Abstract

Purpose

This study aims to compare machine learning models, datasets and training-testing splits using data mining methods to detect financial statement fraud.

Design/methodology/approach

This study uses a quantitative approach based on secondary data from the financial reports of companies listed on the Indonesia Stock Exchange over the last ten years, from 2010 to 2019. The research variables comprise financial and non-financial variables. Indicators of financial statement fraud are determined based on notes or sanctions from regulators and financial statement restatements under special supervision.

Findings

The findings show that the Extremely Randomized Trees (ERT) model performs better than other machine learning models, that the original-sampling dataset performs best compared with other dataset treatments, and that an 80:10 training-testing split is the best compared with other splitting treatments. The ERT model with an original-sampling dataset and an 80:10 training-testing split is therefore the most appropriate for detecting future financial statement fraud.
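
For illustration only, the sketch below compares an Extremely Randomized Trees classifier across several training-testing splits on synthetic stand-in data; the split ratios, features and fraud labels are assumptions, not the study's dataset treatments.

```python
# Illustrative sketch: compare an Extremely Randomized Trees (ERT) classifier
# across several training-testing splits, on synthetic stand-in data.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 12))    # stand-in financial / non-financial variables
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 1).astype(int)  # stand-in fraud flag

for test_size in (0.1, 0.2, 0.3):  # e.g. 90:10, 80:20, 70:30 splits
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=test_size,
                                              stratify=y, random_state=0)
    ert = ExtraTreesClassifier(random_state=0).fit(X_tr, y_tr)
    print(f"test_size={test_size}: accuracy={accuracy_score(y_te, ert.predict(X_te)):.3f}")
```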

Practical implications

This study can be used by regulators, investors, stakeholders and financial crime experts to add insight into better methods of detecting financial statement fraud.

Originality/value

This study proposes a machine learning model that has not been discussed in previous studies and performs comparisons to obtain the best financial statement fraud detection results. Practitioners and academics can use findings for further research development.

Details

Asian Review of Accounting, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1321-7348

Article
Publication date: 23 September 2022

Hossein Sohrabi and Esmatullah Noorzai

The present study aims to develop a risk-supported case-based reasoning (RS-CBR) approach for water-related projects by incorporating various uncertainties and risks in the…

Abstract

Purpose

The present study aims to develop a risk-supported case-based reasoning (RS-CBR) approach for water-related projects by incorporating various uncertainties and risks in the revision step.

Design/methodology/approach

The cases were extracted by studying 68 water-related projects. This research employs earned value management (EVM) factors to capture time and cost features; economic, natural, technical and project risks to account for uncertainties; and supervised learning models to estimate cost overrun. Time-series algorithms were also used to predict construction cost indexes (CCI) and to model improvements in future forecasts. Outliers were removed during pre-processing. Next, the datasets were split into training and testing sets and the algorithms were implemented. The accuracy of the different models was measured with the mean absolute percentage error (MAPE) and the normalized root mean square error (NRMSE) criteria.
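
As a hedged illustration of the two accuracy criteria named above, the sketch below defines MAPE and a range-normalized RMSE and applies them to an SVR with a sigmoid kernel on synthetic stand-in data; the features, target and normalization choice are assumptions, not the paper's exact formulation.

```python
# Sketch of the MAPE and normalized RMSE criteria named above, applied to an
# SVR-with-sigmoid-kernel baseline on synthetic stand-in data.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

def mape(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0

def nrmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (y_true.max() - y_true.min())   # range-normalized; other normalizations exist

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))                                        # stand-in EVM and risk features
y = 10 + X @ rng.normal(size=6) + rng.normal(scale=0.5, size=300)    # stand-in cost overrun (%)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
pred = SVR(kernel="sigmoid").fit(X_tr, y_tr).predict(X_te)
print(f"MAPE={mape(y_te, pred):.2f}%  NRMSE={nrmse(y_te, pred):.3f}")
```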

Findings

The findings show an improvement in the accuracy of predictions using datasets that consider uncertainties, and ensemble algorithms such as Random Forest and AdaBoost had higher accuracy. Also, among the single algorithms, the support vector regressor (SVR) with the sigmoid kernel outperformed the others.

Originality/value

This research is the first attempt to develop a case-based reasoning model based on various risks and uncertainties. The developed model shows promising agreement with machine learning models in predicting cost overruns. The model has been implemented on the collected water-related projects and the results have been reported.

Details

Engineering, Construction and Architectural Management, vol. 31 no. 2
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 1 January 2024

Shrutika Sharma, Vishal Gupta, Deepa Mudgal and Vishal Srivastava

Three-dimensional (3D) printing is highly dependent on printing process parameters for achieving high mechanical strength. It is a time-consuming and expensive operation to…

Abstract

Purpose

Three-dimensional (3D) printing is highly dependent on printing process parameters for achieving high mechanical strength. It is a time-consuming and expensive operation to experiment with different printing settings. The current study aims to propose a regression-based machine learning model to predict the mechanical behavior of ulna bone plates.

Design/methodology/approach

The bone plates were formed using the fused deposition modeling (FDM) technique, with the printing attributes being varied. Machine learning models such as linear regression, AdaBoost regression, gradient boosting regression (GBR), random forest, decision trees and k-nearest neighbors were trained to predict tensile strength and flexural strength. Model performance was assessed using root mean square error (RMSE), coefficient of determination (R2) and mean absolute error (MAE).
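
A minimal sketch of how a gradient boosting regressor could be scored on the three metrics named above via cross-validation is shown below; the stand-in printing-parameter data and the synthetic strength target are assumptions, since the experimental dataset is not reproduced here.

```python
# Illustrative sketch: evaluate a gradient boosting regressor with RMSE, R2 and
# MAE via cross-validation on synthetic stand-in data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
X = rng.uniform(size=(120, 4))   # stand-in printing parameters (e.g. layer height, infill, speed, orientation)
y = 30 + 10 * X[:, 0] - 5 * X[:, 1] + rng.normal(scale=1.0, size=120)  # stand-in tensile strength (MPa)

scores = cross_validate(
    GradientBoostingRegressor(random_state=0), X, y, cv=5,
    scoring=("neg_root_mean_squared_error", "r2", "neg_mean_absolute_error"),
)
print("RMSE:", -scores["test_neg_root_mean_squared_error"].mean())
print("R2  :", scores["test_r2"].mean())
print("MAE :", -scores["test_neg_mean_absolute_error"].mean())
```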

Findings

Traditional experimentation with various settings is both time-consuming and expensive, emphasizing the need for alternative approaches. Among the models tested, the GBR model demonstrated the best performance in predicting both tensile and flexural strength, achieving the lowest RMSE, highest R2 and lowest MAE: 1.4778 ± 0.4336 MPa, 0.9213 ± 0.0589 and 1.2555 ± 0.3799 MPa for tensile strength, and 3.0337 ± 0.3725 MPa, 0.9269 ± 0.0293 and 2.3815 ± 0.2915 MPa for flexural strength, respectively. The findings open up opportunities for doctors and surgeons to use GBR as a reliable tool for fabricating patient-specific bone plates, without the need for extensive trial experiments.

Research limitations/implications

The current study is limited to the usage of a few models. Other machine learning-based models can be used for prediction-based study.

Originality/value

This study uses machine learning to predict the mechanical properties of FDM-based distal ulna bone plate, replacing traditional design of experiments methods with machine learning to streamline the production of orthopedic implants. It helps medical professionals, such as physicians and surgeons, make informed decisions when fabricating customized bone plates for their patients while reducing the need for time-consuming experimentation, thereby addressing a common limitation of 3D printing medical implants.

Details

Rapid Prototyping Journal, vol. 30 no. 3
Type: Research Article
ISSN: 1355-2546

Open Access
Article
Publication date: 22 June 2023

Ignacio Manuel Luque Raya and Pablo Luque Raya

Having defined liquidity, the aim is to assess the predictive capacity of its representative variables, so that economic fluctuations may be better understood.

Abstract

Purpose

Having defined liquidity, the aim is to assess the predictive capacity of its representative variables, so that economic fluctuations may be better understood.

Design/methodology/approach

Conceptual variables that are representative of liquidity will be used to formulate the predictions. The results of various machine learning models will be compared, leading to some reflections on the predictive value of the liquidity variables, with a view to defining their selection.

Findings

The predictive capacity of the model was also found to vary depending on the source of the liquidity, in so far as the data on liquidity within the private sector contributed more than the data on public sector liquidity to the prediction of economic fluctuations. International liquidity was seen as a more diffuse concept, and the standardization of its definition could be the focus of future studies. A benchmarking process was also performed when applying the state-of-the-art machine learning models.

Originality/value

Better understanding of these variables might help us toward a deeper understanding of the operation of financial markets. Liquidity, one of the key financial market variables, is neither well-defined nor standardized in the existing literature, which calls for further study. Hence, the novelty of an applied study employing modern data science techniques can provide a fresh perspective on financial markets.

Liquidity, whether in financial markets or in the real economy, is one of the clearest predictors of market trends.

It is therefore an extremely important concept for understanding economic cycles and economic development. This study seeks to make progress in predicting the prices of safe assets, which reflect the real state of the economy, in particular the ten-year US Treasury bond.

Purpose

Having defined liquidity above, this study assesses the predictive capacity of its representative variables, so that economic fluctuations may be better understood.

Design/methodology/approach

Conceptual variables representative of liquidity are used to formulate the predictions. The results of various machine learning models are compared, leading to reflections on the predictive value of the liquidity variables, with a view to defining their selection.

Findings

The predictive capacity of the model was found to vary depending on the source of the liquidity, insofar as the data on private-sector liquidity contributed more than the data on public-sector liquidity to the prediction of economic fluctuations. International liquidity was seen as a more diffuse concept, and the standardization of its definition could be the focus of future studies. A benchmarking process was also performed when applying the state-of-the-art machine learning models.

Originality/value

A better understanding of these variables might lead to a deeper understanding of the operation of financial markets. Liquidity, one of the key financial market variables, is neither well defined nor standardized in the existing literature, which calls for further study. Hence, this novel applied study employing modern data science techniques can provide a fresh perspective on financial markets.

Details

European Journal of Management and Business Economics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2444-8451

Article
Publication date: 10 March 2022

Jayaram Boga and Dhilip Kumar V.

For achieving the profitable human activity recognition (HAR) method, this paper solves the HAR problem under wireless body area network (WBAN) using a developed ensemble learning…

Abstract

Purpose

The purpose of this study is to solve the human activity recognition (HAR) problem under a wireless body area network (WBAN) using a developed ensemble learning approach, with the aim of achieving a profitable HAR method. Three data sets are used for HAR in WBAN, namely, human activity recognition using smartphones, wireless sensor data mining and Kaggle. The proposed model undergoes four phases, namely, pre-processing, feature extraction, feature selection and classification. The data are pre-processed by artifact removal and median filtering techniques. The features are then extracted by techniques such as t-distributed stochastic neighbor embedding, the short-time Fourier transform and statistical approaches. Weighted optimal feature selection is the next step, selecting the important features based on the data variance of each class; this new feature selection is achieved by the hybrid coyote Jaya optimization (HCJO) algorithm. Finally, a meta-heuristic-based ensemble learning approach is used as a new recognition approach with three classifiers, namely, support vector machine (SVM), deep neural network (DNN) and fuzzy classifiers. Experimental analysis is performed.
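
The sketch below mirrors only the general pipeline shape described above (median filtering, simple features, an ensemble of classifiers) on synthetic stand-in signals. The HCJO optimizer and the fuzzy classifier are paper-specific and are not reproduced; a soft-voting ensemble of an SVM and a small neural network stands in here as an assumption.

```python
# Minimal sketch of the pipeline shape described above, on stand-in WBAN data:
# median filtering -> simple statistical features -> ensemble classification.
import numpy as np
from scipy.signal import medfilt
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
raw = rng.normal(size=(300, 128))        # stand-in WBAN sensor windows
labels = rng.integers(0, 3, size=300)    # stand-in activity labels

filtered = np.array([medfilt(w, kernel_size=5) for w in raw])   # median filtering
features = np.column_stack([filtered.mean(axis=1),              # simple statistical features
                            filtered.std(axis=1),
                            filtered.min(axis=1),
                            filtered.max(axis=1)])

ensemble = VotingClassifier(
    estimators=[("svm", SVC(probability=True, random_state=0)),
                ("dnn", MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0))],
    voting="soft",
)
print("CV accuracy:", cross_val_score(ensemble, features, labels, cv=5).mean())
```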

Design/methodology/approach

The proposed HCJO algorithm was developed to optimize the fuzzy membership function, the iteration limit of the SVM and the hidden neuron count of the DNN, in order to obtain superior classification outcomes and to enhance the performance of the ensemble classification.

Findings

The accuracy of the enhanced HAR model was considerably higher than that of conventional models: 6.66% higher than fuzzy, 4.34% higher than DNN, 4.34% higher than SVM, 7.86% higher than the ensemble and 6.66% higher than the Improved Sea Lion optimization algorithm-based Attention Pyramid Convolutional Neural Network (AP-CNN).

Originality/value

The suggested HAR model for WBAN using the HCJO algorithm is accurate and improves the effectiveness of recognition.

Details

International Journal of Pervasive Computing and Communications, vol. 19 no. 4
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 3 April 2024

Rizwan Ali, Jin Xu, Mushahid Hussain Baig, Hafiz Saif Ur Rehman, Muhammad Waqas Aslam and Kaleem Ullah Qasim

This study aims to endeavour to decode artificial intelligence (AI)-based tokens' complex dynamics and predictability using a comprehensive multivariate framework that integrates…

Abstract

Purpose

This study aims to decode the complex dynamics and predictability of artificial intelligence (AI)-based tokens using a comprehensive multivariate framework that integrates technical and macroeconomic indicators.

Design/methodology/approach

This study uses advanced machine learning techniques, such as gradient boosting regression (GBR), random forest (RF) and, notably, long short-term memory (LSTM) networks, to provide a nuanced understanding of the factors driving the performance of AI tokens. The study's comparative analysis highlights the superior predictive capabilities of LSTM models, as evidenced by their performance across various AI digital tokens such as AGIX (SingularityNET), Cortex and Numeraire (NMR).
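
A minimal sketch of an LSTM forecaster of the kind compared above is shown below, assuming TensorFlow/Keras is available and using a synthetic univariate price path; the real multivariate technical and macroeconomic inputs, the GBR/RF baselines and the SHAP analysis are not reproduced.

```python
# Minimal sketch of an LSTM price forecaster on a synthetic stand-in series
# (the real AI-token data and baselines are not reproduced here).
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=500)).astype("float32")   # stand-in token price path

def make_windows(s, lookback=20):
    """Turn a 1-D series into (samples, timesteps, features) windows and next-step targets."""
    X = np.stack([s[i:i + lookback] for i in range(len(s) - lookback)])
    y = s[lookback:]
    return X[..., None], y

X, y = make_windows(series)
split = int(0.8 * len(X))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20, 1)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], y[:split], epochs=5, verbose=0)
print("test MSE:", model.evaluate(X[split:], y[split:], verbose=0))
```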

Findings

Through an intricate exploration of feature importance and the impact of speculative behaviour, the findings elucidate the long-term patterns and resilience of AI-based tokens against economic shifts. The SHapley Additive exPlanations (SHAP) analysis shows that technical factors and certain macroeconomic factors play a dominant role in price prediction. The study also examines the potential of these models for strategic investment and hedging, underscoring their relevance in an increasingly digital economy.

Originality/value

To the best of our knowledge, there is an apparent absence of research frameworks for forecasting and modelling current leading AI tokens. Because the relationship between the AI token market and other factors remains under-studied, forecasting is exceptionally demanding. This study provides a robust predictive framework to accurately identify the changing trends of AI tokens within a multivariate context and fills the gaps in existing research. With the help of modern AI algorithms and careful model interpretation, detailed predictive analytics can elaborate on the behaviour patterns of developing decentralised digital AI-based token prices.

Details

Journal of Economic Studies, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0144-3585
