Search results

1 – 10 of 85

Article

Publication date: 13 September 2024

SigBERT: vibration-based steel frame structural damage detection through fine-tuning BERT

Ahmad Honarjoo, Ehsan Darvishan, Hassan Rezazadeh and Amir Homayoon Kosarieh

Abstract

Purpose

This article introduces SigBERT, a novel approach that fine-tunes bidirectional encoder representations from transformers (BERT) to distinguish between intact and impaired structures by analyzing vibration signals. Structural health monitoring (SHM) systems are crucial for identifying and locating damage in civil engineering structures. The proposed method aims to improve upon existing methods in terms of cost-effectiveness, accuracy and operational reliability.

Design/methodology/approach

SigBERT employs a fine-tuning process on the BERT model, leveraging its capabilities to effectively analyze time-series data from vibration signals to detect structural damage. This study compares SigBERT's performance with baseline models to demonstrate its superior accuracy and efficiency.

Findings

The experimental results, obtained through the Qatar University grandstand simulator, show that SigBERT outperforms existing models in terms of damage detection accuracy. The method is capable of handling environmental fluctuations and offers high reliability for non-destructive monitoring of structural health. The study reports quantitative results, including a 99% accuracy rate and an F1 score of 0.99, that underline the effectiveness of the proposed model.

Originality/value

SigBERT presents a significant advancement in SHM by integrating deep learning with a robust transformer model. The method offers improved performance in both computational efficiency and diagnostic accuracy, making it suitable for real-world operational environments.

Details

International Journal of Structural Integrity, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 1757-9864

Article

Publication date: 26 February 2024

Financial distress prediction based on ensemble feature selection and improved stacking algorithm

Chong Wu, Xiaofang Chen and Yongjie Jiang

Abstract

Purpose

While the Chinese securities market is booming, the phenomenon of listed companies falling into financial distress is also emerging, which affects the operation and development of enterprises and also jeopardizes the interests of investors. Therefore, it is important to understand how to accurately and reasonably predict the financial distress of enterprises.

Design/methodology/approach

In the present study, ensemble feature selection (EFS) and improved stacking were used for financial distress prediction (FDP). Mutual information, analysis of variance (ANOVA), random forest (RF), genetic algorithms, and recursive feature elimination (RFE) were chosen for EFS to select features. Since there may be missing information when feeding the results of the base learner directly into the meta-learner, the features with high importance were fed into the meta-learner together. A screening layer was added to select the meta-learner with better performance. Finally, Optima hyperparameters were used for parameter tuning by the learners.
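
The step of feeding high-importance features into the meta-learner alongside the base-learner outputs can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name `build_meta_inputs`, the argument layout and the toy values are all assumptions, and the abstract does not say how the important features are chosen or combined.

```python
def build_meta_inputs(base_predictions, features, important_idx):
    """Build meta-learner input rows (hypothetical helper, for illustration).

    base_predictions: one list of per-sample predictions per base learner.
    features: one list of original feature values per sample.
    important_idx: column indices of the features judged most important.
    """
    rows = []
    for i in range(len(features)):
        # Outputs of every base learner for sample i.
        preds = [learner_preds[i] for learner_preds in base_predictions]
        # Selected original features for sample i, passed through unchanged.
        kept = [features[i][j] for j in important_idx]
        rows.append(preds + kept)
    return rows


# Toy example: two base learners, two samples, three original features.
meta_rows = build_meta_inputs(
    base_predictions=[[0.1, 0.9], [0.2, 0.8]],
    features=[[5, 6, 7], [8, 9, 10]],
    important_idx=[0, 2],
)
```

Each row handed to the meta-learner then holds one prediction per base learner followed by the retained original features, so information lost in the base-learner outputs remains available at the second level.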

Findings

An empirical study was conducted with a sample of A-share listed companies in China. The F1-score of the model constructed using the features screened by EFS reached 84.55%, representing an improvement of 4.37% compared to the original features. To verify the effectiveness of improved stacking, benchmark model comparison experiments were conducted. Compared to the original stacking model, the accuracy of the improved stacking model was improved by 0.44%, and the F1-score was improved by 0.51%. In addition, the improved stacking model had the highest area under the curve (AUC) value (0.905) among all the compared models.

Originality/value

Compared to previous models, the proposed FDP model has better performance, thus bridging the research gap of feature selection. The present study provides new ideas for stacking improvement research and a reference for subsequent research in this field.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 0368-492X

Open Access

Article

Publication date: 23 November 2023

Arabic stance detection of COVID-19 vaccination using transformer-based approaches: a comparison study

Reema Khaled AlRowais and Duaa Alsaeed

Abstract

Purpose

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of data on the internet via platforms like social media sites. A stance detection system helps determine whether the author agrees with, opposes or is neutral toward a given target. Most research in stance detection focuses on the English language, while little research has been conducted on the Arabic language.

Design/methodology/approach

This paper aimed to address stance detection on Arabic tweets by building and comparing different stance detection models using four transformers, namely: Araelectra, MARBERT, AraBERT and Qarib. Using different weights for these transformers, the authors performed extensive experiments, fine-tuning the four transformers for the task of stance detection on Arabic tweets.

Findings

The results showed that the AraBERT model learned better than the other three models with a 70% F1 score followed by the Qarib model with a 68% F1 score.

Research limitations/implications

A limitation of this study is the imbalanced dataset and the limited availability of annotated datasets of SD in Arabic.

Originality/value

This paper provides a comprehensive overview of the current resources for stance detection in the literature, including the datasets and machine learning methods used. The authors also examined the models to analyze and interpret the obtained findings in order to recommend the best-performing models for the stance detection task.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 1985-9899

Article

Publication date: 27 August 2024

Machine learning insights: probing the variable importance of ex-ante information

Ali Albada, Eimad Eldin Abusham, Chui Zi Ong and Khalid Al Qatiti

Abstract

Purpose

Empirical examinations of initial public offering (IPO) initial returns often rely heavily on linear regression models. However, these models can prove inefficient owing to their susceptibility to outliers, a common occurrence in IPO data. This study introduces a machine learning method, known as random forest, to address issues that linear regression may struggle to resolve.

Design/methodology/approach

The study’s sample comprises 352 fixed-priced IPOs from the year 2004 until 2021. A unique aspect of this research is its application of the random forest method. The accuracy of random forest in comparison to other methods is evaluated. The findings indicate that the random forest model significantly outperforms other methods in all of the evaluated aspects.

Findings

The variable importance measure indicates that investors’ demand, divergence of opinion among investors and offer price are the most crucial predictors of IPO initial returns. These determinants hold particular significance due to the widespread use of the fixed-price method in Malaysia, as this method amplifies the information asymmetry in the IPO market.

Originality/value

To the best of the authors’ knowledge, this study is among the pioneering works in Malaysian literature to apply the random forest method to address the constraints of conventional linear regression models. This is achieved by considering a more extensive array of factors and acknowledging the influence of outliers. Additionally, this study adds value to Malaysian literature by ranking and identifying the ex-ante information that best signals the issuing firm’s quality. This contribution facilitates prospective investors’ decision-making processes and provides issuing firms with effective means to communicate their value and quality to the IPO market.

Details

Managerial Finance, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 0307-4358

Article

Publication date: 17 July 2024

Uncovering corporate greenwashing: a predictive model based on Chinese heavy-pollution industries

Qiang Li, Zichun He and Huaxia Li

Abstract

Purpose

As the global emphasis on environmental consciousness intensifies, many corporations claim to be environmentally responsible. However, some merely partake in “greenwashing” – a facade of eco-responsibility. Such deceptive behavior is especially prevalent in Chinese heavy-pollution industries. To counter these deceptive practices, this study aims to use machine learning (ML) techniques to develop predictive models against corporate greenwashing, thus facilitating the sustainable development of corporations.

Design/methodology/approach

This study develops effective predictive models for greenwashing by integrating multifaceted data sets, which include corporate external, organizational and managerial characteristics, and using a range of ML algorithms, namely, linear regression, random forest, K-nearest neighbors, support vector machines and artificial neural network.

Findings

The proposed predictive models register an improvement of over 20% in prediction accuracy compared to the benchmark value, furnishing stakeholders with a robust tool to challenge corporate greenwashing behaviors. Further analysis of feature importance, industry-specific predictions and real-world validation enhances the model’s interpretability and its practical applications across different domains.

Practical implications

This research introduces an innovative ML-based model designed to predict greenwashing activities within Chinese heavy-pollution sectors. It holds potential for application in other emerging economies, serving as a practical tool for both academics and practitioners.

Social implications

The findings offer insights for crafting informed, data-driven policies to curb greenwashing and promote corporate responsibility, transparency and sustainable development.

Originality/value

While prior research mainly concentrated on the factors influencing greenwashing behavior, this study takes a proactive approach. It aims to forecast the extent of corporate greenwashing by using a range of multi-dimensional variables, thus providing enhanced value to stakeholders. To the best of the authors’ knowledge, this is the first study introducing ML-based models designed to predict a company’s level of greenwashing.

Details

Sustainability Accounting, Management and Policy Journal, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 2040-8021

Article

Publication date: 28 May 2024

Online review data analytics to explore factors affecting consumers’ airport recommendations

Cheong Kim, Jungwoo Lee and Kun Chang Lee

Abstract

Purpose

The main objective of this study is to determine the factors that have the greatest impact on travelers' opinions of airports.

Design/methodology/approach

A total of 11,656 customer reviews for 649 airports around the world, posted after the COVID-19 outbreak, were gathered from a website that rates airport quality. The dataset was examined using hierarchical regression, PLS-SEM, and the unsupervised Bayesian algorithm-based PSEM in order to verify the hypothesis.

Findings

The results showed that people’s intentions to recommend airports are significantly influenced by their perceptions of the quality of the servicescape, staff, and services.

Practical implications

By encouraging air travelers to have positive intentions toward recommending the airports, this research offers airport managers decision-support implications for how to improve airport service quality. This will increase the likelihood of retaining more passengers.

Originality/value

This study also suggests a quick-to-implement visual decision-making mechanism based on PSEM that is simple to understand.

Details

Information Technology & People, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 0959-3845

Article

Publication date: 29 September 2023

Predicting firms’ resilience to economic crisis using artificial intelligence for optimizing economic stimulus programs

Niki Kyriakou, Euripidis N. Loukis and Manolis Maragoudakis

Abstract

Purpose

This study aims to develop a methodology for predicting the resilience of individual firms to economic crisis using historical government data. The goal is to optimize one of the most important and costly interventions that governments undertake, the huge economic stimulus programs implemented to mitigate the consequences of economic crises, by focusing them on the less resilient and more vulnerable firms, which have the highest need for government assistance and support.

Design/methodology/approach

The authors are leveraging existing firm-level data for economic crisis periods from government agencies having competencies/responsibilities in the domain of economy, such as Ministries of Finance and Statistical Authorities, to construct prediction models of the resilience of individual firms to the economic crisis based on firms’ characteristics (such as human resources, technology, strategies, processes and structure), using artificial intelligence (AI) techniques from the area of machine learning (ML).

Findings

The methodology has been applied using data from the Greek Ministry of Finance and Statistical Authority about 363 firms for the Greek economic crisis period 2009–2014 and has provided a satisfactory prediction of a measure of the resilience of individual firms to an economic crisis.

Research limitations/implications

The authors’ study opens up new research directions concerning the exploitation of AI/ML in government for a critical government activity/intervention of high importance that mobilizes/spends huge financial resources. The main limitation is that the abovementioned first application of the proposed methodology has been based on a rather small data set from a single national context (Greece), so it is necessary to proceed to further application of this methodology using larger data sets and different national contexts.

Practical implications

The proposed methodology enables government agencies responsible for the implementation of such economic stimulus programs to proceed to radical transformations of them by predicting the resilience to economic crisis of the firms applying for government assistance and then directing/focusing the scarce available financial resources to/on the ones predicted to be more vulnerable, increasing substantially the effectiveness of these programs and the economic/social value they generate.

Originality/value

To the best of the authors’ knowledge, this study is the first application of AI/ML in government that leverages existing data for economic crisis periods to optimize and increase the effectiveness of the largest and most important and costly economic intervention that governments repeatedly have to make: the economic stimulus programs for mitigating the consequences of economic crises.

Details

Transforming Government: People, Process and Policy, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 1750-6166

Open Access

Article

Publication date: 1 April 2021

Robust dual-tone multi-frequency tone detection using k-nearest neighbour classifier for a noisy environment

Arunit Maity, P. Prakasam and Sarthak Bhargava

Abstract

Purpose

Due to the continuous and rapid evolution of telecommunication equipment, the demand for more efficient and noise-robust detection of dual-tone multi-frequency (DTMF) signals is most significant.

Design/methodology/approach

A novel machine learning-based approach to detect DTMF tones affected by noise, frequency and time variations by employing the k-nearest neighbour (KNN) algorithm is proposed. The features required for training the proposed KNN classifier are extracted using Goertzel's algorithm that estimates the absolute discrete Fourier transform (DFT) coefficient values for the fundamental DTMF frequencies with or without considering their second harmonic frequencies. The proposed KNN classifier model is configured in four different manners which differ in being trained with or without augmented data, as well as, with or without the inclusion of second harmonic frequency DFT coefficient values as features.
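
The feature-extraction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `goertzel_magnitude` and `dtmf_features` are hypothetical, the eight frequencies are the standard DTMF row and column tones, and the recurrence is evaluated at the exact target frequency rather than at the nearest DFT bin.

```python
import math

# The eight standard DTMF fundamental frequencies in Hz (four row, four column tones).
DTMF_FREQS = [697, 770, 852, 941, 1209, 1336, 1477, 1633]


def goertzel_magnitude(samples, sample_rate, target_freq):
    """Absolute Fourier coefficient of `samples` at `target_freq` via Goertzel."""
    omega = 2.0 * math.pi * target_freq / sample_rate
    coeff = 2.0 * math.cos(omega)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence: s[n] = x[n] + 2*cos(omega)*s[n-1] - s[n-2]
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Squared magnitude computed from the last two states of the recurrence.
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return math.sqrt(max(power, 0.0))


def dtmf_features(samples, sample_rate, include_second_harmonic=False):
    """Feature vector for a classifier: 8 magnitudes, or 16 with second harmonics."""
    freqs = list(DTMF_FREQS)
    if include_second_harmonic:
        freqs += [2 * f for f in DTMF_FREQS]
    return [goertzel_magnitude(samples, sample_rate, f) for f in freqs]
```

For a clean tone pair such as 770 Hz plus 1336 Hz sampled at 8 kHz, the two largest entries of the 8-element vector fall at those two frequencies. A vector like this is the kind of input a KNN classifier could be trained on.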

Findings

It is found that the model which is trained using the augmented data set and additionally includes the absolute DFT values of the second harmonic frequency values for the eight fundamental DTMF frequencies as the features, achieved the best performance with a macro classification F1 score of 0.980835, a five-fold stratified cross-validation accuracy of 98.47% and test data set detection accuracy of 98.1053%.
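
For readers unfamiliar with the macro F1 score reported above, the sketch below shows how it is commonly computed: the per-class F1 scores are averaged with equal weight. The function name `macro_f1` and the toy labels are assumptions for illustration, and the abstract does not state which implementation the authors used.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with two DTMF key labels.
score = macro_f1(["1", "1", "2", "2"], ["1", "2", "2", "2"])
```

Because every class contributes equally, a classifier cannot reach a high macro F1 by doing well only on the most frequent keys.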

Originality/value

The generated DTMF signal has been classified and detected using the proposed KNN classifier which utilizes the DFT coefficient along with second harmonic frequencies for better classification. Additionally, the proposed KNN classifier has been compared with existing models to ascertain its superiority and proclaim its state-of-the-art performance.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 2634-1964

Article

Publication date: 3 September 2024

Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case

Biplab Bhattacharjee, Kavya Unni and Maheshwar Pratap

Abstract

Purpose

Product returns are a major challenge for e-businesses as they involve huge logistical and operational costs. Therefore, it becomes crucial to predict returns in advance. This study aims to evaluate different genres of classifiers for product return chance prediction, and further optimizes the best performing model.

Design/methodology/approach

An e-commerce data set having categorical type attributes has been used for this study. Feature selection based on chi-square provides a selected feature set, which is used as input for model building. Predictive models are attempted using individual classifiers, ensemble models and deep neural networks. For performance evaluation, 75:25 train/test split and 10-fold cross-validation strategies are used. To improve the predictability of the best performing classifier, hyperparameter tuning is performed using different optimization methods such as random search, grid search, Bayesian approach and evolutionary models (genetic algorithm, differential evolution and particle swarm optimization).
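
The two evaluation strategies named above, a 75:25 holdout split and 10-fold cross-validation, can be sketched with index bookkeeping alone. This is an illustrative sketch, not the authors' pipeline: the function names, the fixed seed and the unstratified shuffling are assumptions.

```python
import random


def holdout_split(n_samples, test_fraction=0.25, seed=0):
    """Shuffle sample indices once and return (train_indices, test_indices)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = round(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) once for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


train_idx, test_idx = holdout_split(100)
fold_pairs = list(k_fold_indices(100, k=10))
```

In the holdout case a single model is fitted and scored once, while in the k-fold case every sample is used for testing exactly once and the reported score is typically the mean over the folds.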

Findings

A comparison of F1-scores revealed that the Bayesian approach outperformed all other optimization approaches in terms of accuracy. The predictability of the Bayesian-optimized model is further compared with that of other classifiers using experimental analysis. The Bayesian-optimized XGBoost model possessed superior performance, with accuracies of 77.80% and 70.35% for holdout and 10-fold cross-validation methods, respectively.

Research limitations/implications

Given the anonymized data, the effects of individual attributes on outcomes could not be investigated in detail. The Bayesian-optimized predictive model may be used in decision support systems, enabling real-time prediction of returns and the implementation of preventive measures.

Originality/value

There are very few reported studies on predicting the chance of order return in e-businesses. To the best of the authors’ knowledge, this study is the first to compare different optimization methods and classifiers, demonstrating the superiority of the Bayesian-optimized XGBoost classification model for returns prediction.

Details

Journal of Systems and Information Technology, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 1328-7265

Open Access

Article

Publication date: 7 September 2021

Supervised learning and resampling techniques on DISC personality classification using Twitter information in Bahasa Indonesia

Ema Utami, Irwan Oyong, Suwanto Raharjo, Anggit Dwi Hartanto and Sumarni Adi

Abstract

Purpose

Gathering knowledge regarding personality traits has long been the interest of academics and researchers in the fields of psychology and in computer science. Analyzing profile data from personal social media accounts reduces data collection time, as this method does not require users to fill any questionnaires. A pure natural language processing (NLP) approach can give decent results, and its reliability can be improved by combining it with machine learning (as shown by previous studies).

Design/methodology/approach

In this study, cleaning the dataset and extracting relevant potential features (as assessed by psychological experts) are essential, as Indonesians tend to mix formal words, non-formal words, slang and abbreviations when writing social media posts. For this article, raw data were derived from a predefined dominance, influence, stability and conscientiousness (DISC) quiz website, returning 316,967 tweets from 1,244 Twitter accounts (filtered to include only personal and Indonesian-language accounts). Using a combination of NLP techniques and machine learning, the authors aim to develop a better approach and more robust model, especially for the Indonesian language.

Findings

The authors find that employing a SMOTETomek re-sampling technique and hyperparameter tuning boosts the model’s performance on formalized datasets by 57% (as measured through the F1-score).

Originality/value

Cleaning the dataset and extracting relevant potential features, as assessed by psychological experts, are essential because Indonesian people tend to mix formal words, non-formal words, slang words and abbreviations when writing tweets. The organic data, derived from a predefined DISC quiz website, comprise 1,244 Twitter accounts and 316,967 tweets.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 2634-1964
