Search results

1 – 10 of 188
Article
Publication date: 18 August 2023

Gaurav Sarin, Pradeep Kumar and M. Mukund

Text classification is a widely accepted and adopted technique in organizations to mine and analyze unstructured and semi-structured data. With advancement of technological…

Abstract

Purpose

Text classification is a widely accepted and adopted technique in organizations to mine and analyze unstructured and semi-structured data. With advancement of technological computing, deep learning has become more popular among academicians and professionals to perform mining and analytical operations. In this work, the authors study the research carried out in field of text classification using deep learning techniques to identify gaps and opportunities for doing research.

Design/methodology/approach

The authors adopted bibliometric-based approach in conjunction with visualization techniques to uncover new insights and findings. The authors collected data of two decades from Scopus global database to perform this study. The authors discuss business applications of deep learning techniques for text classification.

Findings

The study provides overview of various publication sources in field of text classification and deep learning together. The study also presents list of prominent authors and their countries working in this field. The authors also presented list of most cited articles based on citations and country of research. Various visualization techniques such as word cloud, network diagram and thematic map were used to identify collaboration network.

Originality/value

The study performed in this paper helped to understand research gaps that is original contribution to body of literature. To best of the authors' knowledge, in-depth study in the field of text classification and deep learning has not been performed in detail. The study provides high value to scholars and professionals by providing them opportunities of research in this area.

Details

Benchmarking: An International Journal, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1463-5771

Keywords

Article
Publication date: 23 November 2023

Konstantinos Kalodanis, Panagiotis Rizomiliotis and Dimosthenis Anagnostopoulos

The purpose of this paper is to highlight the key technical challenges that derive from the recently proposed European Artificial Intelligence Act and specifically, to investigate…

Abstract

Purpose

The purpose of this paper is to highlight the key technical challenges that derive from the recently proposed European Artificial Intelligence Act and specifically, to investigate the applicability of the requirements that the AI Act mandates to high-risk AI systems from the perspective of AI security.

Design/methodology/approach

This paper presents the main points of the proposed AI Act, with emphasis on the compliance requirements of high-risk systems. It matches known AI security threats with the relevant technical requirements, it demonstrates the impact that these security threats can have to the AI Act technical requirements and evaluates the applicability of these requirements based on the effectiveness of the existing security protection measures. Finally, the paper highlights the necessity for an integrated framework for AI system evaluation.

Findings

The findings of the EU AI Act technical assessment highlight the gap between the proposed requirements and the available AI security countermeasures as well as the necessity for an AI security evaluation framework.

Originality/value

AI Act, high-risk AI systems, security threats, security countermeasures.

Details

Information & Computer Security, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2056-4961

Keywords

Article
Publication date: 28 March 2024

Elisa Gonzalez Santacruz, David Romero, Julieta Noguez and Thorsten Wuest

This research paper aims to analyze the scientific and grey literature on Quality 4.0 and zero-defect manufacturing (ZDM) frameworks to develop an integrated quality 4.0 framework…

Abstract

Purpose

This research paper aims to analyze the scientific and grey literature on Quality 4.0 and zero-defect manufacturing (ZDM) frameworks to develop an integrated quality 4.0 framework (IQ4.0F) for quality improvement (QI) based on Six Sigma and machine learning (ML) techniques towards ZDM. The IQ4.0F aims to contribute to the advancement of defect prediction approaches in diverse manufacturing processes. Furthermore, the work enables a comprehensive analysis of process variables influencing product quality with emphasis on the use of supervised and unsupervised ML techniques in Six Sigma’s DMAIC (Define, Measure, Analyze, Improve and Control) cycle stage of “Analyze.”

Design/methodology/approach

The research methodology employed a systematic literature review (SLR) based on PRISMA guidelines to develop the integrated framework, followed by a real industrial case study set in the automotive industry to fulfill the objectives of verifying and validating the proposed IQ4.0F with primary data.

Findings

This research work demonstrates the value of a “stepwise framework” to facilitate a shift from conventional quality management systems (QMSs) to QMSs 4.0. It uses the IDEF0 modeling methodology and Six Sigma’s DMAIC cycle to structure the steps to be followed to adopt the Quality 4.0 paradigm for QI. It also proves the worth of integrating Six Sigma and ML techniques into the “Analyze” stage of the DMAIC cycle for improving defect prediction in manufacturing processes and supporting problem-solving activities for quality managers.

Originality/value

This research paper introduces a first-of-its-kind Quality 4.0 framework – the IQ4.0F. Each step of the IQ4.0F was verified and validated in an original industrial case study set in the automotive industry. It is the first Quality 4.0 framework, according to the SLR conducted, to utilize the principal component analysis technique as a substitute for “Screening Design” in the Design of Experiments phase and K-means clustering technique for multivariable analysis, identifying process parameters that significantly impact product quality. The proposed IQ4.0F not only empowers decision-makers with the knowledge to launch a Quality 4.0 initiative but also provides quality managers with a systematic problem-solving methodology for quality improvement.

Details

The TQM Journal, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1754-2731

Keywords

Open Access
Article
Publication date: 18 March 2022

Loris Nanni, Alessandra Lumini and Sheryl Brahnam

Automatic anatomical therapeutic chemical (ATC) classification is progressing at a rapid pace because of its potential in drug development. Predicting an unknown compound's…

Abstract

Purpose

Automatic anatomical therapeutic chemical (ATC) classification is progressing at a rapid pace because of its potential in drug development. Predicting an unknown compound's therapeutic and chemical characteristics in terms of how it affects multiple organs and physiological systems makes automatic ATC classification a vital yet challenging multilabel problem. The aim of this paper is to experimentally derive an ensemble of different feature descriptors and classifiers for ATC classification that outperforms the state-of-the-art.

Design/methodology/approach

The proposed method is an ensemble generated by the fusion of neural networks (i.e. a tabular model and long short-term memory networks (LSTM)) and multilabel classifiers based on multiple linear regression (hMuLab). All classifiers are trained on three sets of descriptors. Features extracted from the trained LSTMs are also fed into hMuLab. Evaluations of ensembles are compared on a benchmark data set of 3883 ATC-coded pharmaceuticals taken from KEGG, a publicly available drug databank.

Findings

Experiments demonstrate the power of the authors’ best ensemble, EnsATC, which is shown to outperform the best methods reported in the literature, including the state-of-the-art developed by the fast.ai research group. The MATLAB source code of the authors’ system is freely available to the public at https://github.com/LorisNanni/Neural-networks-for-anatomical-therapeutic-chemical-ATC-classification.

Originality/value

This study demonstrates the power of extracting LSTM features and combining them with ATC descriptors in ensembles for ATC classification.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Keywords

Article
Publication date: 3 January 2023

Saleem Raja A., Sundaravadivazhagan Balasubaramanian, Pradeepa Ganesan, Justin Rajasekaran and Karthikeyan R.

The internet has completely merged into contemporary life. People are addicted to using internet services for everyday activities. Consequently, an abundance of information about…

Abstract

Purpose

The internet has completely merged into contemporary life. People are addicted to using internet services for everyday activities. Consequently, an abundance of information about people and organizations is available online, which encourages the proliferation of cybercrimes. Cybercriminals often use malicious links for large-scale cyberattacks, which are disseminated via email, SMS and social media. Recognizing malicious links online can be exceedingly challenging. The purpose of this paper is to present a strong security system that can detect malicious links in the cyberspace using natural language processing technique.

Design/methodology/approach

The researcher recommends a variety of approaches, including blacklisting and rules-based machine/deep learning, for automatically recognizing malicious links. But the approaches generally necessitate the generation of a set of features to generalize the detection process. Most of the features are generated by processing URLs and content of the web page, as well as some external features such as the ranking of the web page and domain name system information. This process of feature extraction and selection typically takes more time and demands a high level of expertise in the domain. Sometimes the generated features may not leverage the full potentials of the data set. In addition, the majority of the currently deployed systems make use of a single classifier for the classification of malicious links. However, prediction accuracy may vary widely depending on the data set and the classifier used.

Findings

To address the issue of generating feature sets, the proposed method uses natural language processing techniques (term frequency and inverse document frequency) that vectorize URLs. To build a robust system for the classification of malicious links, the proposed system implements weighted soft voting classifier, an ensemble classifier that combines predictions of base classifiers. The ability or skill of each classifier serves as the base for the weight that is assigned to it.

Originality/value

The proposed method performs better when the optimal weights are assigned. The performance of the proposed method was assessed by using two different data sets (D1 and D2) and compared performance against base machine learning classifiers and previous research results. The outcome accuracy shows that the proposed method is superior to the existing methods, offering 91.4% and 98.8% accuracy for data sets D1 and D2, respectively.

Details

International Journal of Pervasive Computing and Communications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1742-7371

Keywords

Article
Publication date: 9 January 2024

Ning Chen, Zhenyu Zhang and An Chen

Consequence prediction is an emerging topic in safety management concerning the severity outcome of accidents. In practical applications, it is usually implemented through…

Abstract

Purpose

Consequence prediction is an emerging topic in safety management concerning the severity outcome of accidents. In practical applications, it is usually implemented through supervised learning methods; however, the evaluation of classification results remains a challenge. The previous studies mostly adopted simplex evaluation based on empirical and quantitative assessment strategies. This paper aims to shed new light on the comprehensive evaluation and comparison of diverse classification methods through visualization, clustering and ranking techniques.

Design/methodology/approach

An empirical study is conducted using 9 state-of-the-art classification methods on a real-world data set of 653 construction accidents in China for predicting the consequence with respect to 39 carefully featured factors and accident type. The proposed comprehensive evaluation enriches the interpretation of classification results from different perspectives. Furthermore, the critical factors leading to severe construction accidents are identified by analyzing the coefficients of a logistic regression model.

Findings

This paper identifies the critical factors that significantly influence the consequence of construction accidents, which include accident type (particularly collapse), improper accident reporting and handling (E21), inadequate supervision engineers (O41), no special safety department (O11), delayed or low-quality drawings (T11), unqualified contractor (C21), schedule pressure (C11), multi-level subcontracting (C22), lacking safety examination (S22), improper operation of mechanical equipment (R11) and improper construction procedure arrangement (T21). The prediction models and findings of critical factors help make safety intervention measures in a targeted way and enhance the experience of safety professionals in the construction industry.

Research limitations/implications

The empirical study using some well-known classification methods for forecasting the consequences of construction accidents provides some evidence for the comprehensive evaluation of multiple classifiers. These techniques can be used jointly with other evaluation approaches for a comprehensive understanding of the classification algorithms. Despite the limitation of specific methods used in the study, the presented methodology can be configured with other classification methods and performance metrics and even applied to other decision-making problems such as clustering.

Originality/value

This study sheds new light on the comprehensive comparison and evaluation of classification results through visualization, clustering and ranking techniques using an empirical study of consequence prediction of construction accidents. The relevance of construction accident type is discussed with the severity of accidents. The critical factors influencing the accident consequence are identified for the sake of taking prevention measures for risk reduction. The proposed method can be applied to other decision-making tasks where the evaluation is involved as an important component.

Details

Construction Innovation , vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1471-4175

Keywords

Article
Publication date: 19 December 2023

Jinchao Huang

Single-shot multi-category clothing recognition and retrieval play a crucial role in online searching and offline settlement scenarios. Existing clothing recognition methods based…

Abstract

Purpose

Single-shot multi-category clothing recognition and retrieval play a crucial role in online searching and offline settlement scenarios. Existing clothing recognition methods based on RGBD clothing images often suffer from high-dimensional feature representations, leading to compromised performance and efficiency.

Design/methodology/approach

To address this issue, this paper proposes a novel method called Manifold Embedded Discriminative Feature Selection (MEDFS) to select global and local features, thereby reducing the dimensionality of the feature representation and improving performance. Specifically, by combining three global features and three local features, a low-dimensional embedding is constructed to capture the correlations between features and categories. The MEDFS method designs an optimization framework utilizing manifold mapping and sparse regularization to achieve feature selection. The optimization objective is solved using an alternating iterative strategy, ensuring convergence.

Findings

Empirical studies conducted on a publicly available RGBD clothing image dataset demonstrate that the proposed MEDFS method achieves highly competitive clothing classification performance while maintaining efficiency in clothing recognition and retrieval.

Originality/value

This paper introduces a novel approach for multi-category clothing recognition and retrieval, incorporating the selection of global and local features. The proposed method holds potential for practical applications in real-world clothing scenarios.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Keywords

Article
Publication date: 31 August 2023

Faisal Mehraj Wani, Jayaprakash Vemuri and Rajaram Chenna

Near-fault pulse-like ground motions have distinct and very severe effects on reinforced concrete (RC) structures. However, there is a paucity of recorded data from Near-Fault…

Abstract

Purpose

Near-fault pulse-like ground motions have distinct and very severe effects on reinforced concrete (RC) structures. However, there is a paucity of recorded data from Near-Fault Ground Motions (NFGMs), and thus forecasting the dynamic seismic response of structures, using conventional techniques, under such intense ground motions has remained a challenge.

Design/methodology/approach

The present study utilizes a 2D finite element model of an RC structure subjected to near-fault pulse-like ground motions with a focus on the storey drift ratio (SDR) as the key demand parameter. Five machine learning classifiers (MLCs), namely decision tree, k-nearest neighbor, random forest, support vector machine and Naïve Bayes classifier , were evaluated to classify the damage states of the RC structure.

Findings

The results such as confusion matrix, accuracy and mean square error indicate that the Naïve Bayes classifier model outperforms other MLCs with 80.0% accuracy. Furthermore, three MLC models with accuracy greater than 75% were trained using a voting classifier to enhance the performance score of the models. Finally, a sensitivity analysis was performed to evaluate the model's resilience and dependability.

Originality/value

The objective of the current study is to predict the nonlinear storey drift demand for low-rise RC structures using machine learning techniques, instead of labor-intensive nonlinear dynamic analysis.

Details

International Journal of Structural Integrity, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1757-9864

Keywords

Article
Publication date: 10 November 2023

Yong Gui and Lanxin Zhang

Influenced by the constantly changing manufacturing environment, no single dispatching rule (SDR) can consistently obtain better scheduling results than other rules for the…

Abstract

Purpose

Influenced by the constantly changing manufacturing environment, no single dispatching rule (SDR) can consistently obtain better scheduling results than other rules for the dynamic job-shop scheduling problem (DJSP). Although the dynamic SDR selection classifier (DSSC) mined by traditional data-mining-based scheduling method has shown some improvement in comparison to an SDR, the enhancement is not significant since the rule selected by DSSC is still an SDR.

Design/methodology/approach

This paper presents a novel data-mining-based scheduling method for the DJSP with machine failure aiming at minimizing the makespan. Firstly, a scheduling priority relation model (SPRM) is constructed to determine the appropriate priority relation between two operations based on the production system state and the difference between their priority values calculated using multiple SDRs. Subsequently, a training sample acquisition mechanism based on the optimal scheduling schemes is proposed to acquire training samples for the SPRM. Furthermore, feature selection and machine learning are conducted using the genetic algorithm and extreme learning machine to mine the SPRM.

Findings

Results from numerical experiments demonstrate that the SPRM, mined by the proposed method, not only achieves better scheduling results in most manufacturing environments but also maintains a higher level of stability in diverse manufacturing environments than an SDR and the DSSC.

Originality/value

This paper constructs a SPRM and mines it based on data mining technologies to obtain better results than an SDR and the DSSC in various manufacturing environments.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Keywords

Open Access
Article
Publication date: 4 May 2021

Loris Nanni and Sheryl Brahnam

Automatic DNA-binding protein (DNA-BP) classification is now an essential proteomic technology. Unfortunately, many systems reported in the literature are tested on only one or…

1346

Abstract

Purpose

Automatic DNA-binding protein (DNA-BP) classification is now an essential proteomic technology. Unfortunately, many systems reported in the literature are tested on only one or two datasets/tasks. The purpose of this study is to create the most optimal and universal system for DNA-BP classification, one that performs competitively across several DNA-BP classification tasks.

Design/methodology/approach

Efficient DNA-BP classifier systems require the discovery of powerful protein representations and feature extraction methods. Experiments were performed that combined and compared descriptors extracted from state-of-the-art matrix/image protein representations. These descriptors were trained on separate support vector machines (SVMs) and evaluated. Convolutional neural networks with different parameter settings were fine-tuned on two matrix representations of proteins. Decisions were fused with the SVMs using the weighted sum rule and evaluated to experimentally derive the most powerful general-purpose DNA-BP classifier system.

Findings

The best ensemble proposed here produced comparable, if not superior, classification results on a broad and fair comparison with the literature across four different datasets representing a variety of DNA-BP classification tasks, thereby demonstrating both the power and generalizability of the proposed system.

Originality/value

Most DNA-BP methods proposed in the literature are only validated on one (rarely two) datasets/tasks. In this work, the authors report the performance of our general-purpose DNA-BP system on four datasets representing different DNA-BP classification tasks. The excellent results of the proposed best classifier system demonstrate the power of the proposed approach. These results can now be used for baseline comparisons by other researchers in the field.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Keywords

1 – 10 of 188