Search results

1 – 10 of 26
Article
Publication date: 30 August 2021

Jinchao Huang

Abstract

Purpose

The multi-domain convolutional neural network (MDCNN) model has been widely used for object recognition and tracking in computer vision. However, if the tracked objects move rapidly or their appearance varies dramatically, the conventional MDCNN model suffers from model drift. To address this problem when tracking fast-moving objects in limiting environments, this paper proposes an auto-attentional mechanism-based MDCNN (AA-MDCNN) model for tracking rapidly moving and changing objects.

Design/methodology/approach

First, to distinguish the foreground object from the background and from other similar objects, the auto-attentional mechanism selectively aggregates a weighted summation of all feature maps so that similar features are related to each other. Then, a bidirectional gated recurrent unit (Bi-GRU) architecture integrates all the feature maps to selectively emphasize the importance of correlated feature maps. Finally, the final feature map for object tracking is obtained by fusing the above two feature maps. In addition, a composite loss function is constructed to address the tracking of sequences with similar but different attributes in the conventional MDCNN model.
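
As a rough illustration of the attention-weighted aggregation of feature maps described in this abstract, the sketch below computes, for each flattened feature map, a softmax-weighted sum of all feature maps. It is not the authors' implementation: the scaled dot-product similarity, the element-wise sum used for fusion, and the function names `self_attention` and `fuse` are assumptions made for illustration, and the Bi-GRU branch is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Return, for each feature vector, the attention-weighted sum of all vectors.

    `features` is a list of equal-length lists, one per flattened feature map.
    Weights come from scaled dot-product similarity (an assumed choice).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended = []
    for query in features:
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in features]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, features)) for i in range(dim)]
        )
    return attended

def fuse(first, second):
    """Fuse two equally shaped feature sets by element-wise sum (assumed fusion rule)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(first, second)]

features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
attended = self_attention(features)
fused = fuse(features, attended)
```

Because the first two vectors are identical, they receive identical attended outputs, and each output leans toward the vectors it is most similar to.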

Findings

To validate the effectiveness and feasibility of the proposed AA-MDCNN model, this paper used the ImageNet-Vid dataset to train the object tracking model and the OTB-50 dataset to validate it. Experimental results show that adding the auto-attentional mechanism improves the accuracy rate by 2.75% and the success rate by 2.41%. In addition, the authors selected six complex tracking scenarios in the OTB-50 dataset; of the eleven attributes evaluated, the proposed AA-MDCNN model outperformed the comparative models on nine. Furthermore, except for the scenario of multiple objects moving with each other, the proposed AA-MDCNN model handled the majority of rapidly moving object tracking scenarios and outperformed the comparative models on such complex scenarios.

Originality/value

This paper introduced the auto-attentional mechanism into the MDCNN model and adopted the Bi-GRU architecture to extract key features. With the proposed AA-MDCNN model, rapid object tracking under complex backgrounds, motion blur and occlusion performs better, and the model is expected to be further applied to rapid object tracking in the real world.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 15 no. 1
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 17 January 2023

Yueting Yang, Shaolin Hu, Ye Ke and Runguan Zhou

Abstract

Purpose

Fire smoke detection in petrochemical plants can prevent fires and ensure production safety and personal safety. The purpose of this paper is to solve the problem of missed and false detections in flame smoke detection against complex factory backgrounds.

Design/methodology/approach

This paper presents a flame smoke detection algorithm based on YOLOv5. The complete intersection over union (CIoU) target regression loss function is used to reduce missed and false detections and to improve detection performance. An improved activation function avoids vanishing gradients while maintaining the algorithm's high real-time performance. Data augmentation is used to strengthen the network's feature extraction and to improve the model's accuracy on small targets.
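
For readers unfamiliar with the CIoU regression loss named in this abstract, here is a minimal sketch of the commonly published formulation: one minus IoU, plus a normalized center-distance penalty, plus an aspect-ratio consistency term. The box format `(x1, y1, x2, y2)`, the function name `ciou_loss` and the `eps` stabilizer are assumptions for illustration. The abstract does not say whether the authors altered the standard formula.

```python
import math

def ciou_loss(box_a, box_b, eps=1e-9):
    """CIoU loss between two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # Squared distance between box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v

same = ciou_loss((0, 0, 2, 2), (0, 0, 2, 2))
partial = ciou_loss((0, 0, 2, 2), (1, 1, 3, 3))
disjoint = ciou_loss((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike plain IoU, the loss keeps growing as non-overlapping boxes move apart, which gives a useful gradient even when the predicted box misses the target entirely.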

Findings

Based on the actual characteristics of flame smoke, the loss function and activation function of the YOLOv5 model are improved. Based on the improved YOLOv5 model, a flame smoke detection algorithm with good generalization performance is established. The improved model is compared with SSD and YOLOv4-tiny. The accuracy of the improved YOLOv5 model can reach 99.5%, giving more accurate detection of flame smoke. The improved network model is superior to the existing methods in both running time and accuracy.

Originality/value

Addressing the particular characteristics of flame smoke detection, an improved flame smoke detection network model based on YOLOv5 is established. The model is optimized by improving the loss function, and an activation function with stronger nonlinear ability is incorporated to avoid over-fitting of the network. This method helps to reduce missed and false detections in flame smoke detection and can be further extended to pedestrian detection and moving-vehicle recognition.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 16 no. 3
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 28 February 2022

Rui Zhang, Na Zhao, Liuhu Fu, Lihu Pan, Xiaolu Bai and Renwang Song

Abstract

Purpose

This paper aims to propose a new ultrasonic diagnosis method for stainless steel weld defects based on multi-domain feature fusion to solve two problems in the ultrasonic diagnosis of austenitic stainless steel weld defects. These are insufficient feature extraction and subjective dependence of diagnosis model parameters.

Design/methodology/approach

To express the richness of the one-dimensional (1D) signal information, the 1D ultrasonic testing signal was transformed into the two-dimensional (2D) time-frequency domain. Multi-scale depthwise separable convolution was also designed to optimize the MobileNetV3 network and obtain deep convolution feature information under different receptive fields. At the same time, time/frequency-domain features of the defect signals were extracted based on statistical analysis. The defect-sensitive features were screened out through visual analysis, and the defect feature set was constructed by cascading fusion with the deep convolution feature information. To improve the adaptability and generalization of the diagnostic model, the authors designed and carried out research on hyperparameter self-optimization of the diagnostic model based on the sparrow search strategy and constructed the optimal hyperparameter combination of the model. Finally, the performance of the ultrasonic diagnosis of stainless steel weld defects was improved comprehensively through the multi-domain feature characterization model of the defect data and the diagnosis optimization model.
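
To make the step from a 1D signal to a 2D time-frequency representation concrete, the sketch below computes a small magnitude spectrogram with a short-time Fourier transform. The abstract does not state which transform the authors used, so the choice of STFT, the Hann window, and the parameters `frame_len` and `hop` are all assumptions for illustration.

```python
import cmath
import math

def stft_magnitude(signal, frame_len=8, hop=4):
    """Return a 2D list: one row per time frame, one column per frequency bin."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Taper the frame with a Hann window to reduce spectral leakage.
        windowed = [
            x * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)))
            for n, x in enumerate(frame)
        ]
        # Naive DFT over the non-negative frequency bins.
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(windowed)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# A pure tone that completes two cycles per 8-sample frame.
tone = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spectrogram = stft_magnitude(tone)
peak_bins = [row.index(max(row)) for row in spectrogram]
```

For the test tone, the energy in every frame concentrates in frequency bin 2, which is the kind of structure a 2D convolutional network can then pick up.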

Findings

The experimental results show that the diagnostic accuracy of the lightweight diagnosis model constructed in this paper can reach 96.55% for the five types of stainless steel weld defects: cracks, porosity, inclusion, lack of fusion and incomplete penetration. This accuracy can meet the needs of practical engineering applications.

Originality/value

This method provides a theoretical basis and technical reference for developing and applying intelligent, efficient and accurate ultrasonic defect diagnosis technology.

Article
Publication date: 14 December 2023

Huaxiang Song, Chai Wei and Zhou Yong

Abstract

Purpose

The paper aims to tackle the classification of Remote Sensing Images (RSIs), which presents a significant challenge for computer algorithms due to the inherent characteristics of clustered ground objects and noisy backgrounds. Recent research typically leverages larger models to achieve advanced performance. However, the operating environments of remote sensing commonly cannot provide unconstrained computational and storage resources. This calls for lightweight algorithms with exceptional generalization capabilities.

Design/methodology/approach

This study introduces an efficient knowledge distillation (KD) method to build a lightweight yet precise convolutional neural network (CNN) classifier. This method also aims to substantially decrease the training time expenses commonly linked with traditional KD techniques. This approach entails extensive alterations to both the model training framework and the distillation process, each tailored to the unique characteristics of RSIs. In particular, this study establishes a robust ensemble teacher by independently training two CNN models using a customized, efficient training algorithm. Following this, this study modifies a KD loss function to mitigate the suppression of non-target category predictions, which are essential for capturing the inter- and intra-similarity of RSIs.
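
As background for the logit-based distillation described in this abstract, the following sketch shows the conventional temperature-scaled KD loss with an ensemble teacher formed by averaging the logits of several teachers. It is the textbook formulation, not the modified loss the study proposes. Averaging logits, the `temperature` and `alpha` values, and the name `kd_loss` are assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled, numerically stable softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits_list, label, temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term."""
    # Ensemble teacher: average the logits of all teachers (assumed ensembling rule).
    count = len(teacher_logits_list)
    teacher_logits = [sum(column) / count for column in zip(*teacher_logits_list)]

    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))

    # Ordinary cross-entropy against the ground-truth label.
    ce = -math.log(softmax(student_logits)[label])

    # The T^2 factor keeps the soft-target gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

teachers = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
matched = kd_loss([3.0, 1.0, 0.0], teachers, label=0, alpha=1.0)
mismatched = kd_loss([0.0, 1.0, 3.0], teachers, label=0, alpha=1.0)
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative scores on non-target classes, which is the information the abstract describes as essential for RSIs.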

Findings

This study validated the student model, termed KD-enhanced network (KDE-Net), obtained through the KD process on three benchmark RSI data sets. The KDE-Net surpasses 42 other state-of-the-art methods in the literature published from 2020 to 2023. Compared to the top-ranked method’s performance on the challenging NWPU45 data set, KDE-Net demonstrated a noticeable 0.4% increase in overall accuracy with a significant 88% reduction in parameters. Meanwhile, this study’s reformed KD framework significantly enhances the knowledge transfer speed by at least three times.

Originality/value

This study illustrates that the logit-based KD technique can effectively develop lightweight CNN classifiers for RSI classification without substantial sacrifices in computation and storage costs. Compared to neural architecture search or other methods aiming to provide lightweight solutions, this study’s KDE-Net, based on the inherent characteristics of RSIs, is currently more efficient in constructing accurate yet lightweight classifiers for RSI classification.

Details

International Journal of Web Information Systems, vol. 20 no. 2
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 19 October 2023

Huaxiang Song

Abstract

Purpose

Classification of remote sensing images (RSI) is a challenging task in computer vision. Recently, researchers have proposed a variety of creative methods for automatic recognition of RSI, and feature fusion is a research hotspot for its great potential to boost performance. However, RSI has unique imaging conditions and cluttered scenes with complicated backgrounds. This large difference from natural images has left previous feature fusion methods with insignificant performance improvements.

Design/methodology/approach

This work proposed a two-convolutional neural network (CNN) fusion method named main and branch CNN fusion network (MBC-Net) as an improved solution for classifying RSI. In detail, the MBC-Net employs an EfficientNet-B3 as its main CNN stream and an EfficientNet-B0 as a branch, named MC-B3 and BC-B0, respectively. In particular, MBC-Net includes a long-range derivation (LRD) module, which is specially designed to learn the dependence of different features. Meanwhile, MBC-Net also uses some unique ideas to tackle the problems coming from the two-CNN fusion and the inherent nature of RSI.

Findings

Extensive experiments on three RSI sets prove that MBC-Net outperforms the other 38 state-of-the-art (STOA) methods published from 2020 to 2023, with a noticeable increase in overall accuracy (OA) values. MBC-Net not only presents a 0.7% increased OA value on the most confusing NWPU set but also has 62% fewer parameters compared to the leading approach that ranks first in the literature.

Originality/value

MBC-Net is a more effective and efficient feature fusion approach compared to other STOA methods in the literature. Visualizations with gradient-weighted class activation mapping (Grad-CAM) reveal that MBC-Net can learn the long-range dependence of features that a single CNN cannot. The t-distributed stochastic neighbor embedding (t-SNE) results demonstrate that the feature representation of MBC-Net is more effective than that of other methods. In addition, the ablation tests indicate that MBC-Net is effective and efficient for fusing features from two CNNs.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 17 no. 1
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 18 August 2023

Gaurav Sarin, Pradeep Kumar and M. Mukund

Abstract

Purpose

Text classification is a widely accepted and adopted technique in organizations to mine and analyze unstructured and semi-structured data. With the advancement of computing technology, deep learning has become more popular among academicians and professionals for mining and analytical operations. In this work, the authors study the research carried out in the field of text classification using deep learning techniques to identify gaps and opportunities for research.

Design/methodology/approach

The authors adopted a bibliometric approach in conjunction with visualization techniques to uncover new insights and findings. The authors collected two decades of data from the Scopus global database to perform this study. The authors discuss business applications of deep learning techniques for text classification.

Findings

The study provides an overview of various publication sources in the fields of text classification and deep learning together. The study also presents a list of prominent authors working in this field and their countries. The authors also present a list of the most cited articles based on citations and country of research. Various visualization techniques such as word clouds, network diagrams and thematic maps were used to identify the collaboration network.

Originality/value

The study performed in this paper helps to understand research gaps, which is an original contribution to the body of literature. To the best of the authors' knowledge, an in-depth study in the field of text classification and deep learning has not previously been performed. The study provides high value to scholars and professionals by identifying research opportunities in this area.

Details

Benchmarking: An International Journal, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1463-5771

Article
Publication date: 1 May 2020

Qihang Wu, Daifeng Li, Lu Huang and Biyun Ye

Abstract

Purpose

Entity relation extraction is an important research direction for obtaining structured information. However, most current methods determine the relations between entities in a given sentence stepwise, seldom considering entities and relations in a unified framework. The joint learning method is an optimal solution that combines relations and entities. This paper aims to optimize the hierarchical reinforcement learning framework and provide an efficient model for extracting entity relations.

Design/methodology/approach

This paper is based on the hierarchical reinforcement learning framework of joint learning and combines the model with BERT, the best language representation model, to optimize the word embedding and encoding process. Besides, this paper adjusts some punctuation marks to make the data set more standardized, and introduces positional information to improve the performance of the model.

Findings

Experiments show that the model proposed in this paper outperforms the baseline model with a 13% improvement and achieves an F1 score of 0.742 on the NYT10 data set. This model can effectively extract entities and relations in large-scale unstructured text and can be applied to the fields of multi-domain information retrieval, intelligent understanding and intelligent interaction.

Originality/value

The research provides an efficient solution for researchers in different domains to make use of artificial intelligence (AI) technologies to process their unstructured text more accurately.

Details

Information Discovery and Delivery, vol. 48 no. 3
Type: Research Article
ISSN: 2398-6247

Article
Publication date: 28 April 2023

Xiaohua Shi, Chen Hao, Ding Yue and Hongtao Lu

Abstract

Purpose

Traditional library book recommendation methods are mainly based on association rules and user profiles. They may help to learn about students' interest in different types of books, e.g., students majoring in science and engineering tend to pay more attention to computer books. Nevertheless, most of them still cannot identify users' interests accurately. To solve the problem, the authors propose a novel embedding-driven model called InFo, which refers to users' intrinsic interests and academic preferences to provide personalized library book recommendations.

Design/methodology/approach

The authors analyze the characteristics and challenges of real library book recommendation and then propose a method that considers feature interactions. Specifically, the authors leverage an attention unit to extract students' preferences for different categories of books from their borrowing history, after which the authors feed the unit's output into the Factorization Machine together with other context-aware features to learn students' hybrid interests. The authors employ a convolutional neural network to extract high-order correlations among feature maps, which are obtained by the outer product between feature embeddings.
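
To illustrate the two building blocks named in this abstract, the sketch below computes the second-order interaction term of a generic Factorization Machine, both naively and with the usual linear-time identity, and forms an interaction map as the outer product of two embeddings. It is not the InFo model itself. The function names and the toy embeddings are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def fm_pairwise_naive(x, v):
    """Sum over all pairs i < j of <v_i, v_j> * x_i * x_j."""
    n = len(x)
    return sum(
        dot(v[i], v[j]) * x[i] * x[j]
        for i in range(n)
        for j in range(i + 1, n)
    )

def fm_pairwise_fast(x, v):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    total = 0.0
    for f in range(len(v[0])):
        terms = [v[i][f] * x[i] for i in range(len(x))]
        total += sum(terms) ** 2 - sum(t * t for t in terms)
    return 0.5 * total

def outer(u, w):
    """Outer product of two embeddings, giving a 2D interaction map."""
    return [[a * b for b in w] for a in u]

x = [1.0, 0.0, 2.0]                           # feature values
v = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]      # one embedding per feature
naive = fm_pairwise_naive(x, v)
fast = fm_pairwise_fast(x, v)
interaction_map = outer(v[0], v[2])
```

The outer product turns a pair of embeddings into a small 2D map, which is what makes a convolutional layer applicable to feature interactions.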

Findings

The authors evaluate the model by conducting experiments on a real-world dataset in one university. The results show that the model outperforms other state-of-the-art methods in terms of two metrics called Recall and NDCG.
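
The two metrics mentioned here are commonly computed over the top k recommendations. The sketch below uses the standard binary-relevance definitions. The cutoff `k`, the function names and the toy data are assumptions, since the abstract does not give the evaluation protocol.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of relevant items that appear in the top-k ranked list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: discounted gain divided by the ideal discounted gain."""
    dcg = sum(
        1.0 / math.log2(position + 2)
        for position, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal = sum(1.0 / math.log2(position + 2) for position in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0

ranked_books = ["a", "b", "c", "d"]   # hypothetical recommendation order
borrowed = {"a", "c"}                 # hypothetical ground-truth borrowings
recall = recall_at_k(ranked_books, borrowed, k=3)
ndcg = ndcg_at_k(ranked_books, borrowed, k=3)
```

Recall only counts whether relevant books appear in the top k, while NDCG also rewards placing them nearer the top of the list.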

Research limitations/implications

It requires a specific data size to prevent overfitting during model training, and the proposed method may face the user/item cold-start challenge.

Practical implications

The embedding-driven book recommendation model could be applied in real libraries to provide valuable recommendations based on readers' preferences.

Originality/value

The proposed method is a practical embedding-driven model that accurately captures diverse user preferences.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 4 June 2020

Moruf Akin Adebowale, Khin T. Lwin and M. A. Hossain

Abstract

Purpose

Phishing attacks have evolved in recent years due to high-tech-enabled economic growth worldwide. The rise in all types of fraud loss in 2019 has been attributed to the increase in deception scams and impersonation, as well as to sophisticated online attacks such as phishing. The global impact of phishing attacks will continue to intensify, and thus, a more efficient phishing detection method is required to protect online user activities. To address this need, this study focussed on the design and development of a deep learning-based phishing detection solution that leveraged the uniform resource locator (URL) and website content such as images, text and frames.

Design/methodology/approach

Deep learning techniques are efficient for natural language and image classification. In this study, the convolutional neural network (CNN) and the long short-term memory (LSTM) algorithm were used to build a hybrid classification model named the intelligent phishing detection system (IPDS). To build the proposed model, the CNN and LSTM classifiers were trained using 1 million uniform resource locators and over 10,000 images. Then, the sensitivity of the proposed model was determined by considering various factors such as the type of feature, number of misclassifications and split issues.

Findings

An extensive experimental analysis was conducted to evaluate and compare the effectiveness of the IPDS in detecting phishing web pages and phishing attacks when applied to large data sets. The results showed that the model achieved an accuracy rate of 93.28% and an average detection time of 25 s.

Originality/value

This research used a hybrid deep learning approach combining the CNN and LSTM methods. The combination of CNN and LSTM was used to handle a large data set and achieve higher classifier prediction performance. Combining the two methods leads to a better result with less training time for the LSTM and CNN architecture, while using image, frame and text features as hybrid inputs for detection. To the best of the authors' knowledge, the hybrid features and the IPDS classifier for phishing detection are the novelty of this study.

Details

Journal of Enterprise Information Management, vol. 36 no. 3
Type: Research Article
ISSN: 1741-0398

Article
Publication date: 8 June 2020

Ming Li, Ying Li, YingCheng Xu and Li Wang

Abstract

Purpose

In community question answering (CQA), people who answer questions assume readers have mastered the content in the answers. Nevertheless, some readers cannot understand all of the content. Thus, there is a need for further explanation of the concepts that appear in the answers. Moreover, the large number of question and answer (Q&A) documents makes manual retrieval difficult. This paper aims to alleviate these issues for CQA websites.

Design/methodology/approach

In the paper, an algorithm for recommending explanatory Q&A documents is proposed. Q&A documents are modeled with the biterm topic model (BTM) (Yan et al., 2013). Then, the growing neural gas (GNG) algorithm (Fritzke, 1995) is used to cluster Q&A documents. To train multiple classifiers, three features are extracted from the Q&A categories. Thereafter, an ensemble classification model is constructed to identify the explanatory relationships. Finally, the explanatory Q&A documents are recommended.

Findings

The GNG algorithm shows good clustering performance. The ensemble classification model performs better than other classifiers. Both the effect and quality scores of explanatory Q&A recommendations are high. These scores indicate the practicality and good performance of the proposed recommendation algorithm.

Research limitations/implications

The proposed algorithm alleviates information overload in CQA from the new perspective of recommending explanatory knowledge. It provides new insight into research on recommendations in CQA. Moreover, in practice, CQA websites can use it to help retrieve Q&A documents and facilitate understanding of their contents. However, the algorithm is for the general recommendation of Q&A documents which does not consider individual personalized characteristics. In future work, personalized recommendations will be evaluated.

Originality/value

A novel explanatory Q&A recommendation algorithm is proposed for CQA to alleviate the burden of manual retrieval and Q&A overload. The novel GNG clustering algorithm and ensemble classification model provide a more accurate way to identify explanatory Q&A documents. The method of ranking the explanatory Q&A documents improves the effectiveness and quality of the recommendation. The proposed algorithm improves the accuracy and efficiency of retrieving explanatory Q&A documents. It assists users in grasping answers easily.

Details

Data Technologies and Applications, vol. 54 no. 4
Type: Research Article
ISSN: 2514-9288
