Search results
Haonan Fan, Qin Dong and Naixuan Guo
Abstract
Purpose
This paper aims to propose a classification method for steel strip surface defects based on a mixed attention mechanism to achieve fast and accurate classification performance. Traditional methods of classifying surface defects on hot-rolled steel strips suffer from low recognition accuracy and low efficiency in complex industrial production environments.
Design/methodology/approach
The authors used a min–max scaling comparison method to screen the training results of multiple network models on the steel strip surface defect data set. The model with the best overall performance, EfficientNet-B0, was then refined. On this basis, the authors proposed two ways of adding mixed attention: the squeeze-excitation spatial mixed module and the multilayer mixed attention mechanism (MMAM) module.
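A min–max scaling comparison of candidate networks can be sketched as follows. This is a minimal Python illustration, not the authors' procedure: the model names, the metric values and the equal-weight sum of scaled scores are all hypothetical assumptions made for the example.

```python
def min_max_scale(values, higher_is_better=True):
    """Scale values to [0, 1]; flip the scale when a smaller raw value is better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_models(results, directions):
    """Sum the scaled score of every metric per model and sort best first."""
    names = list(results)
    totals = {name: 0.0 for name in names}
    for metric, higher_is_better in directions.items():
        column = [results[name][metric] for name in names]
        for name, score in zip(names, min_max_scale(column, higher_is_better)):
            totals[name] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical numbers for illustration only (not from the paper).
results = {
    "model_a": {"accuracy": 0.95, "params_m": 25.0, "latency_ms": 12.0},
    "model_b": {"accuracy": 0.96, "params_m": 5.0, "latency_ms": 8.0},
    "model_c": {"accuracy": 0.93, "params_m": 3.0, "latency_ms": 6.0},
}
# True means a larger value is better; False means a smaller value is better.
directions = {"accuracy": True, "params_m": False, "latency_ms": False}
ranking = rank_models(results, directions)
```

With these made-up numbers, `ranking` lists `model_b` first, because it combines the highest accuracy with a comparatively small size and low latency.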
Findings
With these two methods, the authors achieved recognition accuracies of 96.72% and 97.70%, respectively, on the steel strip data set after data augmentation intended to adapt it to the complex production environment. With transfer learning, the MMAM-based EfficientNet-B0 reached 100% recognition accuracy.
Originality/value
This study not only focuses on improving the recognition accuracy of the network model itself but also considers other performance indicators of the network, which many researchers rarely consider. By addressing this gap, the authors further improve intelligent production techniques. Both methods proposed in this paper can be applied to embedded equipment, which can effectively improve production efficiency in steel strip factories and reduce material and time losses.
Qiang Zhang, Zijian Ye, Siyu Shao, Tianlin Niu and Yuwei Zhao
Abstract
Purpose
Current studies on remaining useful life (RUL) prediction mainly rely on convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) and do not take full advantage of the attention mechanism, which limits prediction accuracy. To further improve the performance of these models, this study proposes a novel end-to-end RUL prediction framework, called the convolutional recurrent attention network (CRAN), to achieve high accuracy.
Design/methodology/approach
The proposed CRAN is a CNN-LSTM-based model that effectively combines the powerful feature extraction ability of the CNN and the sequential processing capability of the LSTM. The channel attention mechanism, spatial attention mechanism and LSTM attention mechanism are incorporated in CRAN, assigning different attention coefficients to the CNN and LSTM. First, features of the bearing vibration data are extracted from both the time and frequency domains. Next, the training and testing sets are constructed. Then, the CRAN is trained offline using the training set. Finally, online RUL estimation is performed by applying data from the testing set to the trained CRAN.
Findings
CNN-LSTM-based models have higher RUL prediction accuracy than CNN-based and LSTM-based models. Using a combination of max pooling and average pooling can reduce the loss of feature information, and in addition, the structure of the serial attention mechanism is superior to the parallel attention structure. Comparing the proposed CRAN with six different state-of-the-art methods, for the predicted results of two testing bearings, the proposed CRAN has an average reduction in the root mean square error of 57.07/80.25%, an average reduction in the mean absolute error of 62.27/85.87% and an average improvement in score of 12.65/6.57%.
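The error reductions quoted above are relative changes in RMSE and MAE. A minimal sketch of how such figures are computed, assuming the usual definitions of both metrics; the RUL values below are hypothetical and are not taken from the paper:

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def percent_reduction(baseline, proposed):
    """Relative reduction of an error metric, in percent of the baseline value."""
    return 100.0 * (baseline - proposed) / baseline


# Hypothetical normalized RUL values for illustration only.
true_rul = [1.0, 0.8, 0.6, 0.4, 0.2]
baseline_pred = [0.8, 0.6, 0.8, 0.2, 0.4]
proposed_pred = [0.9, 0.8, 0.7, 0.4, 0.2]

rmse_reduction = percent_reduction(rmse(true_rul, baseline_pred),
                                   rmse(true_rul, proposed_pred))
mae_reduction = percent_reduction(mae(true_rul, baseline_pred),
                                  mae(true_rul, proposed_pred))
```

The reported averages would then be the mean of such per-method reductions over the compared baselines, computed separately for each testing bearing.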
Originality/value
This article provides a novel end-to-end rolling bearing RUL prediction framework, which can provide a reference for the formulation of bearing maintenance programs in the industry.
Bin Wang, Fanghong Gao, Le Tong, Qian Zhang and Sulei Zhu
Abstract
Purpose
Traffic flow prediction has always been a top priority of intelligent transportation systems. There are many mature methods for short-term traffic flow prediction. However, the existing methods are often insufficient in capturing long-term spatial-temporal dependencies. To capture long-term dependencies more accurately, this paper proposes a new and more effective traffic flow prediction model.
Design/methodology/approach
This paper proposes a new and more effective traffic flow prediction model, named channel attention-based spatial-temporal graph neural networks. A graph convolutional network is used to extract local spatial-temporal correlations, a channel attention mechanism is used to enhance the influence of nearby spatial-temporal dependencies on decision-making and a transformer mechanism is used to capture long-term dependencies.
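To make the graph convolution step concrete, here is a minimal pure-Python sketch of one graph convolutional layer using the widely used symmetric normalization of the adjacency matrix with self-loops. It is an assumption that the paper's layer takes this form; the three-sensor road graph, the speed readings and the identity weight are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, where D is the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weight):
    """One layer: aggregate neighbor features, apply the weights, then ReLU."""
    propagated = matmul(normalized_adjacency(adj), features)
    transformed = matmul(propagated, weight)
    return [[max(0.0, v) for v in row] for row in transformed]


# Hypothetical road graph: three sensors in a line (0 - 1 - 2).
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
speeds = [[60.0], [30.0], [50.0]]   # one feature per sensor
weight = [[1.0]]                    # identity weight, for readability only
out = gcn_layer(adj, speeds, weight)
```

Each output row mixes a sensor's own reading with those of its direct neighbors, which is why this kind of layer captures only local spatial correlations and is paired with other mechanisms for long-range dependencies.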
Findings
The proposed model is applied to two common highway datasets: METR-LA, collected in Los Angeles, and PEMS-BAY, collected in the California Bay Area. The model outperforms five other popular models on three performance metrics.
Originality/value
(1) Based on the spatial-temporal synchronization graph convolution module, a spatial-temporal channel attention module is designed to increase the influence of proximity dependence on decision-making by enhancing or suppressing different channels. (2) To better capture long-term dependencies, the transformer module is introduced.
Miao Tian, Ying Cui, Haixia Long and Junxia Li
Abstract
Purpose
In novelty detection, the autoencoder-based image reconstruction strategy is one of the mainstream solutions. The basic idea is that once the autoencoder is trained on normal data, it has a low reconstruction error on normal data. However, when faced with complex natural images, conventional pixel-level reconstruction degrades and does not deliver promising results. This paper aims to provide a new method for improving the performance of autoencoder-based novelty detection.
Design/methodology/approach
To solve the problem that conventional pixel-level reconstruction cannot effectively extract the global semantic information of the image, a novel model combining an attention mechanism with self-supervised learning is proposed. First, an auxiliary task, reconstructing rotated images, is set to enable the network to learn global semantic feature information. Then, the channel attention mechanism is introduced to perform adaptive feature refinement on the intermediate feature map to optimize the correspondingly passed feature map.
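The rotation auxiliary task and reconstruction-error scoring can be illustrated with a small pure-Python sketch. It follows one plausible reading of the abstract, in which the network receives a rotated copy and is asked to reconstruct the upright original; that reading, the mean squared error score and the threshold are assumptions, and no autoencoder is actually trained here.

```python
def rotate90(image):
    """Rotate a 2-D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotation_pairs(image):
    """Build (rotated input, upright target) pairs for 0, 90, 180 and 270 degrees."""
    pairs = []
    rotated = [list(row) for row in image]
    for _ in range(4):
        pairs.append((rotated, image))
        rotated = rotate90(rotated)
    return pairs


def reconstruction_error(image, reconstruction):
    """Mean squared pixel error between an image and its reconstruction."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(image, reconstruction)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def is_novel(image, reconstruction, threshold):
    """Flag a sample as novel when its reconstruction error exceeds the threshold."""
    return reconstruction_error(image, reconstruction) > threshold


# Tiny hypothetical "image" used only to show the data flow.
image = [[1, 2], [3, 4]]
pairs = rotation_pairs(image)
```

In a real system the second argument of `reconstruction_error` would be the autoencoder's output, and the threshold would be chosen on held-out normal data.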
Findings
Experimental results on three public data sets show that the proposed method has potential for novelty detection.
Originality/value
This study explores the ability of self-supervised learning methods and attention mechanism to extract features on a single class of images. In this way, the performance of novelty detection can be improved.
Oladosu Oyebisi Oladimeji and Ayodeji Olusegun J. Ibitoye
Abstract
Purpose
Diagnosing brain tumors is a process that demands a significant amount of time and is heavily dependent on the proficiency and accumulated knowledge of radiologists. Compared with traditional methods, deep learning approaches have gained popularity in automating the diagnosis of brain tumors, offering the potential for more accurate and efficient results. Notably, attention-based models have emerged as an advanced approach that dynamically refines and amplifies model features to further elevate diagnostic capabilities. However, the specific impact of using the channel, spatial or combined attention methods of the convolutional block attention module (CBAM) for brain tumor classification has not been fully investigated.
Design/methodology/approach
To selectively emphasize relevant features while suppressing noise, ResNet50 coupled with the CBAM (ResNet50-CBAM) was used for the classification of brain tumors in this research.
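The channel half of CBAM can be sketched in a few lines of pure Python: each channel is summarized by average pooling and max pooling, both summaries pass through a shared two-layer perceptron, and the summed outputs go through a sigmoid to give one weight per channel. The feature map and the weights `w1` and `w2` below are hypothetical and untrained; the spatial attention branch and the ResNet50 backbone are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer perceptron with ReLU, shared by both pooled descriptors."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feature_map, w1, w2):
    """feature_map is a list of channels, each a 2-D list (H x W)."""
    flat = [[v for row in channel for v in row] for channel in feature_map]
    avg_desc = [sum(channel) / len(channel) for channel in flat]
    max_desc = [max(channel) for channel in flat]
    logits = [a + m for a, m in zip(shared_mlp(avg_desc, w1, w2),
                                    shared_mlp(max_desc, w1, w2))]
    weights = [sigmoid(z) for z in logits]
    # Rescale every value of a channel by that channel's attention weight.
    scaled = [[[v * w for v in row] for row in channel]
              for channel, w in zip(feature_map, weights)]
    return weights, scaled


# Hypothetical 2-channel, 2x2 feature map and untrained weights.
feature_map = [[[1.0, 2.0], [3.0, 4.0]],
               [[0.0, 0.5], [0.5, 1.0]]]
w1 = [[0.5, -0.25]]      # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]     # 1 hidden unit -> 2 channels
weights, scaled = channel_attention(feature_map, w1, w2)
```

Here the first channel receives a weight close to 1 and the second a weight close to 0, so the second channel is suppressed in `scaled`.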
Findings
The ResNet50-CBAM outperformed existing deep learning classification methods such as the convolutional neural network (CNN). It achieved 99.43%, 99.01%, 98.7% and 99.25% in accuracy, recall, precision and AUC, respectively, when compared with existing classification methods on the same dataset.
Practical implications
Because the ResNet50-CBAM fusion can capture spatial context while enhancing feature representation, it can be integrated into brain tumor classification software platforms to support physicians' clinical decision-making and improve brain tumor classification.
Originality/value
This research has not been published anywhere else.
Abstract
Purpose
Automatic segmentation of brain tumors from medical images is a challenging task because of the tumors' uneven and irregular shapes. In this paper, the authors propose an attention-based nested segmentation network, named DAU-Net. Two types of attention mechanisms are introduced to make the U-Net network focus on the key feature regions. The proposed network has a deep supervised encoder–decoder architecture and a redesigned dense skip connection. DAU-Net introduces an attention mechanism between convolutional blocks so that the features extracted at different levels can be merged with a task-related selection.
Design/methodology/approach
In the coding layer, the authors designed a channel attention module, which marks the importance of each feature map in the segmentation task. In the decoding layer, the authors designed a spatial attention module, which marks the importance of different regional features. By fusing features at different scales in the same coding layer, the network can fully extract the detailed information of the original image and learn more tumor boundary information.
Findings
To verify the effectiveness of the DAU-Net, experiments were carried out on the BRATS 2018 brain tumor magnetic resonance imaging (MRI) database. The segmentation results show that the proposed method has a high accuracy, with a Dice similarity coefficient (DSC) of 89% in the complete tumor, an improvement of 8.04% and 4.02% over the fully convolutional network (FCN) and U-Net, respectively.
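For reference, the Dice similarity coefficient of two binary masks is twice the size of their overlap divided by the sum of their sizes. A minimal sketch, with hypothetical 2x3 masks and the convention (an assumption here) that two empty masks score 1.0:

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient of two binary masks given as 2-D lists."""
    pred = [v for row in pred_mask for v in row]
    true = [v for row in true_mask for v in row]
    intersection = sum(1 for p, t in zip(pred, true) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in true if t)
    if total == 0:
        return 1.0  # both masks empty: treated as a perfect match
    return 2.0 * intersection / total


# Hypothetical masks: 1 marks a tumor pixel, 0 marks background.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
dsc = dice_coefficient(predicted, ground_truth)
```

The two masks share two tumor pixels out of three each, so `dsc` is 4/6, or about 0.67.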
Originality/value
The experimental results show that the proposed method has good performance in the segmentation of brain tumors. The proposed method has potential clinical applicability.
Rui Wang, Shunjie Zhang, Shengqiang Liu, Weidong Liu and Ao Ding
Abstract
Purpose
The purpose is to use a generative adversarial network (GAN) to solve the problem of sample augmentation for imbalanced bearing fault data sets, and to use an improved residual network to raise the diagnostic accuracy of the intelligent bearing fault diagnosis model in environments with high signal noise.
Design/methodology/approach
A bearing vibration data generation model based on conditional GAN (CGAN) framework is proposed. The method generates data based on the adversarial mechanism of GANs and uses a small number of real samples to generate data, thereby effectively expanding imbalanced data sets. Combined with the data augmentation method based on CGAN, a fault diagnosis model of rolling bearing under the condition of data imbalance based on CGAN and improved residual network with attention mechanism is proposed.
Findings
The proposed method is verified on the Western Reserve data set and the truck bearing test bench data set. The results show that the CGAN-based data generation method can form a high-quality augmented data set, while the diagnostic model based on CGAN and the improved residual network with attention mechanism achieves better diagnostic accuracy on samples with a low signal-to-noise ratio.
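Low signal-to-noise test samples are commonly produced by adding white Gaussian noise to clean vibration signals at a chosen SNR. The sketch below shows that standard recipe; whether the paper generated its noisy samples this way is an assumption, and the sine-wave "signal", the seed and the -4 dB target are hypothetical.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the requested SNR in dB."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the example is repeatable
# Hypothetical stand-in for a vibration signal: a 50 Hz sine sampled at 10 kHz.
clean = [math.sin(2.0 * math.pi * 50.0 * t / 10000.0) for t in range(20000)]
noisy = add_noise_at_snr(clean, snr_db=-4.0, rng=rng)
```

At negative SNR values such as this one, the noise carries more power than the signal, which is the regime where the abstract reports the model's advantage.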
Originality/value
A bearing vibration data generation model based on CGAN framework is proposed. The method generates data based on the adversarial mechanism of GAN and uses a small number of real samples to generate data, thereby effectively expanding imbalanced data sets. Combined with the data augmentation method based on CGAN, a fault diagnosis model of rolling bearing under the condition of data imbalance based on CGAN and improved residual network with attention mechanism is proposed.
Hui Xu, Junjie Zhang, Hui Sun, Miao Qi and Jun Kong
Abstract
Purpose
Attention is one of the most important factors affecting the academic performance of students. Effectively analyzing students' attention in class can promote teachers' precise teaching and students' personalized learning. To intelligently analyze students' attention in the classroom from the first-person perspective, this paper proposes a fusion model based on gaze tracking and object detection. In particular, the proposed attention analysis model does not depend on any smart equipment.
Design/methodology/approach
Given a first-person view video of students' learning, the authors first estimate the gazing point by using the deep space–time neural network. Second, single shot multi-box detector and fast segmentation convolutional neural network are comparatively adopted to accurately detect the objects in the video. Third, they predict the gazing objects by combining the results of gazing point estimation and object detection. Finally, the personalized attention of students is analyzed based on the predicted gazing objects and the measurable eye movement criteria.
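The third step, combining the estimated gazing point with the detected objects, can be sketched as a point-in-box lookup. The tie-breaking rule used here (prefer the smallest box that contains the point), the detection format and the labels are hypothetical choices for illustration, not the authors' fusion rule.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def contains(box, point):
    """True when the point lies inside or on the border of the box."""
    x1, y1, x2, y2 = box
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def gazed_object(gaze_point, detections):
    """Return the label of the smallest detected box containing the gaze point."""
    hits = [d for d in detections if contains(d["box"], gaze_point)]
    if not hits:
        return None
    return min(hits, key=lambda d: box_area(d["box"]))["label"]


# Hypothetical detections for one video frame, in pixel coordinates.
detections = [
    {"label": "blackboard", "box": (0, 0, 640, 300)},
    {"label": "teacher", "box": (250, 80, 350, 300)},
    {"label": "notebook", "box": (200, 350, 450, 480)},
]
label = gazed_object((300, 150), detections)
```

Running this per frame yields a sequence of gazed objects, from which measures such as dwell time per object could then be derived.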
Findings
A large number of experiments are carried out on a public database and on a new dataset built in a real classroom. The experimental results show that the proposed model can not only accurately track the students' gazing trajectory and effectively analyze the fluctuation of attention of individual students and of all students but also provide a valuable reference for evaluating students' learning process.
Originality/value
The contributions of this paper can be summarized as follows. The analysis of students' attention plays an important role in improving teaching quality and student achievement, yet there is little research on how to analyze it automatically and intelligently. To alleviate this problem, this paper focuses on analyzing students' attention by gaze tracking and object detection in classroom teaching, which is significant for practical application in the field of education. (1) The authors propose an effective intelligent fusion model based on a deep neural network, which mainly includes a gazing point module and an object detection module, to analyze students' attention in classroom teaching without relying on any smart wearable device. (2) They introduce an attention mechanism into the gazing point module to improve the performance of gazing point detection, and run comparison experiments on the public dataset to show that the gazing point module achieves better performance. (3) They associate eye movement criteria with visual gaze to obtain quantifiable, objective data for analyzing students' attention, which provides a valuable basis for evaluating students' learning process, gives parents and teachers useful information about students' learning and supports the development of individualized teaching. (4) They build a new database that contains first-person view videos of 11 subjects in a real classroom and use it to evaluate the effectiveness and feasibility of the proposed model.
Zishuo Han, Chunping Wang and Qiang Fu
Abstract
Purpose
The purpose of this paper is to use the most popular deep learning algorithm to perform vehicle detection in urban areas of MiniSAR images and to provide a reliable means of ground monitoring.
Design/methodology/approach
An accurate detector called the rotation region-based convolution neural networks (CNN) with multilayer fusion and multidimensional attention (M2R-Net) is proposed in this paper. Specifically, M2R-Net adopts the multilayer feature fusion strategy to extract feature maps with more extensive information. Next, the authors implement the multidimensional attention network to highlight target areas. Furthermore, a novel balanced sampling strategy for hard and easy positive-negative samples and a global balanced loss function are applied to deal with spatial imbalance and objective imbalance. Finally, rotation anchors are used to predict and calibrate the minimum circumscribed rectangle of vehicles.
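A rotated box is often described by its center, size and angle, from which the four corners of the oriented rectangle follow by a 2-D rotation. The sketch below assumes that (cx, cy, width, height, angle) parameterization, which is common in rotated detectors but is not confirmed by the abstract; the numbers are hypothetical.

```python
import math


def rotated_box_corners(cx, cy, width, height, angle_deg):
    """Corners of a rectangle centered at (cx, cy), rotated counterclockwise by angle_deg."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2.0, height / 2.0
    # Corner offsets of the unrotated box, relative to its center.
    offsets = [(-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h)]
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in offsets]


# Hypothetical vehicle-sized box: 4 units long, 2 units wide, rotated by 90 degrees.
corners = rotated_box_corners(cx=0.0, cy=0.0, width=4.0, height=2.0, angle_deg=90.0)
```

Compared with an axis-aligned box, such a rectangle hugs an obliquely parked vehicle more tightly and includes less background.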
Findings
By analyzing many groups of experiments, the validity and universality of the proposed model are verified. More importantly, comparisons with SSD, LRTDet, RFCN, DFPN, CMF-RCNN, R3Det and SCRDet demonstrate that M2R-Net has state-of-the-art detection performance.
Research limitations/implications
The progress in the field of MiniSAR application has been slow due to strong speckle noise, phase error, complex environments and a low signal-to-noise ratio. In addition, four kinds of imbalances, i.e. spatial imbalance, scale imbalance, class imbalance and objective imbalance, in object detection based on the CNN greatly inhibit the optimization of detection performance.
Originality/value
This research can not only enrich the means of daily traffic monitoring but also be used for enemy intelligence reconnaissance in wartime.
Guoyang Wan, Yaocong Hu, Bingyou Liu, Shoujun Bai, Kaisheng Xing and Xiuwen Tao
Abstract
Purpose
Presently, six-degree-of-freedom (6DOF) visual pose measurement methods enjoy popularity in the industrial sector. However, challenges persist in accurately measuring the visual pose of blank and rough metal casts. Therefore, this paper introduces a stereo-vision-based 6DOF pose measurement method aimed at blank and rough metal casts.
Design/methodology/approach
This paper studies the 6DOF pose measurement of metal casts from three aspects: sample enhancement of industrial objects, optimization of the detector and the attention mechanism. Virtual reality technology is used for sample enhancement of metal casts, which solves the problem of large-scale sample collection in industrial applications. The method also includes a novel deep learning detector that uses multiple key points on the object surface as regression objects to detect industrial objects with rotation characteristics. By introducing a mixed paths attention module, the detection accuracy of the detector and the convergence speed of the training are improved.
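Since the method relies on stereo vision and on key points detected on the object surface, one building block is recovering the 3-D position of a key point seen in both cameras. The sketch below uses the textbook relation for a rectified pinhole stereo pair, depth = focal length x baseline / disparity. That the paper's setup is rectified, and all camera parameters and pixel coordinates shown, are assumptions for illustration.

```python
def triangulate(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """3-D point (x, y, z) in the left camera frame from a rectified stereo match.

    u/v are pixel coordinates, focal_px is the focal length in pixels,
    baseline_m is the camera separation in meters, (cx, cy) is the principal point.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity
    x = (u_left - cx) * z / focal_px
    y = (v_left - cy) * z / focal_px
    return (x, y, z)


# Hypothetical key point matched in both images.
point = triangulate(u_left=420.0, v_left=240.0, u_right=380.0,
                    focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
```

With several such 3-D key points and their known positions on the object model, a rigid transform (rotation plus translation) could then be fitted to obtain the full 6DOF pose.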
Findings
The experimental results show that the proposed method has a better detection effect for metal casts with smaller size scaling and rotation characteristics.
Originality/value
A method for 6DOF pose measurement of industrial objects is proposed, which realizes the pose measurement and grasping of metal blanks and rough machined casts by industrial robots.