Search results
1 – 10 of 98
Abstract
Purpose
This paper aims to present two different methods to speed up a test used in the sanitary ware industry that requires counting the number of granules that remain in the commodity after flushing. The test requires that 2,500 granules are added to the lavatory and that fewer than 125 remain.
Design/methodology/approach
The problem is approached using two deep learning computer vision (CV) models. The first model is a Vision Transformer (ViT) classification approach, and the second is a U-Net paired with a connected components algorithm. Both models are trained and evaluated on a proprietary data set of 3,518 labeled images, and their performance is compared.
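The counting step that follows segmentation can be illustrated with a minimal sketch. It assumes the U-Net output has already been thresholded into a binary mask and that each granule forms one 4-connected blob; the function names, the toy mask and the choice of 4-connectivity are illustrative assumptions, not details taken from the paper.

```python
from collections import deque


def count_components(mask):
    """Count 4-connected groups of truthy cells in a 2D list (assumed binary mask)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New blob found: flood-fill it so it is counted only once.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def passes_flush_test(remaining, limit=125):
    """The test passes when fewer than `limit` granules remain (limit from the abstract)."""
    return remaining < limit


# Hypothetical 4 x 5 mask standing in for a thresholded segmentation output.
mask = [
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
remaining = count_components(mask)
print(remaining, passes_flush_test(remaining))
```

Touching or overlapping granules would merge into one blob under this scheme, so a production pipeline would likely need extra handling that this sketch does not attempt.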
Findings
Both algorithms were found to produce competitive solutions. The U-Net algorithm achieves accuracy levels above 94% and the ViT model reaches accuracy levels above 97%. At this time, the U-Net algorithm is being piloted and the ViT pilot is at the planning stage.
Originality/value
To the best of the authors’ knowledge, this is the first approach using CV to solve the granules problem applying ViT. In addition, this work updates the U-Net-Connected components algorithm and compares the results of both algorithms.
Worapan Kusakunniran, Pairash Saiviroonporn, Thanongchai Siriapisith, Trongtum Tongdee, Amphai Uraiverotchanakorn, Suphawan Leesakul, Penpitcha Thongnarintr, Apichaya Kuama and Pakorn Yodprom
Abstract
Purpose
Cardiomegaly can be determined by the cardiothoracic ratio (CTR), which can be measured in a chest x-ray image. The ratio is calculated from the relationship between the size of the heart and the transverse dimension of the chest. Cardiomegaly is identified when the ratio is larger than a cut-off threshold. This paper aims to propose a solution that calculates the ratio in order to classify cardiomegaly in chest x-ray images.
Design/methodology/approach
The proposed method begins with constructing lung and heart segmentation models based on the U-Net architecture, using publicly available datasets with ground-truth heart and lung masks. The ratio is then calculated using the sizes of the segmented lung and heart areas. In addition, Progressive Growing of GANs (PGAN) is adopted to construct a new dataset containing chest x-ray images of three classes: male normal, female normal and cardiomegaly. This dataset is then used for evaluating the proposed solution. The proposed solution is also used to evaluate the quality of the chest x-ray images generated by PGAN.
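As a rough illustration of how a ratio could be derived from the two segmentation masks, the sketch below measures the widest horizontal extent of each mask and divides one by the other. Using the lung mask's extent as a stand-in for the transverse chest dimension is an assumption made here for illustration, as are the function names and the toy masks; the paper's exact measurement may differ.

```python
def horizontal_extent(mask):
    """Pixels between the leftmost and rightmost foreground columns, inclusive."""
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return max(cols) - min(cols) + 1 if cols else 0


def cardiothoracic_ratio(heart_mask, lung_mask):
    """Heart width over chest width; chest width is approximated by the lung mask."""
    chest_width = horizontal_extent(lung_mask)
    if chest_width == 0:
        raise ValueError("lung mask is empty, chest width is undefined")
    return horizontal_extent(heart_mask) / chest_width


def is_cardiomegaly(ctr, cutoff=0.50):
    """Flag cardiomegaly when the ratio exceeds the cut-off (0.50 is the standard value)."""
    return ctr > cutoff


# Hypothetical one-row masks: lungs span columns 0..9, heart spans columns 3..8.
lung_mask = [[1, 1, 1, 0, 0, 0, 0, 1, 1, 1]]
heart_mask = [[0, 0, 0, 1, 1, 1, 1, 1, 1, 0]]

ctr = cardiothoracic_ratio(heart_mask, lung_mask)
print(round(ctr, 2), is_cardiomegaly(ctr))
```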
Findings
In the experiments, the trained models are applied to segment the heart and lung regions in chest x-ray images from the self-collected dataset. The calculated CTR values are compared with values measured manually by human experts. The average error is 3.08%. The models are then applied to segment the heart and lung regions for the CTR calculation on the dataset generated by PGAN. Cardiomegaly is then determined using several different cut-off threshold values. With the standard cut-off at 0.50, the proposed method achieves 94.61% accuracy, 88.31% sensitivity and 94.20% specificity.
Originality/value
The proposed solution is demonstrated to be robust across unseen datasets for the segmentation, CTR calculation and cardiomegaly classification, including the dataset generated from PGAN. The cut-off value can be adjusted to be lower than 0.50 for increasing the sensitivity. For example, the sensitivity of 97.04% can be achieved at the cut-off of 0.45. However, the specificity is decreased from 94.20% to 79.78%.
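The trade-off between sensitivity and specificity as the cut-off moves can be seen in a small sketch. The CTR values and labels below are invented toy data, not results from the paper, and the helper names are hypothetical.

```python
def confusion_counts(ctr_values, labels, cutoff):
    """Return (tp, fp, tn, fn) when predicting positive for ctr > cutoff."""
    tp = fp = tn = fn = 0
    for ctr, is_positive in zip(ctr_values, labels):
        predicted_positive = ctr > cutoff
        if predicted_positive and is_positive:
            tp += 1
        elif predicted_positive and not is_positive:
            fp += 1
        elif not predicted_positive and is_positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Share of true cardiomegaly cases that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of normal cases that are correctly left unflagged."""
    return tn / (tn + fp)


# Toy data (invented): four normal cases followed by four cardiomegaly cases.
ctr_values = [0.40, 0.43, 0.46, 0.48, 0.47, 0.52, 0.55, 0.60]
labels = [False, False, False, False, True, True, True, True]

for cutoff in (0.50, 0.45):
    tp, fp, tn, fn = confusion_counts(ctr_values, labels, cutoff)
    print(cutoff, sensitivity(tp, fn), specificity(tn, fp))
```

On this toy data, lowering the cut-off from 0.50 to 0.45 catches the one missed positive case but flags two normal cases, the same direction of change the abstract reports.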
Sengathir Janakiraman, Deva Priya M., Christy Jeba Malar A., Karthick S. and Anitha Rajakumari P.
Abstract
Purpose
The purpose of this paper is to design an Internet-of-Things (IoT) architecture-based Diabetic Retinopathy Detection Scheme (DRDS) proposed for identifying Type-I or Type-II diabetes and to specifically advise the Type-II diabetic patients about the possibility of vision loss.
Design/methodology/approach
The proposed DRDS includes automatic calculation of the clip limit parameters and sub-window, which makes the detection process completely adaptive. It uses an extended 5 × 5 Sobel operator to estimate the maximum edges, determined through the convolution of 24 pixels with eight templates to obtain 24 outputs corresponding to individual pixels, from which the maximum magnitude is found. This enhances the probability of connecting pixels in the vascular map with their closely located neighbourhood points in the fundus images. Then, the spatial information and kernel of the neighbourhood pixels are integrated through the Robust Semi-supervised Kernelized Fuzzy Local information C-Means Clustering (RSKFL-CMC) method to attain a significant clustering process.
Findings
The results of the proposed DRDS architecture confirm its superiority in terms of accuracy, specificity and sensitivity. The proposed DRDS technique achieves, on average, 99.64% accuracy, 76.84% sensitivity and 99.93% specificity.
Research limitations/implications
DRDS is proposed as a comfortable, pain-free and harmless diagnosis system that uses Dexcom G4 Platinum sensors to estimate the blood glucose level of diabetic patients. It uses the RSKFL-CMC method to estimate the spatial information and kernel of the neighbourhood pixels to attain a significant clustering process.
Practical implications
The IoT architecture comprises an application layer that inherits the DR application-enabled Graphical User Interface (GUI), which is combined with MATLAB applications for processing fundus images. This layer helps patients store the captured fundus images in the database for future diagnosis.
Social implications
The proposed DRDS method plays a vital role in detecting DR and grading it by disease intensity as severe, moderate or mild. It helps prevent vision loss in Type-II diabetic patients through the accurate detection enabled by the IoT architecture.
Originality/value
The proposed scheme and the benchmark approaches from the literature are implemented in MATLAB R2010a for performance comparison. The complete evaluations of the proposed scheme are conducted using the HRF, REVIEW, STARE and DRIVE data sets, with subjective quantification provided by experts for the purpose of retinal blood vessel segmentation.
Weixin Zhang, Zhao Liu, Yu Song, Yixuan Lu and Zhenping Feng
Abstract
Purpose
To improve the speed and accuracy of the turbine blade film cooling design process, the most advanced deep learning models were introduced into this study to investigate which is most suitable for the prediction work. This paper aims to create a generative surrogate model that can be applied to multi-objective optimization problems.
Design/methodology/approach
The latest backbone in the field of computer vision (Swin-Transformer, 2021) was introduced and improved as the surrogate function for predicting the multi-physics field distribution (film cooling effectiveness, pressure, density and velocity). The basic samples were generated by the Latin hypercube sampling method, and the numerical method adopted for the calculation was first validated experimentally. The training and testing samples were calculated at experimental conditions. Finally, the results predicted by the surrogate model were verified by experiment in a linear cascade.
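For readers unfamiliar with Latin hypercube sampling, a minimal sketch follows. Each variable's range is split into as many equal strata as there are samples, one point is drawn inside each stratum, and the strata are shuffled independently per variable. The two design variables and their bounds are hypothetical placeholders, not the parameters used in the study.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so every variable hits each of its n_samples strata once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniformly placed point inside each of the n_samples equal strata.
        unit_points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so the pairing of strata across variables is random.
        rng.shuffle(unit_points)
        columns.append([low + p * (high - low) for p in unit_points])
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


# Hypothetical bounds for two design variables (placeholders for illustration).
bounds = [(0.0, 1.0), (10.0, 20.0)]
samples = latin_hypercube(5, bounds, seed=0)
for point in samples:
    print(point)
```

Compared with plain random sampling, this guarantees that no region of any single variable's range is left unsampled, which is why it is a common choice for building surrogate-model training sets.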
Findings
The results indicated that, compared with the Multi-Scale Pix2Pix Model, the Swin-Transformer U-Net model presented higher accuracy and computing speed in the prediction of contour results. The computation time for each step of the Swin-Transformer U-Net model is one-third of that of the original model, especially in the case of multi-physics field prediction. The correlation index reached more than 99.2% and the first-order error was lower than 0.3% for the multi-physics field. The predictions of the data-driven surrogate model are consistent with the computational fluid dynamics results, and both are very close to the experimental results. Applying the Swin-Transformer model to enlarge the set of samples with different structures will reduce the cost of numerical calculations as well as experiments.
Research limitations/implications
The number of U-Net layers and the sample scale have a proper relationship according to equation (8). Too many U-Net layers will lead to unnecessary nonlinear variation, whereas too few layers will lead to insufficient feature extraction. In the case of the Swin-Transformer U-Net model, an incorrect number of U-Net layers will reduce the prediction accuracy. The multi-scale Pix2Pix model has higher accuracy in predicting a single physical field, but its calculation speed is too slow. The Swin-Transformer model is fast in prediction and training (nearly three times faster than the multi Pix2Pix model), but the predicted contours have more noise. The neural network predictions and numerical calculations are consistent with the experimental distribution.
Originality/value
This paper creates a generative surrogate model that can be applied to multi-objective optimization problems. A generative adversarial network using the new backbone is chosen to extend the output from a single contour to multi-physics fields, which generates more results simultaneously than traditional surrogate models and reduces the time cost. It is also more applicable to multi-objective spatial optimization algorithms. The Swin-Transformer surrogate model is three times faster in computation speed than the Multi Pix2Pix model. In the prediction of multi-physics fields, the results of the Swin-Transformer model are more accurate.
Abstract
Purpose
Automatic segmentation of brain tumors from medical images is a challenging task because of the tumors' uneven and irregular shapes. In this paper, the authors propose an attention-based nested segmentation network, named DAU-Net. Two types of attention mechanisms are introduced to make the U-Net network focus on the key feature regions. The proposed network has a deeply supervised encoder–decoder architecture and a redesigned dense skip connection. DAU-Net introduces an attention mechanism between convolutional blocks so that the features extracted at different levels can be merged with a task-related selection.
Design/methodology/approach
In the encoding layer, the authors designed a channel attention module, which marks the importance of each feature map in the segmentation task. In the decoding layer, the authors designed a spatial attention module, which marks the importance of different regional features. By fusing features at different scales in the same encoding layer, the network can fully extract the detailed information of the original image and learn more tumor boundary information.
Findings
To verify the effectiveness of DAU-Net, experiments were carried out on the BRATS 2018 brain tumor magnetic resonance imaging (MRI) database. The segmentation results show that the proposed method has high accuracy, with a Dice similarity coefficient (DSC) of 89% for the complete tumor, an improvement of 8.04% and 4.02% compared with the fully convolutional network (FCN) and U-Net, respectively.
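The Dice similarity coefficient compares a predicted mask A with a ground-truth mask B as DSC = 2|A ∩ B| / (|A| + |B|). A minimal sketch on toy binary masks is given below; the masks are invented for illustration, and returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
def dice_coefficient(pred_mask, truth_mask):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = [bool(v) for row in pred_mask for v in row]
    truth = [bool(v) for row in truth_mask for v in row]
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2 * intersection / total


# Toy masks (invented): 3 predicted pixels, 3 true pixels, 2 in common.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
truth_mask = [[1, 0, 0],
              [0, 1, 1]]

dsc = dice_coefficient(pred_mask, truth_mask)
print(round(dsc, 4))
```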
Originality/value
The experimental results show that the proposed method has good performance in the segmentation of brain tumors. The proposed method has potential clinical applicability.
Ruohan Gong and Zuqi Tang
Abstract
Purpose
This paper aims to investigate an approach that combines deep learning (DL) and the finite element method (FEM) for the magneto-thermal coupled problem.
Design/methodology/approach
To apply DL to an electrical device under the assumption of a small dataset, with ground truth data obtained from the FEM analysis, U-net, a highly efficient convolutional neural network (CNN), is used to extract hidden features and is trained in a supervised manner to predict the magneto-thermal coupled analysis results for different topologies. Using part of the FEM results as training samples, the DL model obtained from effective off-line training can be used to predict the distributions of the magnetic field and temperature field in other cases.
Findings
The feasibility of the proposed approach is investigated by discussing the influence of various network parameters, in particular the four most important factors: training sample size, learning rate, batch size and optimization algorithm. It is shown that DL based on U-net can be used as an efficient tool in multi-physics analysis and achieves good performance with only small datasets.
Originality/value
It is shown that DL based on U-net can be used as an efficient tool in multi-physics analysis and achieves good performance with only small datasets.
Priya Mishra and Aleena Swetapadma
Abstract
Purpose
Sleep arousal detection is an important factor in monitoring sleep disorders.
Design/methodology/approach
Thus, a unique nth layer one-dimensional (1D) convolutional neural network-based U-Net model for automatic sleep arousal identification has been proposed.
Findings
The proposed method achieved an area under the precision–recall curve (AUPRC) of 0.498 and an area under the receiver operating characteristic curve (AUROC) of 0.946.
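The two scores summarise different things, and a small sketch may help show how each is computed. AUROC is calculated here as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count half), and the area under the precision–recall curve is approximated by average precision, a common estimator. The scores and labels are invented toy data, and both function names are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, label in zip(scores, labels) if label]
    negatives = [s for s, label in zip(scores, labels) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_positive = sum(1 for _, label in ranked if label)
    if n_positive == 0:
        raise ValueError("need at least one positive sample")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / n_positive


# Toy data (invented): two arousal samples among six.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 0, 0, 0]

print(auroc(scores, labels), average_precision(scores, labels))
```

Average precision depends on how rare the positive class is, whereas AUROC does not, so the two numbers can diverge considerably on imbalanced data.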
Originality/value
No other researchers have suggested U-Net-based detection of sleep arousal.
Research limitations/implications
The experimental results show that U-Net achieves better accuracy than the state-of-the-art methods.
Practical implications
Sleep arousal detection is an important factor in monitoring sleep disorders. The objective of the work is to detect sleep arousal using different physiological channels of the human body.
Social implications
It will help in improving mental health by monitoring a person's sleep.
Md Sakib Ullah Sourav, Huidong Wang, Mohammad Raziuddin Chowdhury and Rejwan Bin Sulaiman
Abstract
One of the most neglected sources of energy loss is streetlights that generate too much light in areas where it is not required. Energy waste has enormous economic and environmental effects. In addition, because they are conventionally operated by hand, streetlights are frequently seen turned ‘ON’ during the day and ‘OFF’ in the evening, which is regrettable even in the twenty-first century. Resolving these issues requires automated streetlight control. This study aims to develop a novel streetlight control method that combines a smart transport monitoring system powered by computer vision technology with a closed circuit television (CCTV) camera. Using semantic image segmentation of the CCTV video stream, the method lets the light-emitting diode (LED) streetlight light up automatically with the appropriate brightness when pedestrians or vehicles are detected and dims it in their absence. In addition, our model distinguishes daylight from nighttime, which makes it feasible to automate turning the streetlight ‘ON’ and ‘OFF’ and so reduce energy consumption costs. Within this approach, geo-location sensor data could be utilised to make more informed streetlight management decisions. To complete the tasks, we consider training the U-net model with ResNet-34 as its backbone. The validity of the models is checked using assessment metrics. The suggested concept is straightforward, economical, energy-efficient, long-lasting and more resilient than conventional alternatives.
Jun Liu, Junyuan Dong, Mingming Hu and Xu Lu
Abstract
Purpose
Existing Simultaneous Localization and Mapping (SLAM) algorithms are relatively well developed. However, in complex dynamic environments, the movement of points on dynamic objects in the image can affect the system's observations during mapping, which introduces biases and errors into the position estimation and the creation of map points. The aim of this paper is to achieve higher accuracy in SLAM algorithms than traditional methods by using semantic approaches.
Design/methodology/approach
In this paper, the semantic segmentation of dynamic objects is realized using a U-Net semantic segmentation network, followed by motion consistency detection to determine whether the segmented objects are moving in the current scene. This is combined with a motion compensation method that eliminates dynamic points and compensates for the current local image, making the system robust.
Findings
Experiments comparing the effect of detecting dynamic points and removing outliers are conducted on a dynamic data set of Technische Universität München, and the results show that the absolute trajectory accuracy of this paper's method is significantly improved compared with ORB-SLAM3 and DS-SLAM.
Originality/value
In this paper, in the semantic segmentation network part, the segmentation mask is combined with the method of dynamic point detection, elimination and compensation, which reduces the influence of dynamic objects, thus effectively improving the accuracy of localization in dynamic environments.
Faris Elghaish, Sandra Matarneh, Essam Abdellatef, Farzad Rahimian, M. Reza Hosseini and Ahmed Farouk Kineber
Abstract
Purpose
Cracks are prevalent signs of pavement distress found on highways globally. The use of artificial intelligence (AI) and deep learning (DL) for crack detection is increasingly considered as an optimal solution. Consequently, this paper introduces a novel, fully connected, optimised convolutional neural network (CNN) model using feature selection algorithms for the purpose of detecting cracks in highway pavements.
Design/methodology/approach
To enhance the accuracy of the CNN model for crack detection, the authors employed a CNN model with fully connected deep learning layers along with several optimisation techniques. Specifically, three optimisation algorithms, namely adaptive moment estimation (ADAM), stochastic gradient descent with momentum (SGDM), and RMSProp, were utilised to fine-tune the CNN model and enhance its overall performance. Subsequently, the authors implemented eight feature selection algorithms to further improve the accuracy of the optimised CNN model. These feature selection techniques were systematically applied to identify the most relevant features contributing to crack detection in the given dataset. Finally, the authors tested the proposed model against seven pre-trained models.
Findings
The study's results show that the accuracy of the three optimisers (ADAM, SGDM, and RMSProp) with the five deep learning layers model is 97.4%, 98.2%, and 96.09%, respectively. Following this, eight feature selection algorithms were applied to the five deep learning layers to enhance accuracy, with particle swarm optimisation (PSO) achieving the highest F-score at 98.72%. The model was then compared with other pre-trained models and exhibited the highest performance.
Practical implications
With an achieved precision of 98.19% and F-score of 98.72% using PSO, the developed model is highly accurate and effective in detecting and evaluating the condition of cracks in pavements. As a result, the model has the potential to significantly reduce the effort required for crack detection and evaluation.
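The relationship between the two reported figures can be made concrete with a short calculation. Assuming the F-score is the balanced F1 measure, F1 = 2PR / (P + R), the recall implied by a given precision and F1 is R = F1·P / (2P − F1). The abstract does not state which F-score variant was used, so the recall below is an inferred value under that assumption, not a result reported by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (balanced F-score)."""
    return 2 * precision * recall / (precision + recall)


def recall_from_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for this precision/F1 pair")
    return f1 * precision / denominator


# Figures quoted in the abstract, expressed as fractions.
precision = 0.9819
f1 = 0.9872

# Implied recall, valid only if the reported F-score is F1 (an assumption).
implied_recall = recall_from_f1(precision, f1)
print(round(implied_recall, 4))  # about 0.9926
```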
Originality/value
The proposed method for enhancing CNN model accuracy in crack detection stands out for its combination of optimisation algorithms (ADAM, SGDM, and RMSProp) with the systematic application of multiple feature selection techniques to identify relevant crack detection features, and for its comparison of the results with existing pre-trained models.