Search results
1 – 10 of 26
Johnny Kwok Wai Wong, Mojtaba Maghrebi, Alireza Ahmadian Fard Fini, Mohammad Amin Alizadeh Golestani, Mahdi Ahmadnia and Michael Er
Abstract
Purpose
Images taken from construction site interiors often suffer from low illumination and poor natural colors, which restrict their application for high-level site management purposes. State-of-the-art low-light image enhancement methods provide promising results, but they generally require long execution times to complete the enhancement. This study aims to develop a refined image enhancement approach to improve execution efficiency and performance accuracy.
Design/methodology/approach
To develop the refined illumination enhancement algorithm named enhanced illumination quality (EIQ), a quadratic expression was first added to the initial illumination map. Subsequently, an adjusted weight matrix was added to improve the smoothness of the illumination map. A coordinate descent optimization algorithm was then applied to minimize the processing time. Gamma correction was also applied to further enhance the illumination map. Finally, a frame comparing and averaging method was used to identify interior site progress.
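The gamma-correction step can be illustrated with a Retinex-style sketch: the image is divided by a gamma-corrected illumination map so that dark regions are brightened more aggressively than bright ones. This is a generic illustration, not the authors' EIQ implementation, and the gamma value of 0.8 is an assumption:

```python
import numpy as np

def gamma_correct_illumination(illum, gamma=0.8):
    """Apply gamma correction to an illumination map in [0, 1].
    gamma < 1 lifts low illumination values more than high ones."""
    illum = np.clip(illum, 1e-6, 1.0)   # avoid zero-division downstream
    return illum ** gamma

def enhance(image, illum, gamma=0.8):
    """Recover the enhanced image as R = I / T**gamma (Retinex-style),
    where T is the estimated illumination map."""
    t = gamma_correct_illumination(illum, gamma)
    return np.clip(image / t[..., None], 0.0, 1.0)

# A dark pixel under low estimated illumination is brightened:
img = np.full((2, 2, 3), 0.1)   # dark RGB image
t = np.full((2, 2), 0.2)        # low illumination estimate
out = enhance(img, t)
```

The clipping keeps the recovered reflectance in a displayable range; a real pipeline would estimate the illumination map from the image rather than supply it directly.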
Findings
The proposed refined approach took around 4.36–4.52 s to achieve the expected results while outperforming the current low-light image enhancement method. EIQ demonstrated a lower lightness-order error and provided higher object resolution in enhanced images. EIQ also achieved a higher structural similarity index and peak signal-to-noise ratio, indicating better image reconstruction performance.
Originality/value
The proposed approach shortens the execution time, improves the equalization of the illumination map and provides better image reconstruction. The approach could be applied to low-light video enhancement tasks and to other dark or poor-quality jobsite images for object detection processes.
Juan Yang, Zhenkun Li and Xu Du
Abstract
Purpose
Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, achieving automatic and accurate audiovisual emotion recognition is significantly important for developing engaging and empathetic human–computer interaction environments. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations.
Design/methodology/approach
A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, the local maximum–based content analysis is designed to extract key-frames from videos for the purpose of eliminating data redundancy. Two modules, including “Multi-head Attention-based Intra-modality Interaction Module” and “Multi-head Attention-based Cross-modality Interaction Module”, are proposed to mine and capture intra- and cross-modality interactions for further reducing data redundancy and producing more powerful multimodal representations.
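The local maximum-based key-frame extraction can be illustrated with a minimal frame-difference sketch. This is a simplification of the content analysis described above, not the paper's method: the mean absolute inter-frame difference is an assumed stand-in for the content metric.

```python
import numpy as np

def extract_keyframes(frames):
    """Pick key-frames at local maxima of inter-frame content change.

    frames: array of shape (T, H, W) with grayscale values.
    Returns indices of frames following a local peak in mean absolute
    difference from the previous frame, plus the first frame.
    """
    diffs = np.abs(np.diff(frames.astype(float), axis=0)).mean(axis=(1, 2))
    keep = [0]                                   # always keep the first frame
    for i in range(1, len(diffs) - 1):
        if diffs[i] > diffs[i - 1] and diffs[i] > diffs[i + 1]:
            keep.append(i + 1)                   # frame after the change peak
    return keep

# Synthetic clip: a sudden scene change at frame 3 produces a peak.
clip = np.zeros((6, 4, 4))
clip[3:] = 1.0                                   # abrupt content change
keys = extract_keyframes(clip)
```

Only the frames at content-change peaks survive, which is the data-redundancy reduction the abstract attributes to key-frame extraction.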
Findings
Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can enhance the performance by more than 2.79 per cent in accuracy. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance.
Originality/value
The proposed KE-AFN can support the development of engaging and empathetic human–computer interaction environments.
Ambica Ghai, Pradeep Kumar and Samrat Gupta
Abstract
Purpose
Web users rely heavily on online content to make decisions without assessing its veracity. Online content comprising text, image, video or audio may be tampered with to influence public opinion. Since consumers of online information (misinformation) tend to trust the content when images supplement the text, image manipulation software is increasingly being used to forge images. To address the crucial problem of image manipulation, this study focusses on developing a deep-learning-based image forgery detection framework.
Design/methodology/approach
The proposed deep-learning-based framework aims to detect images forged using copy-move and splicing techniques. The image transformation technique aids the identification of relevant features for the network to train effectively. After that, the customized pre-trained convolutional neural network is trained on public benchmark datasets, and the performance is evaluated on the test dataset using various parameters.
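The abstract does not name the image transformation used. One common choice in forgery-detection pipelines is a high-pass residual filter, which suppresses scene content so the network trains on the noise residuals where splicing and copy-move traces tend to live. A minimal sketch under that assumption (the Laplacian kernel is illustrative, not the paper's transformation):

```python
import numpy as np

# High-pass (Laplacian) kernel: removes smooth image content so
# manipulation residuals stand out for the downstream classifier.
KERNEL = np.array([[0, -1,  0],
                   [-1, 4, -1],
                   [0, -1,  0]], dtype=float)

def highpass_residual(img):
    """Return the high-pass residual of a 2-D grayscale image
    (zero padding at the borders)."""
    h, w = img.shape
    padded = np.pad(img.astype(float), 1)
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * KERNEL)
    return out

# A flat region has zero interior residual; an inserted bright pixel
# (a crude stand-in for a tampering artifact) stands out sharply.
flat = np.full((5, 5), 0.5)
spiked = flat.copy()
spiked[2, 2] = 1.0
res_flat = highpass_residual(flat)
res_spiked = highpass_residual(spiked)
```

In a full pipeline the residual map, rather than the raw image, would be fed to the convolutional network for training.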
Findings
The comparative analysis of image transformation techniques and experiments conducted on benchmark datasets from a variety of socio-cultural domains establish the effectiveness and viability of the proposed framework. These findings affirm the potential applicability of the proposed framework in real-time image forgery detection.
Research limitations/implications
This study bears implications for several important aspects of research on image forgery detection. First, this research adds to the recent discussion on feature extraction and learning for image forgery detection. While prior research on image forgery detection hand-crafted the features, the proposed solution contributes to the stream of literature that automatically learns the features and classifies the images. Second, this research contributes to the ongoing effort in curtailing the spread of misinformation using images. The extant literature on the spread of misinformation has prominently focussed on textual data shared over social media platforms. The study addresses the call for greater emphasis on the development of robust image transformation techniques.
Practical implications
This study carries important practical implications for various domains such as forensic sciences, media and journalism where image data is increasingly being used to make inferences. The integration of image forgery detection tools can be helpful in determining the credibility of the article or post before it is shared over the Internet. The content shared over the Internet by the users has become an important component of news reporting. The framework proposed in this paper can be further extended and trained on more annotated real-world data so as to function as a tool for fact-checkers.
Social implications
In the current scenario, wherein most image forgery detection studies attempt to assess whether an image is real or forged in an offline mode, it is crucial to identify any trending or potentially forged image as early as possible. By learning from historical data, the proposed framework can aid in the early detection of newly emerging forged images. In summary, the proposed framework has the potential to mitigate the physical spread and psychological impact of forged images on social media.
Originality/value
This study focusses on copy-move and splicing techniques while integrating transfer learning concepts to classify forged images with high accuracy. The synergistic use of hitherto little explored image transformation techniques and customized convolutional neural network helps design a robust image forgery detection framework. Experiments and findings establish that the proposed framework accurately classifies forged images, thus mitigating the negative socio-cultural spread of misinformation.
Hu Luo, Haobin Ruan and Dawei Tu
Abstract
Purpose
The purpose of this paper is to propose a whole set of methods for underwater target detection, because most underwater targets have few samples and underwater images suffer from quality problems such as detail loss, low contrast and color distortion, and to verify the feasibility of the proposed methods through experiments.
Design/methodology/approach
An improved RGHS algorithm is proposed to enhance the original underwater target images. The YOLOv4 deep learning network is then improved for underwater small-sample target detection by combining a traditional data expansion method with the Mosaic algorithm, and its feature extraction capability is expanded by adding an SPP (Spatial Pyramid Pooling) module after each feature extraction layer to extract richer feature information.
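The SPP module mentioned above concatenates the input feature map with max-pooled versions of itself at several window sizes, so one layer sees context at multiple scales. A NumPy sketch of that operation (the YOLO-style kernel sizes 5/9/13 are the conventional choice, not confirmed by the abstract):

```python
import numpy as np

def max_pool_same(x, k):
    """Max-pool a (H, W, C) feature map with a k x k window,
    stride 1 and 'same' padding, as used inside SPP blocks."""
    h, w, c = x.shape
    p = k // 2
    padded = np.pad(x, ((p, p), (p, p), (0, 0)), constant_values=-np.inf)
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].max(axis=(0, 1))
    return out

def spp(x, kernel_sizes=(5, 9, 13)):
    """Spatial Pyramid Pooling: concatenate the input with max-pooled
    versions at several scales along the channel axis."""
    pooled = [x] + [max_pool_same(x, k) for k in kernel_sizes]
    return np.concatenate(pooled, axis=-1)

feat = np.random.default_rng(1).standard_normal((16, 16, 8))
out = spp(feat)      # channels: 8 * (1 input + 3 pooled scales) = 32
```

In the actual network this sits after a feature extraction layer and is implemented with framework pooling ops; the sketch only shows the tensor arithmetic.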
Findings
The experimental results, using the official dataset, reveal a 3.5% increase in average detection accuracy for three types of underwater biological targets compared to the traditional YOLOv4 algorithm. In underwater robot application testing, the proposed method achieves an impressive 94.73% average detection accuracy for the three types of underwater biological targets.
Originality/value
Underwater target detection is an important task for underwater robot applications. However, most underwater targets have small sample sizes, and detecting small-sample targets is a comprehensive problem because it is affected by the quality of underwater images. This paper provides a whole set of methods to solve these problems, which is of great significance to the application of underwater robots.
Qiang Wen, Lele Chen, Jingwen Jin, Jianhao Huang and HeLin Wan
Abstract
Purpose
Fixed-mode noise and random-mode noise always exist in the image sensor, affecting its imaging quality. The charge diffusion and color mixing between pixels in the photoelectric conversion process belong to fixed-mode noise. This study aims to improve the image sensor's imaging quality by processing the fixed-mode noise.
Design/methodology/approach
Through iterative training of a long short-term memory (LSTM) recurrent neural network model, the authors obtain a neural network model able to compensate for image noise crosstalk. To overcome the lack of differences between same-color pixels on each template of the image sensor under flat-field light, the data before and after compensation were used as a new data set to further train the neural network iteratively.
Findings
The comparison of the images compensated by the two sets of neural network models shows that the gray-value distribution is more concentrated and uniform. The middle- and high-frequency components in the spatial spectrum are all increased, indicating that the compensated image edges change faster and are more detailed (Hinton and Salakhutdinov, 2006; LeCun et al., 1998; Mohanty et al., 2016; Zang et al., 2023).
Originality/value
In this paper, the authors use an iterative-learning color-image pixel crosstalk compensation method to effectively alleviate two problems: the incomplete color mixing caused by an insufficient filter rate, and the electrical crosstalk caused by lateral diffusion of optical charge into adjacent pixel potential traps.
Ismael Gómez-Talal, Lydia González-Serrano, José Luis Rojo-Álvarez and Pilar Talón-Ballestero
Abstract
Purpose
This study aims to address the global food waste problem in restaurants by analyzing customer sales information provided by restaurant tickets to gain valuable insights into directing sales of perishable products and optimizing product purchases according to customer demand.
Design/methodology/approach
A system based on unsupervised machine learning (ML) data models was created to provide a simple and interpretable management tool. This system performs analysis based on two elements: first, it consolidates and visualizes mutual and nontrivial relationships between information features extracted from tickets using multicomponent analysis, bootstrap resampling and ML domain description. Second, it presents statistically relevant relationships in color-coded tables that provide food waste-related recommendations to restaurant managers.
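The bootstrap-resampling element can be illustrated with a minimal sketch that estimates a confidence interval for one product-month relationship of the kind the color-coded tables summarize. The toy data are assumptions, and the paper's multicomponent analysis and ML domain description are omitted:

```python
import numpy as np

def bootstrap_share(sales_flags, n_boot=2000, seed=0):
    """Bootstrap a 95% confidence interval for the share of tickets
    (in, say, a given month) that contain a given product.

    sales_flags: 1-D array of 0/1 flags (product present on ticket).
    Returns (low, high): the 2.5th / 97.5th percentiles of the
    resampled mean.
    """
    rng = np.random.default_rng(seed)
    n = len(sales_flags)
    # Resample tickets with replacement and take the mean of each draw.
    means = rng.choice(sales_flags, size=(n_boot, n), replace=True).mean(axis=1)
    return float(np.percentile(means, 2.5)), float(np.percentile(means, 97.5))

# Toy example: a product appears on 70 of 100 tickets in one month.
flags = np.array([1] * 70 + [0] * 30)
low, high = bootstrap_share(flags)
```

A relationship whose interval excludes the product's overall baseline share would be the kind flagged as statistically relevant in the manager-facing tables.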
Findings
The study identified relationships between products and customer sales in specific months. Other ticket elements were also related, such as products with days, hours or functional areas, and products with other products (cross-selling). Big data (BD) technology helped analyze restaurant tickets and obtain information on product sales behavior.
Research limitations/implications
This study addresses food waste in restaurants using BD and unsupervised ML models. Despite limitations in ticket information and lack of product detail, it opens up research opportunities in relationship analysis, cross-selling, productivity and deep learning applications.
Originality/value
The value and originality of this work lie in the application of BD and unsupervised ML technologies to analyze restaurant tickets and obtain information on product sales behavior. Better sales projection can adjust product purchases to customer demand, reducing food waste and optimizing profits.
Monica Puri Sikka, Alok Sarkar and Samridhi Garg
Abstract
Purpose
With the help of basic physics, the application of computer algorithms in the form of recent advances such as machine learning and neural networks in the textile industry is discussed in this review. Scientists have linked the underlying structural or chemical science of textile materials and discovered several strategies for completing some of the most time-consuming tasks with ease and precision. Since the 1980s, computer algorithms and machine learning have been used to aid the majority of the textile testing process. With the rise in demand for automation, deep learning and neural networks now handle the majority of testing and quality control operations in the form of image processing.
Design/methodology/approach
The state-of-the-art of artificial intelligence (AI) applications in the textile sector is reviewed in this paper. Based on several research problems and AI-based methods, the current literature is evaluated. The research issues are categorized into three categories based on the operation processes of the textile industry, including yarn manufacturing, fabric manufacture and coloration.
Findings
AI-assisted automation has improved not only machine efficiency but also overall industry operations. AI's fundamental concepts have been examined for real-world challenges. The majority of the reviewed case studies confirm that image analysis, backpropagation and neural networks may be used specifically as testing techniques in textile material testing. AI can be used to automate processes in various circumstances.
Originality/value
This research conducts a thorough analysis of artificial neural network applications in the textile sector.
Elavaar Kuzhali S. and Pushpa M.K.
Abstract
Purpose
COVID-19 has occurred in more than 150 countries and has a huge impact on the health of many people. COVID-19 needs to be detected at an early stage, and infected patients require special attention. The fastest way to detect COVID-19-infected patients is through radiology and radiography images. A few early studies described the particular abnormalities of infected patients in chest radiograms. Even though some challenges arise in identifying traces of viral infection in X-ray images, a convolutional neural network (CNN) can determine the patterns of data that separate normal from infected X-rays, increasing the detection rate. Therefore, the researchers are focusing on developing a deep learning-based detection model.
Design/methodology/approach
The main intention of this proposal is to develop enhanced lung segmentation and classification for diagnosing COVID-19. The main processes of the proposed model are image pre-processing, lung segmentation and deep classification. Initially, image enhancement is performed by contrast enhancement and filtering approaches. Once the image is pre-processed, optimal lung segmentation is done by the adaptive fuzzy-based region growing (AFRG) technique, in which the constant function for fusion is optimized by the modified deer hunting optimization algorithm (M-DHOA). Further, a well-performing deep learning algorithm termed adaptive CNN (A-CNN) is adopted for performing the classification, in which the hidden neurons are tuned by the proposed DHOA to enhance the detection accuracy. The simulation results illustrate that the proposed model has more possibilities to improve COVID-19 testing methods on publicly available data sets.
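The region-growing step can be illustrated with a plain, non-fuzzy sketch: starting from a seed pixel, 4-connected neighbors are absorbed while their intensity stays close to the running region mean. The adaptive fuzzy logic and DHOA tuning described above are omitted, and the tolerance threshold is an assumption:

```python
from collections import deque

import numpy as np

def region_grow(img, seed, tol=0.1):
    """Grow a region from `seed` over a 2-D grayscale image, absorbing
    4-connected neighbors whose intensity is within `tol` of the
    running region mean. Returns a boolean mask of the region."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    total, count = float(img[seed]), 1
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ni < h and 0 <= nj < w and not mask[ni, nj]:
                if abs(img[ni, nj] - total / count) <= tol:
                    mask[ni, nj] = True
                    total += float(img[ni, nj])
                    count += 1
                    queue.append((ni, nj))
    return mask

# A bright 3x3 region on a dark background is recovered exactly.
img = np.zeros((5, 5))
img[1:4, 1:4] = 0.8
mask = region_grow(img, seed=(2, 2), tol=0.1)
```

The fuzzy variant replaces the hard tolerance test with a membership function, which is what the M-DHOA optimization tunes in the paper.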
Findings
From the experimental analysis, the accuracy of the proposed M-DHOA–CNN was 5.84%, 5.23%, 6.25% and 8.33% higher than that of the recurrent neural network, neural networks, support vector machine and K-nearest neighbor, respectively. Thus, the segmentation and classification performance of the developed COVID-19 diagnosis by AFRG and A-CNN outperformed the existing techniques.
Originality/value
This paper adopts the latest optimization algorithm called M-DHOA to improve the performance of lung segmentation and classification in COVID-19 diagnosis using adaptive K-means with region growing fusion and A-CNN. To the best of the authors’ knowledge, this is the first work that uses M-DHOA for improved segmentation and classification steps for increasing the convergence rate of diagnosis.
Chun Tian, Gengwei Zhai, Mengling Wu, Jiajun Zhou and Yaojie Li
Abstract
Purpose
In response to the problem of insufficient traction/braking adhesion force caused by the existence of the third-body medium on the rail surface, this study aims to analyze the utilization of wheel-rail adhesion coefficient under different medium conditions and propose relevant measures for reasonable and optimized utilization of adhesion to ensure the traction/braking performance and operation safety of trains.
Design/methodology/approach
Based on the PLS-160 wheel-rail adhesion simulation test rig, the study investigates the variation patterns of maximum utilized adhesion characteristics on the rail surface under different conditions of small creepage and large slip. Through statistical analysis of multiple sets of experimental data, the statistical distribution patterns of maximum utilized adhesion on the rail surface are obtained, and a method for analyzing wheel-rail adhesion redundancy based on the normal distribution is proposed. The study analyzes the utilization of traction/braking adhesion, as well as adhesion redundancy, for different media under small creepage and large slip conditions. Based on these findings, relevant measures for the reasonable and optimized utilization of adhesion are derived.
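The normal-distribution-based redundancy analysis can be sketched with Python's standard library: if the measured maxima of utilized adhesion follow a normal distribution, the probability that the rail surface supplies the adhesion the train demands is one minus the CDF at the demanded value. The numbers below are illustrative assumptions, not the rig's measured data:

```python
from statistics import NormalDist

def adhesion_redundancy(mu_max, sigma_max, demanded):
    """Probability that the maximum utilized adhesion coefficient on the
    rail surface exceeds the adhesion the train demands, assuming the
    measured maxima are normally distributed. Values well below 1.0
    flag a risk of traction overspeed / brake skid."""
    return 1.0 - NormalDist(mu=mu_max, sigma=sigma_max).cdf(demanded)

# Hypothetical wet-rail figures: mean max adhesion 0.15 (std 0.02)
# against a braking demand of 0.10.
p_ok = adhesion_redundancy(0.15, 0.02, 0.10)
```

Raising the demanded coefficient toward (or past) the distribution mean drives this probability toward and below 0.5, which is the redundancy-exhaustion condition the proposed measures aim to avoid.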
Findings
When a third-body medium exists on the rail surface, the train should adopt low-level service braking, extending the braking distance to avoid skidding. Compared with the current adhesion control strategy of small creepage, adopting appropriate strategies to control the train's adhesion coefficient near the second peak of the adhesion coefficient-slip ratio curve in large slip can effectively improve the traction/braking adhesion redundancy and the upper limit of adhesion utilization, thereby ensuring the traction/braking performance and operation safety of the train.
Originality/value
Most existing studies focus on the wheel-rail adhesion coefficient values and variation patterns under different medium conditions, without considering whether a rail surface carrying different media can provide a sufficient traction/braking utilized adhesion coefficient for the train. Therefore, there is a risk of traction overspeed/braking skid. This study analyzes whether rail surfaces with different media can provide a sufficient traction/braking utilized adhesion coefficient for the train and whether there is redundancy. Based on these findings, relevant measures for the reasonable and optimized utilization of adhesion are derived to further ensure the operation safety of the train.
Xiumei Cai, Xi Yang and Chengmao Wu
Abstract
Purpose
Multi-view fuzzy clustering algorithms are not widely used in image segmentation, and many of these algorithms are lacking in robustness. The purpose of this paper is to investigate a new algorithm that can segment the image better and retain as much detailed information about the image as possible when segmenting noisy images.
Design/methodology/approach
The authors present a novel multi-view fuzzy c-means (FCM) clustering algorithm that includes an automatic view-weight learning mechanism. Firstly, this algorithm introduces a view-weight factor that can automatically adjust the weight of different views, thereby allowing each view to obtain the best possible weight. Secondly, the algorithm incorporates a weighted fuzzy factor, which serves to obtain local spatial information and local grayscale information to preserve image details as much as possible. Finally, in order to weaken the effects of noise and outliers in image segmentation, this algorithm employs the kernel distance measure instead of the Euclidean distance.
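The kernel-distance substitution in the FCM membership update can be sketched as follows. This is a single-view simplification: the paper's view-weight factor and weighted fuzzy factor are omitted, and the Gaussian kernel width is an assumption. The kernel-induced distance 2*(1 - K(x, v)) is bounded, so far-away outliers cannot dominate the objective the way squared Euclidean distance lets them:

```python
import numpy as np

def kernel_dist2(X, v, sigma=1.0):
    """Squared kernel-induced distance 2*(1 - K(x, v)) with a Gaussian
    kernel K(x, v) = exp(-||x - v||^2 / sigma^2); bounded in [0, 2]."""
    k = np.exp(-np.sum((X - v) ** 2, axis=-1) / sigma ** 2)
    return 2.0 * (1.0 - k)

def fcm_memberships(X, centers, m=2.0, sigma=1.0):
    """Standard FCM membership update with the Euclidean distance
    replaced by the kernel distance. X: (N, D); centers: (C, D)."""
    d2 = np.stack([kernel_dist2(X, v, sigma) for v in centers], axis=1)
    d2 = np.maximum(d2, 1e-12)                # guard exact center hits
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Two tight points near one center, one point at the other.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
U = fcm_memberships(X, centers)
```

A full run would alternate this update with a kernel-weighted center update until convergence; the sketch shows only the membership step that the kernel distance changes.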
Findings
The authors added different kinds of noise to images and conducted a large number of experimental tests. The results show that the proposed algorithm performs better and is more accurate than previous multi-view fuzzy clustering algorithms in solving the problem of noisy image segmentation.
Originality/value
Most of the existing multi-view clustering algorithms are for multi-view datasets, and the multi-view fuzzy clustering algorithms are unable to eliminate noise points and outliers when dealing with noisy images. The algorithm proposed in this paper has stronger noise immunity and can better preserve the details of the original image.