Search results

1 – 10 of 334
Article
Publication date: 12 September 2023

Wei Shi, Jing Zhang and Shaoyi He


Abstract

Purpose

With the rapid development of short videos in China, the public has become accustomed to using short videos to express their opinions. This paper aims to solve problems such as how to represent the features of different modalities and achieve effective cross-modal feature fusion when analyzing the multi-modal sentiment of Chinese short videos (CSVs).

Design/methodology/approach

This paper proposes a sentiment analysis model, MSCNN-CPL-CAFF, which uses a multi-scale convolutional neural network and a cross-attention fusion mechanism to analyze CSVs. The audio-visual and textual data of CSVs themed on “COVID-19, catering industry” are first collected from the CSV platform Douyin, and a comparative analysis is then conducted with advanced baseline models.
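The abstract gives no implementation details; purely as a hedged illustration, a minimal scaled dot-product cross-attention step fusing two modality feature matrices could be sketched as follows (all names, shapes and data here are hypothetical, not the authors' model):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, key_value_feats):
    """Fuse two modalities: queries from one, keys/values from the other.

    query_feats:     (n_q, d) features of modality A (e.g. text)
    key_value_feats: (n_k, d) features of modality B (e.g. audio-visual)
    Returns (n_q, d) fused features.
    """
    d = query_feats.shape[1]
    scores = query_feats @ key_value_feats.T / np.sqrt(d)  # (n_q, n_k)
    weights = softmax(scores, axis=1)                      # rows sum to 1
    return weights @ key_value_feats                       # weighted sum of B

# Toy example: 4 text tokens attend over 6 audio-visual frames, d = 8.
rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
av = rng.normal(size=(6, 8))
fused = cross_attention(text, av)
print(fused.shape)  # (4, 8)
```

In a full model such a block would typically be applied in both directions (text attending to audio-visual features and vice versa) before classification.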

Findings

Weak negative and neutral sentiment samples are the most numerous, while positive and weak positive samples are relatively few, accounting for only about 11% of the total. The MSCNN-CPL-CAFF model achieves Acc-2, Acc-3 and F1 scores of 85.01%, 74.16% and 84.84%, respectively, outperforming the best of the baseline methods in accuracy while achieving competitive computation speed.

Practical implications

This research offers some implications regarding the impact of COVID-19 on the catering industry in China by focusing on the multi-modal sentiment of CSVs. The methodology can be used to analyze and categorize the opinions of the general public on social media platforms.

Originality/value

This paper presents a novel deep-learning multimodal sentiment analysis model, which provides a new perspective for public opinion research on the short video platform.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 10 January 2024

Sara El-Ateif, Ali Idri and José Luis Fernández-Alemán


Abstract

Purpose

COVID-19 continues to spread and cause deaths. Physicians diagnose COVID-19 using not only real-time polymerase chain reaction but also the computed tomography (CT) and chest X-ray (CXR) modalities, depending on the stage of infection. However, with so many patients and so few doctors, it has become difficult to keep abreast of the disease. Deep learning models have been developed to assist in this respect, and vision transformers are currently state-of-the-art methods, but most techniques focus on only one modality (CXR).

Design/methodology/approach

This work aims to leverage the benefits of both CT and CXR to improve COVID-19 diagnosis. This paper studies the differences between the convolutional MobileNetV2, ViT DeiT and Swin Transformer models when trained from scratch and when pretrained on the MedNIST medical dataset rather than the ImageNet dataset of natural images. The comparison is made by reporting six performance metrics, the Scott–Knott Effect Size Difference, the Wilcoxon statistical test and the Borda Count method. We also use the Grad-CAM algorithm to study the models' interpretability. Finally, robustness is tested by evaluating the models on Gaussian-noised images.
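The Borda Count used above to aggregate per-metric model rankings can be sketched briefly: each candidate earns points by its position in every ranking, and the totals decide the overall order. The rankings below are hypothetical, not the paper's results:

```python
def borda_count(rankings):
    """Aggregate several best-first rankings of candidate names.

    Each candidate earns (n_candidates - 1 - position) points per ranking;
    the candidate with the most total points ranks first overall.
    """
    candidates = set().union(*rankings)
    n = len(candidates)
    points = {c: 0 for c in candidates}
    for ranking in rankings:
        for pos, c in enumerate(ranking):
            points[c] += n - 1 - pos
    return sorted(points, key=points.get, reverse=True)

# Hypothetical per-metric rankings of three models:
rankings = [
    ["MobileNetV2", "Swin", "DeiT"],   # e.g. by accuracy
    ["Swin", "MobileNetV2", "DeiT"],   # e.g. by F1-score
    ["Swin", "DeiT", "MobileNetV2"],   # e.g. by robustness
]
print(borda_count(rankings))  # → ['Swin', 'MobileNetV2', 'DeiT']
```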

Findings

Although pretrained MobileNetV2 was the best model in terms of raw performance, the best model in terms of performance, interpretability and robustness to noise is the Swin Transformer trained from scratch, using the CXR (accuracy = 93.21 per cent) and CT (accuracy = 94.14 per cent) modalities.

Originality/value

Models compared are pretrained on MedNIST and leverage both the CT and CXR modalities.

Details

Data Technologies and Applications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 8 July 2022

Mukesh Soni, Nihar Ranjan Nayak, Ashima Kalra, Sheshang Degadwala, Nikhil Kumar Singh and Shweta Singh


Abstract

Purpose

The purpose of this paper is to improve the existing paradigm of edge computing to maintain a balanced energy usage.

Design/methodology/approach

The new greedy algorithm is proposed to balance the energy consumption in edge computing.
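The abstract does not specify the algorithm's steps; one generic greedy energy-balancing heuristic, shown here purely as a hypothetical sketch, assigns each task to the edge node with the lowest accumulated energy so far:

```python
def greedy_balance(task_costs, n_nodes):
    """Assign each task to the node with the lowest accumulated energy.

    task_costs: list of per-task energy costs.
    Returns the per-node energy totals after all assignments.
    """
    loads = [0.0] * n_nodes
    for cost in sorted(task_costs, reverse=True):  # largest tasks first
        idx = loads.index(min(loads))              # least-loaded node so far
        loads[idx] += cost
    return loads

# Hypothetical workload: 6 tasks spread over 3 edge nodes.
print(greedy_balance([5, 3, 8, 2, 7, 4], 3))  # → [10.0, 10.0, 9.0]
```

Sorting tasks largest-first is the classic longest-processing-time refinement of this greedy scheme.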

Findings

The new greedy algorithm can balance energy more efficiently than the random approach by an average of 66.59 percent.

Originality/value

The results presented in this paper are better than those of existing algorithms.

Details

International Journal of Pervasive Computing and Communications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 7 May 2024

Xinzhe Li, Qinglong Li, Dasom Jeong and Jaekyeong Kim


Abstract

Purpose

Most previous studies predicting review helpfulness ignored the significance of deep features embedded in review text and instead relied on hand-crafted features. Hand-crafted and deep features offer the complementary advantages of high interpretability and high predictive accuracy, respectively. This study aims to propose a novel review helpfulness prediction model that uses deep learning (DL) techniques to exploit the complementarity between hand-crafted and deep features.

Design/methodology/approach

First, an advanced convolutional neural network was applied to extract deep features from unstructured review text. Second, this study drew on previous studies to select hand-crafted features that affect review helpfulness and enhance interpretability. Third, this study incorporated the deep and hand-crafted features into a review helpfulness prediction model and evaluated its performance using the Yelp.com data set. To measure the performance of the proposed model, this study used 2,417,796 restaurant reviews.
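The simplest way to combine the two feature families described above is concatenation after per-feature normalisation. The sketch below is a hypothetical illustration (shapes and names are invented), not the paper's feature-fusion method, which the abstract describes only as advanced:

```python
import numpy as np

def fuse_features(deep, handcrafted):
    """Concatenate per-review deep and hand-crafted feature vectors.

    deep:        (n_reviews, d_deep), e.g. CNN text embeddings
    handcrafted: (n_reviews, d_hand), e.g. review length, star rating
    Each block is z-score normalised first so neither dominates by scale.
    """
    def zscore(x):
        return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    return np.hstack([zscore(deep), zscore(handcrafted)])

# Hypothetical data: 100 reviews, 32 deep + 5 hand-crafted features.
deep = np.random.default_rng(1).normal(size=(100, 32))
hand = np.random.default_rng(2).normal(size=(100, 5))
fused = fuse_features(deep, hand)
print(fused.shape)  # (100, 37)
```

The fused matrix would then feed a downstream regressor or classifier of helpfulness votes.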

Findings

Extensive experiments confirmed that the proposed methodology performs better than traditional machine learning methods. Moreover, this study confirms through an empirical analysis that combining hand-crafted and deep features demonstrates better prediction performance.

Originality/value

To the best of the authors’ knowledge, this is one of the first studies to apply DL techniques and use structured and unstructured data to predict review helpfulness in the restaurant context. In addition, an advanced feature-fusion method was adopted to better use the extracted feature information and identify the complementarity between features.


Article
Publication date: 31 October 2023

Yangze Liang and Zhao Xu


Abstract

Purpose

Monitoring of the quality of precast concrete (PC) components is crucial for the success of prefabricated construction projects. Currently, quality monitoring of PC components during the construction phase is predominantly done manually, resulting in low efficiency and hindering the progress of intelligent construction. This paper presents an intelligent inspection method for assessing the appearance quality of PC components, utilizing an enhanced you only look once (YOLO) model and multi-source data. The aim of this research is to achieve automated management of the appearance quality of precast components in the prefabricated construction process through digital means.

Design/methodology/approach

The paper begins by establishing an improved YOLO model and an image dataset for evaluating appearance quality. Through object detection in the images, a preliminary and efficient assessment of the precast components' appearance quality is achieved. Moreover, the detection results are mapped onto the point cloud for high-precision quality inspection. In the case of precast components with quality defects, precise quality inspection is conducted by combining the three-dimensional model data obtained from forward design conversion with the captured point cloud data through registration. Additionally, the paper proposes a framework for an automated inspection platform dedicated to assessing appearance quality in prefabricated buildings, encompassing the platform's hardware network.

Findings

The improved YOLO model achieved a best mean average precision of 85.02% on the VOC2007 dataset, surpassing the performance of most similar models. After targeted training, the model exhibits excellent recognition capabilities for the four common appearance quality defects. When mapped onto the point cloud, the accuracy of quality inspection based on point cloud data and forward design is within 0.1 mm. The appearance quality inspection platform enables feedback and optimization of quality issues.

Originality/value

The proposed method in this study enables high-precision, visualized and automated detection of the appearance quality of PC components. It effectively meets the demand for quality inspection of precast components on construction sites of prefabricated buildings, providing technological support for the development of intelligent construction. The design of the appearance quality inspection platform's logic and framework facilitates the integration of the method, laying the foundation for efficient quality management in the future.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Open Access
Article
Publication date: 29 September 2022

Manju Priya Arthanarisamy Ramaswamy and Suja Palaniswamy


Abstract

Purpose

The aim of this study is to investigate the subject-independent emotion recognition capabilities of EEG and peripheral physiological signals, namely electrooculogram (EOG), electromyography (EMG), electrodermal activity (EDA), temperature, plethysmograph and respiration. The experiments are conducted on both modalities independently and in combination. This study arranges the physiological signals in order based on the prediction accuracy obtained on test data using time and frequency domain features.

Design/methodology/approach

The DEAP dataset is used in this experiment. Time and frequency domain features of EEG and physiological signals are extracted, followed by correlation-based feature selection. Classifiers, namely Naïve Bayes, logistic regression, linear discriminant analysis, quadratic discriminant analysis, LogitBoost and stacking, are trained on the selected features. Based on the performance of the classifiers on the test set, the best modality for each dimension of emotion is identified.
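Correlation-based feature selection here means ranking candidate features by their association with the target. The following is only a simplified, hypothetical sketch: full correlation-based feature selection also penalises inter-feature redundancy, which is omitted here.

```python
import numpy as np

def select_by_correlation(X, y, top_k=3):
    """Rank features by |Pearson correlation| with the label, keep top_k.

    X: (n_samples, n_features), y: (n_samples,). Returns column indices.
    """
    corrs = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                      for j in range(X.shape[1])])
    return np.argsort(corrs)[::-1][:top_k]

# Toy data: only feature 2 carries signal about the target.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
X = rng.normal(size=(200, 5))
X[:, 2] = y + 0.1 * rng.normal(size=200)
print(select_by_correlation(X, y, top_k=1))  # → [2]
```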

Findings

The experimental results with EEG as one modality and all physiological signals as another indicate that EEG signals are better at arousal prediction than physiological signals by 7.18%, while physiological signals are better at valence prediction than EEG signals by 3.51%. The valence prediction accuracy of EOG is superior to zygomaticus electromyography (zEMG) and EDA by 1.75%, at the cost of a higher number of electrodes. This paper concludes that valence can be measured from the eyes (EOG), while arousal can be measured from changes in blood volume (plethysmograph). The sorted order of physiological signals based on arousal prediction accuracy is plethysmograph, EOG (hEOG + vEOG), vEOG, hEOG, zEMG, tEMG, temperature, EMG (tEMG + zEMG), respiration, EDA, while based on valence prediction accuracy the sorted order is EOG (hEOG + vEOG), EDA, zEMG, hEOG, respiration, tEMG, vEOG, EMG (tEMG + zEMG), temperature and plethysmograph.

Originality/value

Many emotion recognition studies in the literature are subject dependent, and the limited subject-independent studies report an average leave-one-subject-out (LOSO) validation result as accuracy. The work reported in this paper sets the baseline for subject-independent emotion recognition using the DEAP dataset by clearly specifying the subjects used in the training and test sets. In addition, this work specifies the cut-off score used to classify the scale as low or high in the arousal and valence dimensions. Generally, statistical features are used for emotion recognition with physiological signals as a modality, whereas in this work, time and frequency domain features of physiological signals and EEG are used. This paper concludes that valence can be identified from EOG, while arousal can be predicted from the plethysmograph.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 6 December 2022

Worapan Kusakunniran, Sarattha Karnjanapreechakorn, Pitipol Choopong, Thanongchai Siriapisith, Nattaporn Tesavibul, Nopasak Phasukkijwatana, Supalert Prakhunhungsit and Sutasinee Boonsopon


Abstract

Purpose

This paper aims to propose a solution for detecting and grading diabetic retinopathy (DR) in retinal images using a convolutional neural network (CNN)-based approach. It could classify input retinal images into a normal class or an abnormal class, which would be further split into four stages of abnormalities automatically.

Design/methodology/approach

The proposed solution is developed based on a newly proposed CNN architecture, namely DeepRoot. It consists of one main branch connected to two side branches. The main branch is responsible for primary extraction of both high-level and low-level features of retinal images. The side branches then extract more complex and detailed features from the output of the main branch. They are designed to capture small traces of DR in retinal images, using modified zoom-in/zoom-out and attention layers.

Findings

The proposed method is trained, validated and tested on the Kaggle dataset. The generalization of the trained model is evaluated on unseen data samples, self-collected from a real hospital scenario. It achieves a promising performance with a sensitivity of 98.18% under the two-class scenario.

Originality/value

The new CNN-based architecture (i.e. DeepRoot) is introduced with the concept of a multi-branch network. It could assist in solving the problem of an unbalanced dataset, especially when there are common characteristics across different classes (i.e. the four stages of DR). Different classes can be output at different depths of the network.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 19 December 2023

Jinchao Huang


Abstract

Purpose

Single-shot multi-category clothing recognition and retrieval play a crucial role in online searching and offline settlement scenarios. Existing clothing recognition methods based on RGBD clothing images often suffer from high-dimensional feature representations, leading to compromised performance and efficiency.

Design/methodology/approach

To address this issue, this paper proposes a novel method called Manifold Embedded Discriminative Feature Selection (MEDFS) to select global and local features, thereby reducing the dimensionality of the feature representation and improving performance. Specifically, by combining three global features and three local features, a low-dimensional embedding is constructed to capture the correlations between features and categories. MEDFS formulates an optimization framework using manifold mapping and sparse regularization to achieve feature selection. The optimization objective is solved with an alternating iterative strategy, ensuring convergence.

Findings

Empirical studies conducted on a publicly available RGBD clothing image dataset demonstrate that the proposed MEDFS method achieves highly competitive clothing classification performance while maintaining efficiency in clothing recognition and retrieval.

Originality/value

This paper introduces a novel approach for multi-category clothing recognition and retrieval, incorporating the selection of global and local features. The proposed method holds potential for practical applications in real-world clothing scenarios.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 28 December 2023

Ankang Ji, Xiaolong Xue, Limao Zhang, Xiaowei Luo and Qingpeng Man


Abstract

Purpose

Crack detection of pavement is a critical task in periodic surveys. Efficient, effective and consistent tracking of road conditions by identifying and locating cracks helps promptly informed managers establish an appropriate road maintenance and repair strategy, but it remains a significant challenge. This research seeks to propose practical solutions for automatic crack detection from images with efficient productivity and cost-effectiveness, thereby improving pavement performance.

Design/methodology/approach

This research applies a novel deep learning method named TransUnet for crack detection; it is structured around a Transformer combined with convolutional neural networks as the encoder, leveraging a global self-attention mechanism to better extract features for automatic identification. Afterward, the detected cracks are quantified through five morphological indicators: length, mean width, maximum width, area and ratio. These analyses can provide valuable information for engineers to assess pavement condition with efficient productivity.
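Given a binary crack mask from the segmentation step, simplified stand-ins for some of these morphological indicators can be computed as below. This is a hypothetical sketch (the abstract does not give the paper's quantification algorithm), and the bounding-box length approximation only suits roughly straight cracks:

```python
import numpy as np

def crack_indicators(mask, mm_per_px=1.0):
    """Crude morphological indicators from a binary crack mask.

    area       : crack pixels times per-pixel area
    ratio      : crack pixels / image pixels
    length     : approximated by the longer bounding-box side
    mean_width : area / length
    """
    ys, xs = np.nonzero(mask)
    area = mask.sum() * mm_per_px ** 2
    ratio = mask.mean()
    length = (max(ys.max() - ys.min(), xs.max() - xs.min()) + 1) * mm_per_px
    return {"area": area, "ratio": ratio,
            "length": length, "mean_width": area / length}

# Hypothetical 10x10 mask with a 1-px-wide horizontal crack of length 8.
mask = np.zeros((10, 10), dtype=int)
mask[5, 1:9] = 1
print(crack_indicators(mask))
```

A real pipeline would instead derive length from the crack skeleton and widths from per-point distances to the crack boundary.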

Findings

In the training process, TransUnet is fed a crack dataset generated by data augmentation, with a resolution of 224 × 224 pixels. Subsequently, a test set containing 80 new images is used for the crack detection task with the best selected TransUnet (learning rate 0.01, batch size 1), achieving an accuracy of 0.8927, a precision of 0.8813, a recall of 0.8904, an F1-measure and Dice of 0.8813 and a mean intersection over union of 0.8082. Comparisons with several state-of-the-art methods indicate that the developed approach outperforms them with greater efficiency and higher reliability.

Originality/value

The developed approach combines TransUnet with an integrated quantification algorithm for crack detection and quantification. It performs excellently across comparisons and evaluation metrics, and can potentially serve as the basis for an automated, cost-effective pavement condition assessment scheme.

Details

Engineering, Construction and Architectural Management, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0969-9988

Article
Publication date: 16 August 2022

Anil Kumar Gona and Subramoniam M.


Abstract

Purpose

Biometric scans using fingerprints are widely used for security purposes. However, fingerprint scans are not fully reliable for authentication because they can be faked by obtaining a sample of a person's fingerprint. A few spoof detection techniques are available to reduce spoofing of biometric systems. The most commonly used is the binary classification technique, which detects real or fake fingerprints based on the fingerprint samples provided during training. However, this technique fails when presented with samples produced by spoofing techniques not covered in the training samples. This paper aims to improve liveness detection accuracy by fusing electrocardiogram (ECG) and fingerprint data.

Design/methodology/approach

In this paper, to avoid this limitation, an efficient liveness detection algorithm is developed using the fusion of ECG signals captured from the fingertips and fingerprint data in an Internet of Things (IoT) environment. The ECG signal ensures that real fingerprint samples can be distinguished from fake ones.
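One common way to fuse two biometric signals is score-level fusion. The sketch below is a hypothetical illustration of the idea only; the weights, threshold and scores are invented, and the abstract does not state how the authors combine the modalities:

```python
def liveness_decision(fp_match, ecg_liveness, w=0.5, threshold=0.7):
    """Score-level fusion: accept only if the fused score clears a threshold.

    fp_match:     fingerprint matcher score in [0, 1]
    ecg_liveness: ECG-based liveness score in [0, 1]; a fake finger yields
                  no plausible ECG, so this term gates spoof attempts.
    """
    fused = w * fp_match + (1 - w) * ecg_liveness
    return fused >= threshold

print(liveness_decision(0.95, 0.9))  # genuine live finger → True
print(liveness_decision(0.95, 0.1))  # good fake, no ECG   → False
```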

Findings

Single-modality fingerprint methods have some disadvantages, such as noisy data and sensitivity to finger position. To overcome this, ECG and fingerprint data are fused so that the combined data improve detection accuracy.

Originality/value

System security is improved in this approach, and the fingerprint recognition rate is also improved. An IoT-based approach is used in this work to reduce the computation burden of data processing systems.

Details

International Journal of Pervasive Computing and Communications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1742-7371
