Search results
1 – 10 of 54Huaxiang Song, Chai Wei and Zhou Yong
The paper aims to tackle the classification of Remote Sensing Images (RSIs), which presents a significant challenge for computer algorithms due to the inherent characteristics of…
Abstract
Purpose
The paper aims to tackle the classification of Remote Sensing Images (RSIs), which presents a significant challenge for computer algorithms due to the inherent characteristics of clustered ground objects and noisy backgrounds. Recent research typically leverages larger volume models to achieve advanced performance. However, the operating environments of remote sensing commonly cannot provide unconstrained computational and storage resources. It requires lightweight algorithms with exceptional generalization capabilities.
Design/methodology/approach
This study introduces an efficient knowledge distillation (KD) method to build a lightweight yet precise convolutional neural network (CNN) classifier. This method also aims to substantially decrease the training time expenses commonly linked with traditional KD techniques. This approach entails extensive alterations to both the model training framework and the distillation process, each tailored to the unique characteristics of RSIs. In particular, this study establishes a robust ensemble teacher by independently training two CNN models using a customized, efficient training algorithm. Following this, this study modifies a KD loss function to mitigate the suppression of non-target category predictions, which are essential for capturing the inter- and intra-similarity of RSIs.
Findings
This study validated the student model, termed KD-enhanced network (KDE-Net), obtained through the KD process on three benchmark RSI data sets. The KDE-Net surpasses 42 other state-of-the-art methods in the literature published from 2020 to 2023. Compared to the top-ranked method’s performance on the challenging NWPU45 data set, KDE-Net demonstrated a noticeable 0.4% increase in overall accuracy with a significant 88% reduction in parameters. Meanwhile, this study’s reformed KD framework significantly enhances the knowledge transfer speed by at least three times.
Originality/value
This study illustrates that the logit-based KD technique can effectively develop lightweight CNN classifiers for RSI classification without substantial sacrifices in computation and storage costs. Compared to neural architecture search or other methods aiming to provide lightweight solutions, this study’s KDE-Net, based on the inherent characteristics of RSIs, is currently more efficient in constructing accurate yet lightweight classifiers for RSI classification.
Details
Keywords
Juan Yang, Zhenkun Li and Xu Du
Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their…
Abstract
Purpose
Although numerous signal modalities are available for emotion recognition, audio and visual modalities are the most common and predominant forms for human beings to express their emotional states in daily communication. Therefore, how to achieve automatic and accurate audiovisual emotion recognition is significantly important for developing engaging and empathetic human–computer interaction environment. However, two major challenges exist in the field of audiovisual emotion recognition: (1) how to effectively capture representations of each single modality and eliminate redundant features and (2) how to efficiently integrate information from these two modalities to generate discriminative representations.
Design/methodology/approach
A novel key-frame extraction-based attention fusion network (KE-AFN) is proposed for audiovisual emotion recognition. KE-AFN attempts to integrate key-frame extraction with multimodal interaction and fusion to enhance audiovisual representations and reduce redundant computation, filling the research gaps of existing approaches. Specifically, the local maximum–based content analysis is designed to extract key-frames from videos for the purpose of eliminating data redundancy. Two modules, including “Multi-head Attention-based Intra-modality Interaction Module” and “Multi-head Attention-based Cross-modality Interaction Module”, are proposed to mine and capture intra- and cross-modality interactions for further reducing data redundancy and producing more powerful multimodal representations.
Findings
Extensive experiments on two benchmark datasets (i.e. RAVDESS and CMU-MOSEI) demonstrate the effectiveness and rationality of KE-AFN. Specifically, (1) KE-AFN is superior to state-of-the-art baselines for audiovisual emotion recognition. (2) Exploring the supplementary and complementary information of different modalities can provide more emotional clues for better emotion recognition. (3) The proposed key-frame extraction strategy can enhance the performance by more than 2.79 per cent on accuracy. (4) Both exploring intra- and cross-modality interactions and employing attention-based audiovisual fusion can lead to better prediction performance.
Originality/value
The proposed KE-AFN can support the development of engaging and empathetic human–computer interaction environment.
Details
Keywords
Faisal Mehraj Wani, Jayaprakash Vemuri and Rajaram Chenna
Near-fault pulse-like ground motions have distinct and very severe effects on reinforced concrete (RC) structures. However, there is a paucity of recorded data from Near-Fault…
Abstract
Purpose
Near-fault pulse-like ground motions have distinct and very severe effects on reinforced concrete (RC) structures. However, there is a paucity of recorded data from Near-Fault Ground Motions (NFGMs), and thus forecasting the dynamic seismic response of structures, using conventional techniques, under such intense ground motions has remained a challenge.
Design/methodology/approach
The present study utilizes a 2D finite element model of an RC structure subjected to near-fault pulse-like ground motions with a focus on the storey drift ratio (SDR) as the key demand parameter. Five machine learning classifiers (MLCs), namely decision tree, k-nearest neighbor, random forest, support vector machine and Naïve Bayes classifier , were evaluated to classify the damage states of the RC structure.
Findings
The results such as confusion matrix, accuracy and mean square error indicate that the Naïve Bayes classifier model outperforms other MLCs with 80.0% accuracy. Furthermore, three MLC models with accuracy greater than 75% were trained using a voting classifier to enhance the performance score of the models. Finally, a sensitivity analysis was performed to evaluate the model's resilience and dependability.
Originality/value
The objective of the current study is to predict the nonlinear storey drift demand for low-rise RC structures using machine learning techniques, instead of labor-intensive nonlinear dynamic analysis.
Details
Keywords
Prajakta Thakare and Ravi Sankar V.
Agriculture is the backbone of a country, contributing more than half of the sector of economy throughout the world. The need for precision agriculture is essential in evaluating…
Abstract
Purpose
Agriculture is the backbone of a country, contributing more than half of the sector of economy throughout the world. The need for precision agriculture is essential in evaluating the conditions of the crops with the aim of determining the proper selection of pesticides. The conventional method of pest detection fails to be stable and provides limited accuracy in the prediction. This paper aims to propose an automatic pest detection module for the accurate detection of pests using the hybrid optimization controlled deep learning model.
Design/methodology/approach
The paper proposes an advanced pest detection strategy based on deep learning strategy through wireless sensor network (WSN) in the agricultural fields. Initially, the WSN consisting of number of nodes and a sink are clustered as number of clusters. Each cluster comprises a cluster head (CH) and a number of nodes, where the CH involves in the transfer of data to the sink node of the WSN and the CH is selected using the fractional ant bee colony optimization (FABC) algorithm. The routing process is executed using the protruder optimization algorithm that helps in the transfer of image data to the sink node through the optimal CH. The sink node acts as the data aggregator and the collection of image data thus obtained acts as the input database to be processed to find the type of pest in the agricultural field. The image data is pre-processed to remove the artifacts present in the image and the pre-processed image is then subjected to feature extraction process, through which the significant local directional pattern, local binary pattern, local optimal-oriented pattern (LOOP) and local ternary pattern (LTP) features are extracted. The extracted features are then fed to the deep-convolutional neural network (CNN) in such a way to detect the type of pests in the agricultural field. The weights of the deep-CNN are tuned optimally using the proposed MFGHO optimization algorithm that is developed with the combined characteristics of navigating search agents and the swarming search agents.
Findings
The analysis using insect identification from habitus image Database based on the performance metrics, such as accuracy, specificity and sensitivity, reveals the effectiveness of the proposed MFGHO-based deep-CNN in detecting the pests in crops. The analysis proves that the proposed classifier using the FABC+protruder optimization-based data aggregation strategy obtains an accuracy of 94.3482%, sensitivity of 93.3247% and the specificity of 94.5263%, which is high as compared to the existing methods.
Originality/value
The proposed MFGHO optimization-based deep-CNN is used for the detection of pest in the crop fields to ensure the better selection of proper cost-effective pesticides for the crop fields in such a way to increase the production. The proposed MFGHO algorithm is developed with the integrated characteristic features of navigating search agents and the swarming search agents in such a way to facilitate the optimal tuning of the hyperparameters in the deep-CNN classifier for the detection of pests in the crop fields.
Details
Keywords
S. Thavasi and T. Revathi
With so many placement opportunities around the students in their final or prefinal year, they start to feel the strain of the season. The students feel the need to be aware of…
Abstract
Purpose
With so many placement opportunities around the students in their final or prefinal year, they start to feel the strain of the season. The students feel the need to be aware of their position and how to increase their chances of being hired. Hence, a system to guide their career is one of the needs of the day.
Design/methodology/approach
The job role prediction system utilizes machine learning techniques such as Naïve Bayes, K-Nearest Neighbor, Support Vector machines (SVM) and Artificial Neural Networks (ANN) to suggest a student’s job role based on their academic performance and course outcomes (CO), out of which ANN performs better. The system uses the Mepco Schlenk Engineering College curriculum, placement and students’ Assessment data sets, in which the CO and syllabus are used to determine the skills that the student has gained from their courses. The necessary skills for a job position are then extracted from the job advertisements. The system compares the student’s skills with the required skills for the job role based on the placement prediction result.
Findings
The system predicts placement possibilities with an accuracy of 93.33 and 98% precision. Also, the skill analysis for students gives the students information about their skill-set strengths and weaknesses.
Research limitations/implications
For skill-set analysis, only the direct assessment of the students is considered. Indirect assessment shall also be considered for future scope.
Practical implications
The model is adaptable and flexible (customizable) to any type of academic institute or universities.
Social implications
The research will be very much useful for the students community to bridge the gap between the academic and industrial needs.
Originality/value
Several works are done for career guidance for the students. However, these career guidance methodologies are designed only using the curriculum and students’ basic personal information. The proposed system will consider the students’ academic performance through direct assessment, along with their curriculum and basic personal information.
Details
Keywords
Feng Qian, Yongsheng Tu, Chenyu Hou and Bin Cao
Automatic modulation recognition (AMR) is a challenging problem in intelligent communication systems and has wide application prospects. At present, although many AMR methods…
Abstract
Purpose
Automatic modulation recognition (AMR) is a challenging problem in intelligent communication systems and has wide application prospects. At present, although many AMR methods based on deep learning have been proposed, the methods proposed by these works cannot be directly applied to the actual wireless communication scenario, because there are usually two kinds of dilemmas when recognizing the real modulated signal, namely, long sequence and noise. This paper aims to effectively process in-phase quadrature (IQ) sequences of very long signals interfered by noise.
Design/methodology/approach
This paper proposes a general model for a modulation classifier based on a two-layer nested structure of long short-term memory (LSTM) networks, called a two-layer nested structure (TLN)-LSTM, which exploits the time sensitivity of LSTM and the ability of the nested network structure to extract more features, and can achieve effective processing of ultra-long signal IQ sequences collected from real wireless communication scenarios that are interfered by noise.
Findings
Experimental results show that our proposed model has higher recognition accuracy for five types of modulation signals, including amplitude modulation, frequency modulation, gaussian minimum shift keying, quadrature phase shift keying and differential quadrature phase shift keying, collected from real wireless communication scenarios. The overall classification accuracy of the proposed model for these signals can reach 73.11%, compared with 40.84% for the baseline model. Moreover, this model can also achieve high classification performance for analog signals with the same modulation method in the public data set HKDD_AMC36.
Originality/value
At present, although many AMR methods based on deep learning have been proposed, these works are based on the model’s classification results of various modulated signals in the AMR public data set to evaluate the signal recognition performance of the proposed method rather than collecting real modulated signals for identification in actual wireless communication scenarios. The methods proposed in these works cannot be directly applied to actual wireless communication scenarios. Therefore, this paper proposes a new AMR method, dedicated to the effective processing of the collected ultra-long signal IQ sequences that are interfered by noise.
Details
Keywords
Anna Korotysheva and Sergey Zhukov
This study aims to comprehensively address the challenge of delineating traffic scenarios in video footage captured by an embedded camera within an autonomous vehicle.
Abstract
Purpose
This study aims to comprehensively address the challenge of delineating traffic scenarios in video footage captured by an embedded camera within an autonomous vehicle.
Design/methodology/approach
This methodology involves systematically elucidating the traffic context by leveraging data from the object recognition subsystem embedded in vehicular road infrastructure. A knowledge base containing production rules and logical inference mechanism was developed. These components enable real-time procedures for describing traffic situations.
Findings
The production rule system focuses on semantically modeling entities that are categorized as traffic lights and road signs. The effectiveness of the methodology was tested experimentally using diverse image datasets representing various meteorological conditions. A thorough analysis of the results was conducted, which opens avenues for future research.
Originality/value
Originality lies in the potential integration of the developed methodology into an autonomous vehicle’s control system, working alongside other procedures that analyze the current situation. These applications extend to driver assistance systems, harmonized with augmented reality technology, and enhance human decision-making processes.
Details
Keywords
Daniel Šandor and Marina Bagić Babac
Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning…
Abstract
Purpose
Sarcasm is a linguistic expression that usually carries the opposite meaning of what is being said by words, thus making it difficult for machines to discover the actual meaning. It is mainly distinguished by the inflection with which it is spoken, with an undercurrent of irony, and is largely dependent on context, which makes it a difficult task for computational analysis. Moreover, sarcasm expresses negative sentiments using positive words, allowing it to easily confuse sentiment analysis models. This paper aims to demonstrate the task of sarcasm detection using the approach of machine and deep learning.
Design/methodology/approach
For the purpose of sarcasm detection, machine and deep learning models were used on a data set consisting of 1.3 million social media comments, including both sarcastic and non-sarcastic comments. The data set was pre-processed using natural language processing methods, and additional features were extracted and analysed. Several machine learning models, including logistic regression, ridge regression, linear support vector and support vector machines, along with two deep learning models based on bidirectional long short-term memory and one bidirectional encoder representations from transformers (BERT)-based model, were implemented, evaluated and compared.
Findings
The performance of machine and deep learning models was compared in the task of sarcasm detection, and possible ways of improvement were discussed. Deep learning models showed more promise, performance-wise, for this type of task. Specifically, a state-of-the-art model in natural language processing, namely, BERT-based model, outperformed other machine and deep learning models.
Originality/value
This study compared the performance of the various machine and deep learning models in the task of sarcasm detection using the data set of 1.3 million comments from social media.
Details
Keywords
Annarita Colamatteo, Marcello Sansone and Giuliano Iorio
This paper aims to examine the impact of the COVID-19 pandemic on the private label food products, specifically assessing the stability and changes in factors influencing…
Abstract
Purpose
This paper aims to examine the impact of the COVID-19 pandemic on the private label food products, specifically assessing the stability and changes in factors influencing purchasing decisions, and comparing pre-pandemic and post-pandemic datasets.
Design/methodology/approach
The study employs the Extra Tree Classifier method, a robust quantitative approach, to analyse data collected from questionnaires distributed among two distinct consumer samples. This methodological choice is explicitly adopted to provide a clear classification of factors influencing consumer preferences for private label products, surpassing conventional qualitative methods.
Findings
Despite the profound disruptions caused by the COVID-19 pandemic, this research underscores the persistent hierarchy of factors shaping consumer choices in the private label food market, showing an overall stability in consumer behaviour. At the same time, the analysis of individual variables highlights the positive increase in those related to product quality, health, taste, and communication.
Research limitations/implications
The use of online surveys for data collection may introduce a self-selection bias, and the non-probabilistic sampling method could limit the generalizability of the results.
Practical implications
Practical implications suggest that managers in the private label industry should prioritize enhancing quality control, ensuring effective communication, and dynamically adapting strategies to meet evolving consumer preferences, with a particular emphasis on quality and health attributes.
Originality/value
This study contributes to the existing body of literature by providing insights into the profound transformations induced by the COVID-19 pandemic on consumer behaviour, specifically in relation to their preferences for private label food products.
Details
Keywords
Jahanzaib Alvi and Imtiaz Arif
The crux of this paper is to unveil efficient features and practical tools that can predict credit default.
Abstract
Purpose
The crux of this paper is to unveil efficient features and practical tools that can predict credit default.
Design/methodology/approach
Annual data of non-financial listed companies were taken from 2000 to 2020, along with 71 financial ratios. The dataset was bifurcated into three panels with three default assumptions. Logistic regression (LR) and k-nearest neighbor (KNN) binary classification algorithms were used to estimate credit default in this research.
Findings
The study’s findings revealed that features used in Model 3 (Case 3) were the efficient and best features comparatively. Results also showcased that KNN exposed higher accuracy than LR, which proves the supremacy of KNN on LR.
Research limitations/implications
Using only two classifiers limits this research for a comprehensive comparison of results; this research was based on only financial data, which exhibits a sizeable room for including non-financial parameters in default estimation. Both limitations may be a direction for future research in this domain.
Originality/value
This study introduces efficient features and tools for credit default prediction using financial data, demonstrating KNN’s superior accuracy over LR and suggesting future research directions.
Details