Search results
Deepak Suresh Asudani, Naresh Kumar Nagwani and Pradeep Singh
Abstract
Purpose
Classifying emails as ham or spam based on their content is essential. Determining the semantic and syntactic meaning of words and putting them in a high-dimensional feature vector form for processing is the most difficult challenge in email categorization. The purpose of this paper is to examine the effectiveness of the pre-trained embedding model for the classification of emails using deep learning classifiers such as the long short-term memory (LSTM) model and convolutional neural network (CNN) model.
Design/methodology/approach
In this paper, global vectors (GloVe) and Bidirectional Encoder Representations from Transformers (BERT) pre-trained word embeddings are used to identify relationships between words, which helps to classify emails into their relevant categories using machine learning and deep learning models. Two benchmark datasets, SpamAssassin and Enron, are used in the experimentation.
Findings
In the first set of experiments, which compares machine learning classifiers, the support vector machine (SVM) model performs better than the other machine learning methods. The second set of experiments compares deep learning model performance without embedding, with GloVe embedding and with BERT embedding. The experiments show that GloVe embedding can be helpful for faster execution with better performance on large-sized datasets.
Originality/value
The experiment reveals that the CNN model with GloVe embedding gives slightly better accuracy than the model with BERT embedding and traditional machine learning algorithms when classifying an email as ham or spam. It is concluded that word embedding models improve the accuracy of email classifiers.
Faris Elghaish, Saeed Talebi, Essam Abdellatef, Sandra T. Matarneh, M. Reza Hosseini, Song Wu, Mohammad Mayouf, Aso Hajirasouli and The-Quan Nguyen
Abstract
Purpose
This paper aims to test the capabilities/accuracies of four deep learning pre-trained convolutional neural network (CNN) models to detect and classify types of highway cracks, as well as to develop a new CNN model to maximize the accuracy at different learning rates.
Design/methodology/approach
A sample of 4,663 images of highway cracks was collected and classified into three categories of cracks, namely, “vertical cracks,” “horizontal and vertical cracks” and “diagonal cracks.” “Matlab” was then used to split the sample into training (70%) and testing (30%) sets, apply the four deep learning CNN models and compute their accuracies. After that, a new deep learning CNN model was developed to maximize the accuracy of detecting and classifying highway cracks, and its accuracy was tested using three optimization algorithms at different learning rates.
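The 70%/30% split described above can be illustrated with a minimal Python sketch. This is not the authors' Matlab code: the function name `split_train_test`, the fixed seed and the placeholder image identifiers are all assumptions made for illustration.

```python
import random


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical identifiers standing in for the 4,663 crack images.
image_ids = [f"crack_{i:04d}" for i in range(4663)]
train_ids, test_ids = split_train_test(image_ids)
print(len(train_ids), len(test_ids))  # 3264 1399
```

A fixed seed keeps the split reproducible, so that accuracies computed for different models refer to the same test images.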
Findings
The accuracies of the four deep learning pre-trained models are above the averages between top-1 and top-5, and the accuracy of classifying and detecting the samples exceeded the top-5 accuracy by around 3% for the pre-trained AlexNet model and by 0.2% for the GoogleNet model. The most accurate pre-trained model here is the GoogleNet model, whose accuracy of 89.08% is higher than that of AlexNet by 1.26%. The computed accuracy for the newly created deep learning CNN model exceeded all pre-trained models, reaching 97.62% at a learning rate of 0.001 using the Adam optimization algorithm.
Practical implications
The created deep learning CNN model will enable users (e.g. highway agencies) to scan a long highway and detect types of cracks accurately in a very short time compared to traditional approaches.
Originality/value
A new deep learning CNN-based highway crack detection model was developed based on testing four pre-trained CNN models and analyzing the capabilities of each model to maximize the accuracy of the proposed CNN.
Faris Elghaish, Sandra Matarneh, Essam Abdellatef, Farzad Rahimian, M. Reza Hosseini and Ahmed Farouk Kineber
Abstract
Purpose
Cracks are prevalent signs of pavement distress found on highways globally. The use of artificial intelligence (AI) and deep learning (DL) for crack detection is increasingly considered as an optimal solution. Consequently, this paper introduces a novel, fully connected, optimised convolutional neural network (CNN) model using feature selection algorithms for the purpose of detecting cracks in highway pavements.
Design/methodology/approach
To enhance the accuracy of the CNN model for crack detection, the authors employed a fully connected deep learning layers CNN model along with several optimisation techniques. Specifically, three optimisation algorithms, namely adaptive moment estimation (ADAM), stochastic gradient descent with momentum (SGDM), and RMSProp, were utilised to fine-tune the CNN model and enhance its overall performance. Subsequently, the authors implemented eight feature selection algorithms to further improve the accuracy of the optimised CNN model. These feature selection techniques were thoughtfully selected and systematically applied to identify the most relevant features contributing to crack detection in the given dataset. Finally, the authors subjected the proposed model to testing against seven pre-trained models.
Findings
The study's results show that the accuracy of the three optimisers (ADAM, SGDM, and RMSProp) with the five deep learning layers model is 97.4%, 98.2%, and 96.09%, respectively. Following this, eight feature selection algorithms were applied to the five deep learning layers to enhance accuracy, with particle swarm optimisation (PSO) achieving the highest F-score at 98.72%. The model was then compared with other pre-trained models and exhibited the highest performance.
Practical implications
With an achieved precision of 98.19% and F-score of 98.72% using PSO, the developed model is highly accurate and effective in detecting and evaluating the condition of cracks in pavements. As a result, the model has the potential to significantly reduce the effort required for crack detection and evaluation.
Originality/value
The proposed method for enhancing CNN model accuracy in crack detection stands out for its unique combination of optimisation algorithms (ADAM, SGDM, and RMSProp) with the systematic application of multiple feature selection techniques to identify relevant crack detection features, and for its comparison of results with existing pre-trained models.
Bachriah Fatwa Dhini, Abba Suganda Girsang, Unggul Utan Sufandi and Heny Kurniawati
Abstract
Purpose
The authors constructed an automatic essay scoring (AES) model in a discussion forum where the result was compared with scores given by human evaluators. This research proposes essay scoring, which is conducted through two parameters, semantic and keyword similarities, using a SentenceTransformers pre-trained model that can construct the highest vector embedding. These models are combined to optimize the model and increase its accuracy.
Design/methodology/approach
The development of the model in the study is divided into seven stages: (1) data collection, (2) data pre-processing, (3) selection of a pre-trained SentenceTransformers model, (4) semantic similarity (sentence pair), (5) keyword similarity, (6) final score calculation and (7) model evaluation.
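Stages (4) to (6) can be sketched in a few lines of Python. The abstract does not say how the two similarities are combined, so the equal weighting in `final_score`, the overlap measure in `keyword_similarity` and the toy embedding vectors are all assumptions made for illustration. In practice the vectors would come from a pre-trained SentenceTransformers model.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def keyword_similarity(response_keywords, rubric_keywords):
    """Fraction of rubric keywords that also appear in the response (assumed metric)."""
    rubric = {k.lower() for k in rubric_keywords}
    if not rubric:
        return 0.0
    response = {k.lower() for k in response_keywords}
    return len(rubric & response) / len(rubric)


def final_score(semantic, keyword, semantic_weight=0.5):
    """Weighted combination of the two parameters (the weighting is an assumption)."""
    return semantic_weight * semantic + (1 - semantic_weight) * keyword


# Toy vectors standing in for sentence embeddings of an answer and a reference.
semantic = cosine_similarity([1.0, 0.0], [1.0, 0.0])
keyword = keyword_similarity(["Cell", "nucleus"], ["cell", "membrane"])
print(final_score(semantic, keyword))  # 0.75
```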
Findings
The multilingual paraphrase-multilingual-MiniLM-L12-v2 and distilbert-base-multilingual-cased-v1 models got the highest scores in a comparison of 11 pre-trained multilingual SentenceTransformers models on Indonesian data (Dhini and Girsang, 2023). Both multilingual models were adopted in this study. The combination of the two parameters is obtained by comparing the keywords extracted from the responses with the rubric keywords. Based on the experimental results, the proposed combination can increase the evaluation results by 0.2.
Originality/value
This study uses discussion forum data from the general biology course in online learning at the open university for the 2020.2 and 2021.2 semesters. Forum discussion ratings are still manual. In this study, the authors created a model that automatically calculates the value of discussion forums, which are essays, based on the lecturer's answers and rubrics.
Adela Sobotkova, Ross Deans Kristensen-McLachlan, Orla Mallon and Shawn Adrian Ross
Abstract
Purpose
This paper provides practical advice for archaeologists and heritage specialists wishing to use ML approaches to identify archaeological features in high-resolution satellite imagery (or other remotely sensed data sources). We seek to balance the disproportionately optimistic literature related to the application of ML to archaeological prospection through a discussion of limitations, challenges and other difficulties. We further seek to raise awareness among researchers of the time, effort, expertise and resources necessary to implement ML successfully, so that they can make an informed choice between ML and manual inspection approaches.
Design/methodology/approach
Automated object detection has been the holy grail of archaeological remote sensing for the last two decades. Machine learning (ML) models have proven able to detect uniform features across a consistent background, but more variegated imagery remains a challenge. We set out to detect burial mounds in satellite imagery from a diverse landscape in Central Bulgaria using a pre-trained Convolutional Neural Network (CNN) plus additional but low-touch training to improve performance. Training was accomplished using MOUND/NOT MOUND cutouts, and the model assessed arbitrary tiles of the same size from the image. Results were assessed using field data.
Findings
Validation of results against field data showed that self-reported success rates were misleadingly high, and that the model was misidentifying most features. Setting an identification threshold at 60% probability, and noting that we used an approach where the CNN assessed tiles of a fixed size, tile-based false negative rates were 95–96%, false positive rates were 87–95% of tagged tiles, while true positives were only 5–13%. Counterintuitively, the model provided with training data selected for highly visible mounds (rather than all mounds) performed worse. Development of the model, meanwhile, required approximately 135 person-hours of work.
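To make the tile-based rates concrete, the sketch below tags a tile when the model's probability reaches a threshold and compares the tags with field data. The function name `tile_rates`, the choice of denominators and the six example tiles are assumptions made for illustration, and the paper may define the rates differently.

```python
def tile_rates(probabilities, has_mound, threshold=0.6):
    """Compare thresholded tile predictions with field data.

    probabilities: model probability of MOUND for each tile.
    has_mound: True where field data confirm a mound in the tile.
    """
    tagged = [p >= threshold for p in probabilities]
    tp = sum(1 for t, m in zip(tagged, has_mound) if t and m)
    fp = sum(1 for t, m in zip(tagged, has_mound) if t and not m)
    fn = sum(1 for t, m in zip(tagged, has_mound) if not t and m)
    n_tagged = tp + fp  # tiles the model tagged as MOUND
    n_mound = tp + fn   # tiles that contain a mound according to field data
    return {
        "true_positive_share_of_tagged": tp / n_tagged if n_tagged else 0.0,
        "false_positive_share_of_tagged": fp / n_tagged if n_tagged else 0.0,
        "false_negative_rate": fn / n_mound if n_mound else 0.0,
    }


# Six hypothetical tiles: three are tagged, only one of them correctly.
rates = tile_rates(
    probabilities=[0.9, 0.7, 0.65, 0.2, 0.1, 0.3],
    has_mound=[True, False, False, True, True, False],
)
print(rates)
```

In this toy case one of three tagged tiles is a true positive and two of three mound tiles are missed, which shows how a model can report confident scores while still performing poorly against field data.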
Research limitations/implications
Our attempt to deploy a pre-trained CNN demonstrates the limitations of this approach when it is used to detect varied features of different sizes within a heterogeneous landscape that contains confounding natural and modern features, such as roads, forests and field boundaries. The model has detected incidental features rather than the mounds themselves, making external validation with field data an essential part of CNN workflows. Correcting the model would require refining the training data as well as adopting different approaches to model choice and execution, raising the computational requirements beyond the level of most cultural heritage practitioners.
Practical implications
Improving the pre-trained model’s performance would require considerable time and resources, on top of the time already invested. The degree of manual intervention required – particularly around the subsetting and annotation of training data – is so significant that it raises the question of whether it would be more efficient to identify all of the mounds manually, either through brute-force inspection by experts or by crowdsourcing the analysis to trained – or even untrained – volunteers. Researchers and heritage specialists seeking efficient methods for extracting features from remotely sensed data should weigh the costs and benefits of ML versus manual approaches carefully.
Social implications
Our literature review indicates that use of artificial intelligence (AI) and ML approaches to archaeological prospection has grown exponentially in the past decade, approaching adoption levels associated with “crossing the chasm” from innovators and early adopters to the majority of researchers. The literature itself, however, is overwhelmingly positive, reflecting some combination of publication bias and a rhetoric of unconditional success. This paper presents the failure of a good-faith attempt to utilise these approaches as a counterbalance and cautionary tale to potential adopters of the technology. Early-majority adopters may find ML difficult to implement effectively in real-life scenarios.
Originality/value
Unlike many high-profile reports from well-funded projects, our paper represents a serious but modestly resourced attempt to apply an ML approach to archaeological remote sensing, using techniques like transfer learning that are promoted as solutions to time and cost problems associated with, e.g. annotating and manipulating training data. While the majority of articles uncritically promote ML, or only discuss how challenges were overcome, our paper investigates how – despite reasonable self-reported scores – the model failed to locate the target features when compared to field data. We also present time, expertise and resourcing requirements, a rarity in ML-for-archaeology publications.
Abstract
Purpose
Classification of remote sensing images (RSI) is a challenging task in computer vision. Recently, researchers have proposed a variety of creative methods for automatic recognition of RSI, and feature fusion is a research hotspot for its great potential to boost performance. However, RSI has unique imaging conditions and cluttered scenes with complicated backgrounds. Because of this large difference from natural images, previous feature fusion methods have delivered only insignificant performance improvements.
Design/methodology/approach
This work proposed a two-convolutional neural network (CNN) fusion method named main and branch CNN fusion network (MBC-Net) as an improved solution for classifying RSI. In detail, the MBC-Net employs an EfficientNet-B3 as its main CNN stream and an EfficientNet-B0 as a branch, named MC-B3 and BC-B0, respectively. In particular, MBC-Net includes a long-range derivation (LRD) module, which is specially designed to learn the dependence of different features. Meanwhile, MBC-Net also uses some unique ideas to tackle the problems coming from the two-CNN fusion and the inherent nature of RSI.
Findings
Extensive experiments on three RSI sets show that MBC-Net outperforms 38 other state-of-the-art (SOTA) methods published from 2020 to 2023, with a noticeable increase in overall accuracy (OA) values. MBC-Net not only presents a 0.7% increased OA value on the most confusing NWPU set but also has 62% fewer parameters compared to the leading approach that ranks first in the literature.
Originality/value
MBC-Net is a more effective and efficient feature fusion approach compared to other SOTA methods in the literature. Visualizations from gradient-weighted class activation mapping (Grad-CAM) reveal that MBC-Net can learn the long-range dependence of features that a single CNN cannot. The t-distributed stochastic neighbor embedding (t-SNE) results demonstrate that the feature representation of MBC-Net is more effective than that of other methods. In addition, the ablation tests indicate that MBC-Net is effective and efficient for fusing features from two CNNs.
Abstract
Purpose
Diabetic retinopathy (DR) is one of the dangerous complications of diabetes. Its grade level must be tracked to manage its progress and to make appropriate treatment decisions in time. Effective automated methods for the detection of DR and the classification of its severity stage are necessary to reduce the burden on ophthalmologists and diagnostic contradictions among manual readers.
Design/methodology/approach
In this research, a convolutional neural network (CNN) was applied to colored retinal fundus images for the detection of DR and classification of its stages. A CNN can recognize sophisticated features on the retina and provide an automatic diagnosis. The pre-trained VGG-16 CNN model was applied using a transfer learning (TL) approach to utilize the already learned parameters in the detection.
Findings
Experiments set up with different severity groupings achieved promising results. The best-achieved accuracies for 2-class, 3-class, 4-class and 5-class classifications are 86.5%, 80.5%, 63.5% and 73.7%, respectively.
Originality/value
In this research, VGG-16 was used to detect and classify DR stages using the TL approach. Different combinations of classes were used in the classification of DR severity stages to illustrate the ability of the model to differentiate between the classes and verify the effect of these changes on the performance of the model.
Atif Mahmood, Amod Kumar Tiwari and Sanjay Kumar Singh
Abstract
Purpose
To develop and examine an efficient and reliable jujube grading model with reduced computational time, which could be utilized in the food processing and packaging industries to perform quick grading and pricing of jujube as well as of other similar types of fruit.
Design/methodology/approach
The whole process begins with manual analysis and collection of four jujube grades from the jujube tree. Jujube image acquisition was then performed using MVS, followed by image pre-processing and augmentation tasks. Finally, classification models (i.e. the proposed model, and from-scratch and pre-trained VGG16 and AlexNet) were trained and validated over the original and augmented datasets to classify the jujube into maturity grades.
Findings
The highest success rates reported over the original and augmented datasets were 97.53% (i.e. error of 2.47%) and 99.44% (i.e. error of 0.56%) respectively using Adam optimizer and a learning rate of 0.003.
Research limitations/implications
The investigation relies upon a single view of the jujube image and the outer appearance of the jujube. In the future, a multi-view image capturing system could be employed for model training/validation.
Practical implications
Due to the vast functional derivatives of jujube, the identification of maturity grades of jujube is paramount in the fruit industry, functional food production industries and pharmaceutical industry. Therefore, the proposed model which is practically feasible and easy to implement could be utilized in such industries.
Originality/value
This research examines the performance of the proposed CNN models for selected optimizers and learning rates for the grading of jujube maturity into four classes and compares them with the classical models to identify the best model in terms of accuracy, the number of parameters, epochs and computational time. After a thorough investigation of the models, it was found that the proposed model outperforms both classical models in all aspects for both the original and augmented datasets when using the Adam optimizer with a learning rate of 0.003.
Naga Swetha R, Vimal K. Shrivastava and K. Parvathi
Abstract
Purpose
The mortality rate due to skin cancers has been increasing over the past decades. Early detection and treatment of skin cancers can save lives. However, due to visual resemblance of normal skin and lesion and blurred lesion borders, skin cancer diagnosis has become a challenging task even for skilled dermatologists. Hence, the purpose of this study is to present an image-based automatic approach for multiclass skin lesion classification and compare the performance of various models.
Design/methodology/approach
In this paper, the authors have presented a multiclass skin lesion classification approach based on transfer learning of deep convolutional neural networks. The following pre-trained models have been used: VGG16, VGG19, ResNet50, ResNet101, ResNet152, Xception and MobileNet. Their performances on skin cancer classification have been compared.
Findings
The experiments have been performed on the HAM10000 dataset, which contains 10,015 dermoscopic images of seven skin lesion classes. A categorical accuracy of 83.69%, a Top2 accuracy of 91.48% and a Top3 accuracy of 96.19% have been obtained.
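Top2 and Top3 accuracy count a prediction as correct when the true class is among the two or three highest-scoring classes. The sketch below shows one way to compute this. The function name `top_k_accuracy` and the toy scores (four classes rather than the seven in HAM10000) are assumptions made for illustration.

```python
def top_k_accuracy(scores, labels, k):
    """Share of samples whose true label is among the k highest-scoring classes.

    scores: one list of per-class scores per sample.
    labels: index of the true class for each sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Three hypothetical samples scored over four classes.
scores = [
    [0.6, 0.3, 0.1, 0.0],
    [0.2, 0.5, 0.3, 0.0],
    [0.1, 0.2, 0.3, 0.4],
]
labels = [0, 2, 1]
for k in (1, 2, 3):
    print(k, top_k_accuracy(scores, labels, k))
```

With `k=1` this reduces to ordinary categorical accuracy, which is why the Top2 and Top3 figures can never be lower than the categorical accuracy.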
Originality/value
Early detection and treatment of skin cancer can save millions of lives. This work demonstrates that transfer learning can be an effective way to classify skin cancer images, providing adequate performance with less computational complexity.
Jiawei Liu, Zi Xiong, Yi Jiang, Yongqiang Ma, Wei Lu, Yong Huang and Qikai Cheng
Abstract
Purpose
Fine-tuning pre-trained language models (PLMs), e.g. SciBERT, generally requires large amounts of annotated data to achieve state-of-the-art performance on a range of NLP tasks in the scientific domain. However, obtaining fine-tuning data for scientific NLP tasks is still challenging and expensive. In this paper, the authors propose mix prompt tuning (MPT), a semi-supervised method that aims to alleviate the dependence on annotated data and improve the performance of multi-granularity academic function recognition tasks.
Design/methodology/approach
Specifically, the proposed method provides multi-perspective representations by combining manually designed prompt templates with automatically learned continuous prompt templates to help the given academic function recognition task take full advantage of knowledge in PLMs. Based on these prompt templates and the fine-tuned PLM, a large number of pseudo labels are assigned to the unlabelled examples. Finally, the authors further fine-tune the PLM using the pseudo training set. The authors evaluate the method on three academic function recognition tasks of different granularity including the citation function, the abstract sentence function and the keyword function, with data sets from the computer science domain and the biomedical domain.
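The pseudo-labelling step can be illustrated with a short Python sketch. The abstract does not say how predictions from different prompt templates are merged or filtered, so the averaging in `average_probabilities`, the confidence threshold `min_confidence` and the example sentences and labels are all assumptions made for illustration, not the authors' procedure.

```python
def average_probabilities(per_template_probs):
    """Average label probabilities predicted under several prompt templates."""
    n = len(per_template_probs)
    labels = per_template_probs[0].keys()
    return {label: sum(p[label] for p in per_template_probs) / n for label in labels}


def assign_pseudo_labels(unlabelled, min_confidence=0.8):
    """Keep (text, label) pairs whose averaged top probability is high enough.

    unlabelled: list of (text, [probabilities per template]) pairs.
    """
    pseudo = []
    for text, per_template_probs in unlabelled:
        averaged = average_probabilities(per_template_probs)
        label = max(averaged, key=averaged.get)
        if averaged[label] >= min_confidence:
            pseudo.append((text, label))
    return pseudo


# Two hypothetical citation sentences, each scored under two prompt templates.
unlabelled = [
    ("We use the method of [1].",
     [{"use": 0.9, "background": 0.1}, {"use": 0.8, "background": 0.2}]),
    ("Prior work exists [2].",
     [{"use": 0.6, "background": 0.4}, {"use": 0.3, "background": 0.7}]),
]
pseudo_training_set = assign_pseudo_labels(unlabelled)
print(pseudo_training_set)  # [('We use the method of [1].', 'use')]
```

The second sentence is dropped because the templates disagree and no label reaches the threshold. Filtering like this is a common way to limit noise in a pseudo training set.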
Findings
Extensive experiments demonstrate the effectiveness of the method and statistically significant improvements against strong baselines. In particular, it achieves an average increase of 5% in Macro-F1 score compared with fine-tuning, and 6% in Macro-F1 score compared with other semi-supervised methods under low-resource settings.
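Macro-F1, the metric reported above, averages the per-class F1 scores so that rare classes count as much as frequent ones. A minimal sketch follows. The function name `macro_f1` and the example labels are assumptions made for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical sentence-function labels for four abstract sentences.
y_true = ["background", "method", "method", "result"]
y_pred = ["background", "method", "result", "result"]
print(round(macro_f1(y_true, y_pred), 4))  # 0.7778
```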
Originality/value
In addition, MPT is a general method that can be easily applied to other low-resource scientific classification tasks.