Search results

1 – 10 of 126
Article
Publication date: 11 February 2021

Krithiga R. and Ilavarasan E.

Abstract

Purpose

The purpose of this paper is to improve the performance of spammer identification in online social networks. Researchers have previously used hyperparameter tuning to enhance the performance of classifiers. The AdaBoost algorithm belongs to the class of ensemble classifiers and is widely applied to binary classification problems. A single algorithm may not yield accurate results, whereas an ensemble of classifiers built from multiple models has been applied successfully to many classification tasks. The search space for an optimal set of hyperparameter values is vast, so enumerating all possible combinations is not feasible. Hence, a hybrid modified whale optimization algorithm for spam profile detection (MWOA-SPD) model is proposed to find optimal values for these parameters.

Design/methodology/approach

In this work, the hyperparameters of AdaBoost are fine-tuned so that it can be applied to identifying spammers in social networks. The AdaBoost algorithm linearly combines several weak classifiers to produce a stronger one. The proposed MWOA-SPD model hybridizes the whale optimization algorithm and the salp swarm algorithm.
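
As a rough illustration of how AdaBoost linearly combines weak classifiers, the sketch below trains threshold stumps on one-dimensional toy data and sums their weighted votes. It is a minimal textbook-style discrete AdaBoost, not the MWOA-SPD model; the function names `train_adaboost` and `predict` and the toy data are assumptions made for illustration. The number of rounds is one of the hyperparameters a tuning method would search over.

```python
import math


def train_adaboost(xs, ys, rounds):
    """Discrete AdaBoost on 1-D inputs with threshold stumps; labels are +1/-1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # each entry is (alpha, threshold, polarity)
    values = sorted(set(xs))
    # Candidate thresholds: one below the smallest value, then midpoints.
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    for _ in range(rounds):
        best = None
        for thr in thresholds:
            for polarity in (1, -1):
                preds = [polarity if x > thr else -polarity for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, thr, polarity, preds)
        err, thr, polarity, preds = best
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, thr, polarity))
        # Increase the weight of misclassified samples, then renormalise.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted sum of stump votes."""
    score = sum(alpha * (pol if x > thr else -pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = train_adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

No single threshold separates this toy data, so the example shows why the weighted combination matters: each stump alone makes at least two mistakes.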

Findings

The technique is applied to a manually constructed Twitter data set. It is compared with the existing optimization and hyperparameter tuning methods. The results indicate that the proposed method outperforms the existing techniques in terms of accuracy and computational efficiency.

Originality/value

The proposed method reduces the server load by excluding complex features and retaining only the lightweight ones. It aids in identifying spammers at an earlier stage, thereby offering users a propitious environment.

Details

International Journal of Pervasive Computing and Communications, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 1 July 2020

Jiaming Liu, Liuan Wang, Linan Zhang, Zeming Zhang and Sicheng Zhang

Abstract

Purpose

The primary objective of this study was to recognize critical indicators in predicting blood glucose (BG) through data-driven methods and to compare the prediction performance of four tree-based ensemble models, i.e. bagging with tree regressors (bagging-decision tree [Bagging-DT]), AdaBoost with tree regressors (Adaboost-DT), random forest (RF) and gradient boosting decision tree (GBDT).

Design/methodology/approach

This study proposed a majority voting feature selection method by combining lasso regression with the Akaike information criterion (AIC) (LR-AIC), lasso regression with the Bayesian information criterion (BIC) (LR-BIC) and RF to select indicators with excellent predictive performance from an initial 38 indicators in 5,642 samples. The selected features were deployed to build the tree-based ensemble models. The 10-fold cross-validation (CV) method was used to evaluate the performance of each ensemble model.
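
The majority voting idea can be sketched as follows, assuming each selector (for example LR-AIC, LR-BIC and RF) has already produced its own set of chosen features. The function name `majority_vote_selection`, the `min_votes` parameter and the placeholder feature names are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote_selection(selections, min_votes=None):
    """Keep the features chosen by a majority of the selectors."""
    if min_votes is None:
        min_votes = len(selections) // 2 + 1  # strict majority
    # Count each feature at most once per selector.
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, v in votes.items() if v >= min_votes)


# Placeholder outputs of three selectors (illustrative only).
selector_outputs = [
    {"f1", "f2", "f3"},
    {"f2", "f3", "f4"},
    {"f3", "f5"},
]
print(majority_vote_selection(selector_outputs))
```

With three selectors, a feature survives only if at least two of them picked it.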

Findings

The results of feature selection indicated that age, corpuscular hemoglobin concentration (CHC), red blood cell volume distribution width (RBCVDW), red blood cell volume and leucocyte count are the five most important clinical/physical indicators in BG prediction. Furthermore, this study also found that the GBDT ensemble model combined with the proposed majority voting feature selection method is better than the other three models with respect to prediction performance and stability.

Practical implications

This study proposed a novel BG prediction framework for better predictive analytics in health care.

Social implications

This study incorporated medical background and machine learning technology to reduce diabetes morbidity and formulate precise medical schemes.

Originality/value

The majority voting feature selection method combined with the GBDT ensemble model provides an effective decision-making tool for predicting BG and detecting diabetes risk in advance.

Article
Publication date: 17 October 2008

Thiago Turchetti Maia, Antônio Pádua Braga and André F. de Carvalho

Abstract

Purpose

To create new hybrid algorithms that combine boosting and support vector machines to outperform other known algorithms in selected contexts of binary classification problems.

Design/methodology/approach

Support vector machines (SVM) are known in the literature to be one of the most efficient learning models for tackling classification problems. Boosting algorithms rely on other classification algorithms to produce different weak hypotheses which are later combined into a single strong hypothesis. In this work the authors combine boosting with support vector machines, namely the AdaBoost.M1 and sequential minimal optimization (SMO) algorithms, to create new hybrid algorithms that outperform standard SVMs in selected contexts. This is achieved by integration with different degrees of coupling, where the four algorithms proposed range from simple black‐box integration to modifications and mergers between AdaBoost.M1 and SMO components.
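
For reference, the standard AdaBoost.M1 update can be written as below. This is the textbook formulation, not the hybrid algorithms proposed in the paper; the symbols $D_t$ (sample distribution at round $t$), $h_t$ (weak hypothesis, here it would be an SVM trained by SMO), $\varepsilon_t$, $\beta_t$ and $Z_t$ (normalising constant) are notation introduced here for illustration. The procedure requires $\varepsilon_t < 1/2$ at every round.

```latex
\[
\varepsilon_t = \sum_{i:\, h_t(x_i) \neq y_i} D_t(i),
\qquad
\beta_t = \frac{\varepsilon_t}{1 - \varepsilon_t},
\]
\[
D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times
\begin{cases}
\beta_t & \text{if } h_t(x_i) = y_i, \\
1 & \text{otherwise},
\end{cases}
\]
\[
H(x) = \arg\max_{y} \sum_{t:\, h_t(x) = y} \log \frac{1}{\beta_t}.
\]
```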

Findings

The results show that the proposed algorithms exhibited better performance for most of the problems tested. It is possible to identify trends of behavior bound to specific properties of the problems solved, so one may apply the proposed algorithms in situations where they are known to succeed.

Research limitations/implications

New strategies for combining boosting and SVMs may be further developed using the principles introduced in this paper, possibly resulting in other algorithms with yet superior performance.

Practical implications

The hybrid algorithms proposed in this paper may be used in classification problems with properties that they are known to handle well, thus possibly offering better results than other known algorithms in the literature.

Originality/value

This paper introduces the concept of merging boosting and SVM training algorithms to obtain hybrid solutions with better performance than standard SVMs.

Details

Kybernetes, vol. 37 no. 9/10
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 18 November 2019

Qinghua Liu, Lu Sun, Alain Kornhauser, Jiahui Sun and Nick Sangwa

Abstract

Purpose

To classify different pavements, this paper presents a road roughness acquisition system design and an improved restricted Boltzmann machine deep neural network algorithm based on the Adaboost Backward Propagation algorithm for road roughness detection. The developed measurement system, including the hardware design and the software algorithm, constitutes an independent system that is low-cost, small and convenient to install.

Design/methodology/approach

The inputs of the restricted Boltzmann machine deep neural network are the vehicle vertical acceleration power spectrum and the pitch acceleration power spectrum, which are calculated using ADAMS finite element software. The Adaboost Backward Propagation algorithm is used in each restricted Boltzmann machine deep neural network classification model for fine-tuning, given its global search capability. The algorithm is first applied to road spectrum detection, and experiments indicate that it is suitable for detecting pavement roughness.

Findings

The detection rate of the RBM deep neural network algorithm based on Adaboost Backward Propagation is up to 96 per cent, and the false positive rate is below 3.34 per cent. Both indices are better than those of the other supervised algorithms; the proposed algorithm also extracts the intrinsic characteristics of the data more effectively and therefore improves classification accuracy and classification quality. Additionally, the classification performance is optimized. The experimental results show that the algorithm can improve the performance of restricted Boltzmann machine deep neural networks. The system can be used for detecting pavement roughness.

Originality/value

This paper presents an improved restricted Boltzmann machine deep neural network algorithm based on Adaboost Backward Propagation for identifying road roughness. The restricted Boltzmann machine performs pre-training and initializes the sample weights. The entire neural network is then fine-tuned through the Adaboost Backward Propagation algorithm, and the validity of the algorithm is verified on the MNIST data set. A quarter vehicle model is used as the foundation, and the vertical acceleration spectrum of the vehicle center of mass and the pitch acceleration spectrum are obtained by simulation in ADAMS as the input samples. The experimental results show that the improved algorithm has better optimization ability, improves the detection rate and can detect road roughness more effectively.

Book part
Publication date: 11 September 2020

D. K. Malhotra, Kunal Malhotra and Rashmi Malhotra

Abstract

Traditionally, loan officers use different credit scoring models to complement judgmental methods to classify consumer loan applications. This study explores the use of decision trees, AdaBoost, and support vector machines (SVMs) to identify potential bad loans. Our results show that AdaBoost does provide an improvement over simple decision trees as well as SVM models in predicting good credit clients and bad credit clients. To cross-validate our results, we use k-fold classification methodology.
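
Assuming the k-fold methodology mentioned above refers to standard k-fold cross-validation, the splitting step can be sketched as follows. The function name `k_fold_indices` is hypothetical, and the sketch uses contiguous folds without shuffling; in practice the samples would usually be shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Each sample appears in exactly one test fold.
for train_idx, test_idx in k_fold_indices(10, 3):
    print(len(train_idx), test_idx)
```

Each model is trained on the `train` indices and scored on the held-out `test` indices, and the k scores are then averaged.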

Article
Publication date: 13 August 2018

Shrawan Kumar Trivedi and Prabin Kumar Panigrahi

Abstract

Purpose

Email spam classification is now becoming a challenging area in the domain of text classification. Precise and robust classifiers are not only judged by classification accuracy but also by sensitivity (correctly classified legitimate emails) and specificity (correctly classified unsolicited emails) towards the accurate classification, captured by both false positive and false negative rates. This paper aims to present a comparative study between various decision tree classifiers (such as AD tree, decision stump and REP tree) with/without different boosting algorithms (bagging, boosting with re-sample and AdaBoost).
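
The two metrics can be computed as in the sketch below, which follows the convention stated above: legitimate email ("ham") is the positive class, so sensitivity is the share of legitimate emails classified correctly and specificity is the share of spam classified correctly. The function name `sensitivity_specificity` and the toy labels are assumptions for illustration.

```python
def sensitivity_specificity(y_true, y_pred, positive="ham"):
    """Return (sensitivity, specificity) with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Undefined (NaN) when a class has no examples.
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


y_true = ["ham", "ham", "ham", "spam", "spam"]
y_pred = ["ham", "ham", "spam", "spam", "ham"]
print(sensitivity_specificity(y_true, y_pred))
```

Under this convention the false negative rate is 1 minus sensitivity (legitimate mail flagged as spam) and the false positive rate is 1 minus specificity (spam let through).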

Design/methodology/approach

Artificial intelligence and text mining approaches have been incorporated in this study. Each decision tree classifier in this study is tested on informative words/features selected from the two publicly available data sets (SpamAssassin and LingSpam) using a greedy step-wise feature search method.

Findings

Outcomes of this study show that without boosting, the REP tree provides high performance accuracy with the AD tree ranking as the second-best performer. Decision stump is found to be the under-performing classifier of this study. However, with boosting, the combination of REP tree and AdaBoost compares favourably with other classification models. If the metrics false positive rate and performance accuracy are taken together, AD tree and REP tree with AdaBoost were both found to carry out an effective classification task. Greedy stepwise has proven its worth in this study by selecting a subset of valuable features to identify the correct class of emails.

Research limitations/implications

This research is focussed on the classification of those email spams that are written in the English language only. The proposed models work with content (words/features) of email data that is mostly found in the body of the mail. Image spam has not been included in this study. Other messages such as short message service or multi-media messaging service were not included in this study.

Practical implications

In this research, a boosted decision tree approach has been proposed and used to classify email spam and ham files; this is found to be a highly effective approach in comparison with other state-of-the-art models used in other studies. This classifier may be tested for different applications and may provide new insights for developers and researchers.

Originality/value

A comparison of decision tree classifiers with/without ensemble has been presented for spam classification.

Details

Journal of Systems and Information Technology, vol. 20 no. 3
Type: Research Article
ISSN: 1328-7265

Article
Publication date: 16 January 2007

Pei Jia, Huosheng H. Hu, Tao Lu and Kui Yuan

Abstract

Purpose

This paper presents a novel hands‐free control system for intelligent wheelchairs (IWs) based on visual recognition of head gestures.

Design/methodology/approach

A robust head gesture‐based interface (HGI) is designed for head gesture recognition of the RoboChair user. The recognised gestures are used to generate motion control commands for the low‐level DSP motion controller so that it can control the motion of the RoboChair according to the user's intention. The Adaboost face detection algorithm and the Camshift object tracking algorithm are combined in our system to achieve accurate face detection, tracking and gesture recognition in real time. It is intended to be used as a human‐friendly interface for elderly and disabled people to operate our intelligent wheelchair using their head gestures rather than their hands.

Findings

This is an extremely useful system for users who have restricted limb movements caused by conditions such as Parkinson's disease and quadriplegia.

Practical implications

In this paper, a novel integrated approach to real‐time face detection, tracking and gesture recognition is proposed, namely HGI.

Originality/value

It is a useful human‐robot interface for IWs.

Details

Industrial Robot: An International Journal, vol. 34 no. 1
Type: Research Article
ISSN: 0143-991X

Content available
Article
Publication date: 1 October 2018

Min Wang, Shuguang Li, Lei Zhu and Jin Yao

Abstract

Purpose

Analysis of characteristic driving operations can help develop support for drivers with different driving skills. However, existing work on driving skill analysis focuses only on single driving operations and cannot reflect differences in proficiency in coordinating driving operations. Thus, the purpose of this paper is to analyze driving skills from coordinated driving operations. There are two main contributions: the first is a method for feature extraction based on AdaBoost, which selects features critical to the coordinated operations of experienced and inexperienced drivers, and the second is a method for generating candidate features, called the combined features method, through which two or more different driving operations at the same location are combined into a candidate combined feature. A series of experiments based on a driving simulator and a specific course with several different curves was carried out, and the results indicated the feasibility of analyzing driving behavior through AdaBoost and the combined features method.

Design/methodology/approach

AdaBoost was used to extract features and the combined features method was used to combine two or more different driving operations at the same location.

Findings

A series of experiments based on a driving simulator and a specific course with several different curves was carried out, and the results indicated the feasibility of analyzing driving behavior through AdaBoost and the combined features method.

Originality/value

There are two main contributions: the first involves a method for feature extraction based on AdaBoost, which selects features critical for coordinating operations of experienced drivers and inexperienced drivers, and the second involves a generating method for candidate features, called the combined features method, through which two or more different driving operations at the same location are combined into a candidate combined feature.

Details

Journal of Intelligent and Connected Vehicles, vol. 1 no. 3
Type: Research Article
ISSN: 2399-9802

Article
Publication date: 14 November 2016

Shrawan Kumar Trivedi and Shubhamoy Dey

Abstract

Purpose

The email is an important medium for sharing information rapidly. However, spam, being a nuisance in such communication, motivates the building of a robust filtering system with high classification accuracy and good sensitivity towards false positives. In that context, this paper aims to present a combined classifier technique using a committee selection mechanism where the main objective is to identify a set of classifiers so that their individual decisions can be combined by a committee selection procedure for accurate detection of spam.

Design/methodology/approach

For training and testing of the relevant machine learning classifiers, text mining approaches are used in this research. Three data sets (Enron, SpamAssassin and LingSpam) have been used to test the classifiers. Initially, pre-processing is performed to extract the features associated with the email files. In the next step, the extracted features are taken through a dimensionality reduction method where non-informative features are removed. Subsequently, an informative feature subset is selected using genetic feature search. Thereafter, the proposed classifiers are tested on those informative features and the results compared with those of other classifiers.

Findings

For building the proposed combined classifier, three different studies have been performed. The first study identifies the effect of boosting algorithms on two probabilistic classifiers: Bayesian and Naïve Bayes. In that study, AdaBoost has been found to be the best algorithm for performance boosting. The second study was on the effect of different kernel functions on the support vector machine (SVM) classifier, where SVM with the normalized polynomial (NP) kernel was observed to be the best. The last study was on combining classifiers with committee selection, where the committee members were the best classifiers identified by the first study, i.e. Bayesian and Naïve Bayes with AdaBoost, and the committee president was selected from the second study, i.e. SVM with the NP kernel. Results show that combining the identified classifiers to form a committee machine gives excellent performance accuracy with a low false positive rate.

Research limitations/implications

This research is focused on the classification of email spam written in the English language. Only the body (text) parts of the emails have been used. Image spam has not been included in this work. We have restricted our work to email messages only. No other types of messages, such as short message service or multi-media messaging service, were part of this study.

Practical implications

This research proposes a method of dealing with the issues and challenges faced by internet service providers and organizations that use email. The proposed model provides not only better classification accuracy but also a low false positive rate.

Originality/value

The proposed combined classifier is a novel classifier designed for accurate classification of email spam.

Details

VINE Journal of Information and Knowledge Management Systems, vol. 46 no. 4
Type: Research Article
ISSN: 2059-5891

Article
Publication date: 5 March 2018

Lei La, Shuyan Cao and Liangjuan Qin

Abstract

Purpose

As a foundational issue of social mining, sentiment classification has suffered from a lack of labeled data. To enhance classification accuracy with few labeled data, many semi-supervised algorithms have been proposed. These algorithms improve classification performance when the labeled data are insufficient. However, precision and efficiency are difficult to ensure at the same time in many semi-supervised methods. This paper aims to present a novel method for using unlabeled data in a more accurate and more efficient way.

Design/methodology/approach

First, the authors designed a boosting-based method for unlabeled data selection. The improved boosting-based method can choose unlabeled data that have the same distribution as the labeled data. The authors then proposed a novel strategy that combines weak classifiers into strong classifiers in a more rational way. Finally, a semi-supervised sentiment classification algorithm is given.

Findings

Experimental results demonstrate that the novel algorithm can achieve high accuracy with low time consumption. It is helpful for building high-performance social network-related applications.

Research limitations/implications

The novel method needs a small labeled data set for semi-supervised learning. In future work, the authors may extend it to an unsupervised method.

Practical implications

The method can be used in text mining, image classification, audio processing and so on, and also in fields related to unstructured data mining. It overcomes the problem of insufficient labeled data and achieves high precision using less computational time.

Social implications

Sentiment mining has wide applications in public opinion management, public security, market analysis, social network and related fields. Sentiment classification is the basis of sentiment mining.

Originality/value

To the authors' knowledge, this is the first time transfer learning has been introduced to AdaBoost for semi-supervised learning. Moreover, the improved AdaBoost uses a totally new mechanism for weighting.

Details

Kybernetes, vol. 47 no. 3
Type: Research Article
ISSN: 0368-492X
