Search results

1 – 10 of 402
Article
Publication date: 4 June 2020

Moruf Akin Adebowale, Khin T. Lwin and M. A. Hossain

Phishing attacks have evolved in recent years due to high-tech-enabled economic growth worldwide. The rise in all types of fraud loss in 2019 has been attributed to the increase…

1381

Abstract

Purpose

Phishing attacks have evolved in recent years due to high-tech-enabled economic growth worldwide. The rise in all types of fraud loss in 2019 has been attributed to the increase in deception scams and impersonation, as well as to sophisticated online attacks such as phishing. The global impact of phishing attacks will continue to intensify, and thus, a more efficient phishing detection method is required to protect online user activities. To address this need, this study focussed on the design and development of a deep learning-based phishing detection solution that leveraged the universal resource locator and website content such as images, text and frames.

Design/methodology/approach

Deep learning techniques are efficient for natural language and image classification. In this study, the convolutional neural network (CNN) and the long short-term memory (LSTM) algorithm were used to build a hybrid classification model named the intelligent phishing detection system (IPDS). To build the proposed model, the CNN and LSTM classifier were trained by using 1m universal resource locators and over 10,000 images. Then, the sensitivity of the proposed model was determined by considering various factors such as the type of feature, number of misclassifications and split issues.

Findings

An extensive experimental analysis was conducted to evaluate and compare the effectiveness of the IPDS in detecting phishing web pages and phishing attacks when applied to large data sets. The results showed that the model achieved an accuracy rate of 93.28% and an average detection time of 25 s.

Originality/value

The hybrid approach using deep learning algorithm of both the CNN and LSTM methods was used in this research work. On the one hand, the combination of both CNN and LSTM was used to resolve the problem of a large data set and higher classifier prediction performance. Hence, combining the two methods leads to a better result with less training time for LSTM and CNN architecture, while using the image, frame and text features as a hybrid for our model detection. The hybrid features and IPDS classifier for phishing detection were the novelty of this study to the best of the authors' knowledge.

Details

Journal of Enterprise Information Management, vol. 36 no. 3
Type: Research Article
ISSN: 1741-0398

Keywords

Article
Publication date: 18 October 2018

Kalyan Nagaraj, Biplab Bhattacharjee, Amulyashree Sridhar and Sharvani GS

Phishing is one of the major threats affecting businesses worldwide in current times. Organizations and customers face the hazards arising out of phishing attacks because of…

Abstract

Purpose

Phishing is one of the major threats affecting businesses worldwide in current times. Organizations and customers face the hazards arising out of phishing attacks because of anonymous access to vulnerable details. Such attacks often result in substantial financial losses. Thus, there is a need for effective intrusion detection techniques to identify and possibly nullify the effects of phishing. Classifying phishing and non-phishing web content is a critical task in information security protocols, and full-proof mechanisms have yet to be implemented in practice. The purpose of the current study is to present an ensemble machine learning model for classifying phishing websites.

Design/methodology/approach

A publicly available data set comprising 10,068 instances of phishing and legitimate websites was used to build the classifier model. Feature extraction was performed by deploying a group of methods, and relevant features extracted were used for building the model. A twofold ensemble learner was developed by integrating results from random forest (RF) classifier, fed into a feedforward neural network (NN). Performance of the ensemble classifier was validated using k-fold cross-validation. The twofold ensemble learner was implemented as a user-friendly, interactive decision support system for classifying websites as phishing or legitimate ones.

Findings

Experimental simulations were performed to access and compare the performance of the ensemble classifiers. The statistical tests estimated that RF_NN model gave superior performance with an accuracy of 93.41 per cent and minimal mean squared error of 0.000026.

Research limitations/implications

The research data set used in this study is publically available and easy to analyze. Comparative analysis with other real-time data sets of recent origin must be performed to ensure generalization of the model against various security breaches. Different variants of phishing threats must be detected rather than focusing particularly toward phishing website detection.

Originality/value

The twofold ensemble model is not applied for classification of phishing websites in any previous studies as per the knowledge of authors.

Details

Journal of Systems and Information Technology, vol. 20 no. 3
Type: Research Article
ISSN: 1328-7265

Keywords

Article
Publication date: 6 June 2016

Oluyinka Aderemi Adewumi and Ayobami Andronicus Akinyelu

Phishing is one of the major challenges faced by the world of e-commerce today. Thanks to phishing attacks, billions of dollars has been lost by many companies and individuals…

Abstract

Purpose

Phishing is one of the major challenges faced by the world of e-commerce today. Thanks to phishing attacks, billions of dollars has been lost by many companies and individuals. The global impact of phishing attacks will continue to be on the increase and thus a more efficient phishing detection technique is required. The purpose of this paper is to investigate and report the use of a nature inspired based-machine learning (ML) approach in classification of phishing e-mails.

Design/methodology/approach

ML-based techniques have been shown to be efficient in detecting phishing attacks. In this paper, firefly algorithm (FFA) was integrated with support vector machine (SVM) with the primary aim of developing an improved phishing e-mail classifier (known as FFA_SVM), capable of accurately detecting new phishing patterns as they occur. From a data set consisting of 4,000 phishing and ham e-mails, a set of features, suitable for phishing e-mail detection, was extracted and used to construct the hybrid classifier.

Findings

The FFA_SVM was applied to a data set consisting of up to 4,000 phishing and ham e-mails. Simulation experiments were performed to evaluate and compared the performance of the classifier. The tests yielded a classification accuracy of 99.94 percent, false positive rate of 0.06 percent and false negative rate of 0.04 percent.

Originality/value

The hybrid algorithm has not been earlier apply, as in this work, to the classification and detection of phishing e-mail, to the best of the authors’ knowledge.

Details

Kybernetes, vol. 45 no. 6
Type: Research Article
ISSN: 0368-492X

Keywords

Article
Publication date: 7 November 2016

Mehdi Dadkhah, Shahaboddin Shamshirband and Ainuddin Wahid Abdul Wahab

This paper aims to present a hybrid approach based on classification algorithms that was capable of identifying different types of phishing pages. In this approach, after…

Abstract

Purpose

This paper aims to present a hybrid approach based on classification algorithms that was capable of identifying different types of phishing pages. In this approach, after eliminating features that do not play an important role in identifying phishing attacks and also after adding the technique of searching page title in the search engine, the capability of identifying journal phishing and phishing pages embedded in legal sites was added to the presented approach in this paper.

Design/methodology/approach

The hybrid approach of this paper for identifying phishing web sites is presented. This approach consists of four basic sections. The action of identifying phishing web sites and journal phishing attacks is performed via selecting two classification algorithms separately. To identify phishing attacks embedded in legal web sites also the method of page title searching is used and then the result is returned. To facilitate identifying phishing pages the black list approach is used along with the proposed approach so that the operation of identifying phishing web sites can be performed more accurately, and, finally, by using a decision table, it is judged that the intended web site is phishing or legal.

Findings

In this paper, a hybrid approach based on classification algorithms to identify phishing web sites is presented that has the ability to identify a new type of phishing attack known as journal phishing. The presented approach considers the most used features and adds new features to identify these attacks and to eliminate unused features in the identifying process of these attacks, does not have the problems of previous techniques and can identify journal phishing too.

Originality/value

The major advantage of this technique was considering all of the possible and effective features in identifying phishing attacks and eliminating unused features of previous techniques; also, this technique in comparison with other similar techniques has the ability of identifying journal phishing attacks and phishing pages embedded in legal sites.

Details

The Electronic Library, vol. 34 no. 6
Type: Research Article
ISSN: 0264-0473

Keywords

Article
Publication date: 23 November 2012

Swapan Purkait

Phishing is essentially a social engineering crime on the Web, whose rampant occurrences and technique advancements are posing big challenges for researchers in both academia and…

5978

Abstract

Purpose

Phishing is essentially a social engineering crime on the Web, whose rampant occurrences and technique advancements are posing big challenges for researchers in both academia and the industry. The purpose of this study is to examine the available phishing literatures and phishing countermeasures, to determine how research has evolved and advanced in terms of quantity, content and publication outlets. In addition to that, this paper aims to identify the important trends in phishing and its countermeasures and provides a view of the research gap that is still prevailing in this field of study.

Design/methodology/approach

This paper is a comprehensive literature review prepared after analysing 16 doctoral theses and 358 papers in this field of research. The papers were analyzed based on their research focus, empirical basis on phishing and proposed countermeasures.

Findings

The findings reveal that the current anti‐phishing approaches that have seen significant deployments over the internet can be classified into eight categories. Also, the different approaches proposed so far are all preventive in nature. A Phisher will mainly target the innocent consumers who happen to be the weakest link in the security chain and it was found through various usability studies that neither server‐side security indicators nor client‐side toolbars and warnings are successful in preventing vulnerable users from being deceived.

Originality/value

Educating the internet users about phishing, as well as the implementation and proper application of anti‐phishing measures, are critical steps in protecting the identities of online consumers against phishing attacks. Further research is required to evaluate the effectiveness of the available countermeasures against fresh phishing attacks. Also there is the need to find out the factors which influence internet user's ability to correctly identify phishing websites.

Details

Information Management & Computer Security, vol. 20 no. 5
Type: Research Article
ISSN: 0968-5227

Keywords

Article
Publication date: 12 November 2018

Gunikhan Sonowal and KS Kuppusamy

This paper aims to propose a model entitled MMSPhiD (multidimensional similarity metrics model for screen reader user to phishing detection) that amalgamates multiple approaches…

Abstract

Purpose

This paper aims to propose a model entitled MMSPhiD (multidimensional similarity metrics model for screen reader user to phishing detection) that amalgamates multiple approaches to detect phishing URLs.

Design/methodology/approach

The model consists of three major components: machine learning-based approach, typosquatting-based approach and phoneme-based approach. The major objectives of the proposed model are detecting phishing URL, typosquatting and phoneme-based domain and suggesting the legitimate domain which is targeted by attackers.

Findings

The result of the experiment shows that the MMSPhiD model can successfully detect phishing with 99.03 per cent accuracy. In addition, this paper has analyzed 20 leading domains from Alexa and identified 1,861 registered typosquatting and 543 phoneme-based domains.

Research limitations/implications

The proposed model has used machine learning with the list-based approach. Building and maintaining the list shall be a limitation.

Practical implication

The results of the experiments demonstrate that the model achieved higher performance due to the incorporation of multi-dimensional filters.

Social implications

In addition, this paper has incorporated the accessibility needs of persons with visual impairments and provides an accessible anti-phishing approach.

Originality/value

This paper assists persons with visual impairments on detection phoneme-based phishing domains.

Details

Information & Computer Security, vol. 26 no. 5
Type: Research Article
ISSN: 2056-4961

Keywords

Article
Publication date: 10 January 2020

Ammara Zamir, Hikmat Ullah Khan, Tassawar Iqbal, Nazish Yousaf, Farah Aslam, Almas Anjum and Maryam Hamdani

This paper aims to present a framework to detect phishing websites using stacking model. Phishing is a type of fraud to access users’ credentials. The attackers access users’…

3152

Abstract

Purpose

This paper aims to present a framework to detect phishing websites using stacking model. Phishing is a type of fraud to access users’ credentials. The attackers access users’ personal and sensitive information for monetary purposes. Phishing affects diverse fields, such as e-commerce, online business, banking and digital marketing, and is ordinarily carried out by sending spam emails and developing identical websites resembling the original websites. As people surf the targeted website, the phishers hijack their personal information.

Design/methodology/approach

Features of phishing data set are analysed by using feature selection techniques including information gain, gain ratio, Relief-F and recursive feature elimination (RFE) for feature selection. Two features are proposed combining the strongest and weakest attributes. Principal component analysis with diverse machine learning algorithms including (random forest [RF], neural network [NN], bagging, support vector machine, Naïve Bayes and k-nearest neighbour) is applied on proposed and remaining features. Afterwards, two stacking models: Stacking1 (RF + NN + Bagging) and Stacking2 (kNN + RF + Bagging) are applied by combining highest scoring classifiers to improve the classification accuracy.

Findings

The proposed features played an important role in improving the accuracy of all the classifiers. The results show that RFE plays an important role to remove the least important feature from the data set. Furthermore, Stacking1 (RF + NN + Bagging) outperformed all other classifiers in terms of classification accuracy to detect phishing website with 97.4% accuracy.

Originality/value

This research is novel in this regard that no previous research focusses on using feed forward NN and ensemble learners for detecting phishing websites.

Article
Publication date: 21 July 2020

Arshey M. and Angel Viji K. S.

Phishing is a serious cybersecurity problem, which is widely available through multimedia, such as e-mail and Short Messaging Service (SMS) to collect the personal information of…

Abstract

Purpose

Phishing is a serious cybersecurity problem, which is widely available through multimedia, such as e-mail and Short Messaging Service (SMS) to collect the personal information of the individual. However, the rapid growth of the unsolicited and unwanted information needs to be addressed, raising the necessity of the technology to develop any effective anti-phishing methods.

Design/methodology/approach

The primary intention of this research is to design and develop an approach for preventing phishing by proposing an optimization algorithm. The proposed approach involves four steps, namely preprocessing, feature extraction, feature selection and classification, for dealing with phishing e-mails. Initially, the input data set is subjected to the preprocessing, which removes stop words and stemming in the data and the preprocessed output is given to the feature extraction process. By extracting keyword frequency from the preprocessed, the important words are selected as the features. Then, the feature selection process is carried out using the Bhattacharya distance such that only the significant features that can aid the classification are selected. Using the selected features, the classification is done using the deep belief network (DBN) that is trained using the proposed fractional-earthworm optimization algorithm (EWA). The proposed fractional-EWA is designed by the integration of EWA and fractional calculus to determine the weights in the DBN optimally.

Findings

The accuracy of the methods, naive Bayes (NB), DBN, neural network (NN), EWA-DBN and fractional EWA-DBN is 0.5333, 0.5455, 0.5556, 0.5714 and 0.8571, respectively. The sensitivity of the methods, NB, DBN, NN, EWA-DBN and fractional EWA-DBN is 0.4558, 0.5631, 0.7035, 0.7045 and 0.8182, respectively. Likewise, the specificity of the methods, NB, DBN, NN, EWA-DBN and fractional EWA-DBN is 0.5052, 0.5631, 0.7028, 0.7040 and 0.8800, respectively. It is clear from the comparative table that the proposed method acquired the maximal accuracy, sensitivity and specificity compared with the existing methods.

Originality/value

The e-mail phishing detection is performed in this paper using the optimization-based deep learning networks. The e-mails include a number of unwanted messages that are to be detected in order to avoid the storage issues. The importance of the method is that the inclusion of the historical data in the detection process enhances the accuracy of detection.

Details

Data Technologies and Applications, vol. 54 no. 4
Type: Research Article
ISSN: 2514-9288

Keywords

Article
Publication date: 2 May 2023

Tianhao Xu and Prashanth Rajivan

Distinguishing phishing emails from legitimate emails continues to be a difficult task for most individuals. This study aims to investigate the psycholinguistic factors associated…

Abstract

Purpose

Distinguishing phishing emails from legitimate emails continues to be a difficult task for most individuals. This study aims to investigate the psycholinguistic factors associated with deception in phishing email text and their effect on end-user ability to discriminate phishing emails from legitimate emails.

Design/methodology/approach

Email messages and end-user decisions collected from a laboratory phishing study were validated and analyzed using natural language processing methods (Linguistic Inquiry Word Count) and penalized regression models (LASSO and Elastic Net) to determine the linguistic dimensions that attackers may use in phishing emails to deceive end-users and measure the impact of such choices on end-user susceptibility to phishing.

Findings

We found that most participants, who played the role of a phisher in the study, chose to deceive their end-user targets by pretending to be a familiar individual and presenting time pressure or deadlines. Results show that use of words conveying certainty (e.g. always, never) and work-related features in the phishing messages predicted higher end-user vulnerability. On the contrary, use of words that convey achievement (e.g. earn, win) or reward (cash, money) in the phishing messages predicted lower end-user vulnerability because such features are usually observed in scam-like messages.

Practical implications

Insights from this research show that analyzing emails for psycholinguistic features associated with computer-mediated deception could be used to fine-tune and improve spam and phishing detection technologies. This research also informs the kinds of phishing attacks that must be prioritized in antiphishing training programs.

Originality/value

Applying natural language processing and statistical modeling methods to analyze results from a laboratory phishing experiment to understand deception from both attacker and end-user is novel. Furthermore, results from this work advance our understanding of the linguistic factors associated with deception in phishing email text and its impact on end-user susceptibility.

Details

Information & Computer Security, vol. 31 no. 2
Type: Research Article
ISSN: 2056-4961

Keywords

Article
Publication date: 27 November 2020

Chaoqun Wang, Zhongyi Hu, Raymond Chiong, Yukun Bao and Jiang Wu

The aim of this study is to propose an efficient rule extraction and integration approach for identifying phishing websites. The proposed approach can elucidate patterns of…

Abstract

Purpose

The aim of this study is to propose an efficient rule extraction and integration approach for identifying phishing websites. The proposed approach can elucidate patterns of phishing websites and identify them accurately.

Design/methodology/approach

Hyperlink indicators along with URL-based features are used to build the identification model. In the proposed approach, very simple rules are first extracted based on individual features to provide meaningful and easy-to-understand rules. Then, the F-measure score is used to select high-quality rules for identifying phishing websites. To construct a reliable and promising phishing website identification model, the selected rules are integrated using a simple neural network model.

Findings

Experiments conducted using self-collected and benchmark data sets show that the proposed approach outperforms 16 commonly used classifiers (including seven non–rule-based and four rule-based classifiers as well as five deep learning models) in terms of interpretability and identification performance.

Originality/value

Investigating patterns of phishing websites based on hyperlink indicators using the efficient rule-based approach is innovative. It is not only helpful for identifying phishing websites, but also beneficial for extracting simple and understandable rules.

Details

The Electronic Library , vol. 38 no. 5/6
Type: Research Article
ISSN: 0264-0473

Keywords

1 – 10 of 402