Search results

1 – 10 of 134

Article

Publication date: 13 July 2021

Cyberbullying detection from tweets using deep learning

Shubham Bharti, Arun Kumar Yadav, Mohit Kumar and Divakar Yadav

Abstract

Purpose

With the rise of social media platforms, an increasing number of cyberbullying cases has emerged. Every day, a large number of people, especially teenagers, become victims of cyber abuse. Cyberbullying can have a long-lasting impact on the victim's mind: the victim may develop social anxiety, engage in self-harm, become depressed or, in extreme cases, be driven to suicide. This paper aims to evaluate various techniques to automatically detect cyberbullying from tweets by using machine learning and deep learning approaches.

Design/methodology/approach

The authors first applied machine learning algorithms and, after analyzing the experimental results, postulated that deep learning algorithms perform better for the task. Word-embedding techniques were used to represent words for model training. Pre-trained GloVe embeddings were used to generate the word embeddings; different versions of GloVe were used and their performance was compared. Bi-directional long short-term memory (BLSTM) was used for classification.

Findings

The dataset contains 35,787 labeled tweets. The GloVe840 word embedding technique along with BLSTM provided the best results on the dataset with an accuracy, precision and F1 measure of 92.60%, 96.60% and 94.20%, respectively.

Research limitations/implications

If a word is not present in the pre-trained embedding (GloVe), it may be given a random vector representation that may not correspond to the actual meaning of the word. That is, an out-of-vocabulary (OOV) word may not be represented suitably, which can affect the detection of cyberbullying tweets. The problem may be rectified through the use of character-level embedding of words.
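The OOV limitation described above can be illustrated with a minimal sketch. It is not the authors' code: the `toy_glove` dictionary stands in for a loaded GloVe table, and the names `build_lookup` and `lookup` as well as the sampling range are assumptions made only for illustration.

```python
import random

def build_lookup(pretrained, dim, seed=0):
    """Return a function mapping a token to (vector, is_oov).

    `pretrained` is a hypothetical dict standing in for a loaded GloVe table.
    """
    rng = random.Random(seed)
    oov_cache = {}

    def lookup(token):
        if token in pretrained:
            return pretrained[token], False
        # OOV word: draw a random vector once and reuse it afterwards.
        # The vector carries no information about the word's meaning.
        if token not in oov_cache:
            oov_cache[token] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]
        return oov_cache[token], True

    return lookup

# Toy 3-dimensional vectors; published GloVe vectors are much larger.
toy_glove = {"you": [0.1, 0.2, 0.3], "are": [0.0, -0.1, 0.4]}
lookup = build_lookup(toy_glove, dim=3)

tokens = ["you", "are", "stoopid"]
vectors = [lookup(t) for t in tokens]
oov_tokens = [t for t, (_, is_oov) in zip(tokens, vectors) if is_oov]
```

Here the deliberately misspelled insult falls outside the vocabulary and receives an arbitrary vector, which is the situation a character-level embedding is meant to avoid.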

Practical implications

The findings of the work may inspire entrepreneurs to leverage the proposed approach to build deployable systems to detect cyberbullying in different contexts such as workplace, school, etc., and may also draw the attention of lawmakers and policymakers to create systemic tools to tackle the ills of cyberbullying.

Social implications

Effective detection of cyberbullying may save victims from various psychological problems, which, in turn, may lead society to a healthier and more productive life.

Originality/value

The proposed method produced results that outperform the state-of-the-art approaches in detecting cyberbullying from tweets. It uses a large dataset, created by intelligently merging two publicly available datasets. Further, a comprehensive evaluation of the proposed methodology has been presented.

Details

Kybernetes, vol. 51 no. 9

Type: Research Article

ISSN: 0368-492X

Article

Publication date: 4 August 2020

Detecting hate speech against politicians in Arabic community on social media

Imane Guellil, Ahsan Adeel, Faical Azouaou, Sara Chennoufi, Hanene Maafi and Thinhinane Hamitouche

Abstract

Purpose

This paper aims to propose an approach for hate speech detection against politicians in the Arabic community on social media (e.g. YouTube). In the literature, similar works have been presented for other languages such as English. However, to the best of the authors’ knowledge, not much work has been conducted in the Arabic language.

Design/methodology/approach

This approach uses both classical classification algorithms and deep learning algorithms. For the classical algorithms, the authors use Gaussian NB (GNB), Logistic Regression (LR), Random Forest (RF), SGD Classifier (SGD) and Linear SVC (LSVC). For the deep learning classification, four different algorithms are applied: convolutional neural network (CNN), multilayer perceptron (MLP), long short-term memory (LSTM) and bi-directional long short-term memory (Bi-LSTM). For extracting features, the authors use both Word2vec and FastText with their two implementations, namely, Skip Gram (SG) and Continuous Bag of Words (CBOW).
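The difference between the SG and CBOW variants mentioned above lies in how training examples are formed from a context window. The following sketch only builds those examples; it is not the authors' pipeline, it trains no model, and the function names, window size and sample sentence are illustrative assumptions.

```python
def skip_gram_pairs(tokens, window=1):
    """Skip-gram: the center word is used to predict each neighbor."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=1):
    """CBOW: the neighbors are used jointly to predict the center word."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:
            examples.append((context, center))
    return examples

# Hypothetical tokenized comment.
sentence = ["this", "comment", "is", "hateful"]
sg = skip_gram_pairs(sentence)
cb = cbow_pairs(sentence)
```

With a window of one word, skip-gram yields one (center, neighbor) pair per neighbor, while CBOW yields one (context list, center) example per position.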

Findings

Simulation results show that LSVC, Bi-LSTM and MLP perform best, achieving an accuracy of up to 91% when combined with the SG model. The results also show that classification on a balanced corpus is more accurate than classification on an unbalanced corpus.

Originality/value

The principal originality of this paper is the construction of a new hate speech corpus (Arabic_fr_en), which was annotated by three different annotators. This corpus contains the three languages used by Arabic people, namely Arabic, French and English. For Arabic, the corpus contains both Arabic script and Arabizi (i.e. Arabic words written with Latin letters). Another original contribution is the reliance on both shallow and deep learning classification, using different feature extraction models such as Word2vec and FastText with their two implementations, SG and CBOW.

Details

International Journal of Web Information Systems, vol. 16 no. 3

Type: Research Article

ISSN: 1744-0084

Keywords

Arabic hate speech

Article

Publication date: 3 November 2020

Hate speech detection in Twitter using hybrid embeddings and improved cuckoo search-based neural networks

Femi Emmanuel Ayo, Olusegun Folorunso, Friday Thomas Ibharalu and Idowu Ademola Osinuga

Abstract

Purpose

Hate speech is an expression of intense hatred. Twitter has become a popular analytical tool for the prediction and monitoring of abusive behaviors. Hate speech detection with social media data has witnessed special research attention in recent studies, hence, the need to design a generic metadata architecture and efficient feature extraction technique to enhance hate speech detection.

Design/methodology/approach

This study proposes hybrid embeddings enhanced with a topic inference method and an improved cuckoo search neural network for hate speech detection in Twitter data. The proposed method uses a hybrid embeddings technique that combines Term Frequency-Inverse Document Frequency (TF-IDF) for word-level feature extraction with Long Short-Term Memory (LSTM), a variant of the recurrent neural network architecture, for sentence-level feature extraction. The features extracted from the hybrid embeddings then serve as input to the improved cuckoo search neural network, which predicts whether a tweet is hate speech, offensive language or neither.
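The word-level TF-IDF step named above can be sketched as follows. This is one common weighting variant (length-normalized term frequency times log(N / df)), chosen here as an assumption; the abstract does not state which variant the authors used, and the toy tweets are invented for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical tokenized tweets.
tweets = [
    ["you", "are", "great"],
    ["you", "are", "awful"],
    ["awful", "awful", "day"],
]
weights = tf_idf(tweets)
```

Terms that occur in few tweets, such as "great", receive a higher weight than terms shared across tweets, such as "you", which is what makes the weights useful as word-level features.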

Findings

The proposed method showed better results when tested on the collected Twitter datasets compared to other related methods. In order to validate the performances of the proposed method, t-test and post hoc multiple comparisons were used to compare the significance and means of the proposed method with other related methods for hate speech detection. Furthermore, Paired Sample t-Test was also conducted to validate the performances of the proposed method with other related methods.

Research limitations/implications

Finally, the evaluation results showed that the proposed method outperforms other related methods with mean F1-score of 91.3.

Originality/value

The main novelty of this study is the use of an automatic topic spotting measure based on naïve Bayes model to improve features representation.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 4

Type: Research Article

ISSN: 1756-378X

Article

Publication date: 28 May 2021

Transformer network-based word embeddings approach for autonomous cyberbullying detection

Subbaraju Pericherla and E. Ilavarasan

Abstract

Purpose

Nowadays people are connected by social media such as Facebook, Instagram, Twitter, YouTube and many more. Bullies take advantage of these social networks to share their comments. Cyberbullying is a typical kind of harassment in which aggressive or abusive comments are made to hurt netizens. Social media is one of the areas where bullying happens extensively. Hence, it is necessary to develop an efficient and autonomous cyberbullying detection technique.

Design/methodology/approach

In this paper, the authors proposed a transformer network-based word embeddings approach for cyberbullying detection. RoBERTa is used to generate word embeddings and Light Gradient Boosting Machine is used as a classifier.

Findings

The proposed approach outperforms machine learning algorithms such as logistic regression, support vector machine and deep learning models such as word-level convolutional neural networks (word CNN) and character convolutional neural networks with short cuts (char CNNS) in terms of precision, recall, F1-score.

Originality/value

One limitation of traditional word embedding methods is that they are context-independent. In this work, only text data are used to identify cyberbullying. This work can be extended to predict cyberbullying activities in multimedia environments such as image, audio and video.

Details

International Journal of Intelligent Unmanned Systems, vol. 12 no. 1

Type: Research Article

ISSN: 2049-6427

Article

Publication date: 7 June 2021

Cyberbullying and cyber-mobbing in developing countries

Aliya Kintonova, Alexander Vasyaev and Viktor Shestak

Abstract

Purpose

This paper aims to consider modern internet phenomena such as cyberbullying and cyber-mobbing. The emphasis in the paper is placed on the problematic issues of the legal practice of combating cyberbullying and cyber-mobbing in developing countries, as these phenomena are still insufficiently studied.

Design/methodology/approach

The legislation of developing countries is compared with doctrinal and practical developments in the fight against the studied problem in developed countries of the West, as well as countries of the former USSR. Moreover, an experiment was conducted to determine the effectiveness of methods to combat cyberbullying using social networks. To this end, 40 random accounts of people (presumably from 18 to 30 years old) were analyzed.

Findings

This paper outlines the concepts of cyber-mobbing and cyberbullying, as well as their varieties that exist in the modern world. This study examines statistical data, programs and measures of different states in the fight against cyberbullying and cyber-mobbing. Results of the experiment showed that Instagram users are aware of the social network's built-in extensions for protection against cyberbullying and use them relatively frequently. In addition, female Instagram users are more concerned about the content of the comments under their photos than male users.

Originality/value

Measures have been developed to prevent and counteract cyberbullying and cyber-mobbing, the introduction of which into the policies of states might help in the fight against these social phenomena.

Details

Information & Computer Security, vol. 29 no. 3

Type: Research Article

ISSN: 2056-4961

Article

Publication date: 4 October 2019

Improving the affective analysis in texts: Automatic method to detect affective intensity in lexicons based on Plutchik’s wheel of emotions

Carlos Molina Beltrán, Alejandra Andrea Segura Navarrete, Christian Vidal-Castro, Clemente Rubio-Manzano and Claudia Martínez-Araneda

Abstract

Purpose

This paper aims to propose a method for automatically labelling an affective lexicon with intensity values by using the WordNet Similarity (WS) software package with the purpose of improving the results of an affective analysis process, which is relevant to interpreting the textual information that is available in social networks. The hypothesis states that it is possible to improve affective analysis by using a lexicon that is enriched with the intensity values obtained from similarity metrics. Encouraging results were obtained when an affective analysis based on a labelled lexicon was compared with that based on another lexicon without intensity values.

Design/methodology/approach

The authors propose a method for the automatic extraction of the affective intensity values of words using the similarity metrics implemented in WS. First, the intensity values were calculated for words having an affective root in WordNet. Then, to evaluate the effectiveness of the proposal, the results of the affective analysis based on a labelled lexicon were compared to the results of an analysis with and without affective intensity values.

Findings

The main contribution of this research is a method for the automatic extraction of the intensity values of affective words used to enrich a lexicon compared with the manual labelling process. The results obtained from the affective analysis with the new lexicon are encouraging, as they provide a better performance than those achieved using a lexicon without affective intensity values.

Research limitations/implications

Given the restrictions for calculating the similarity between two words, the lexicon labelled with intensity values is a subset of the original lexicon, which means that a large proportion of the words in the corpus are not labelled in the new lexicon.

Practical implications

The practical implications of this work include providing tools to improve the analysis of the feelings of the users of social networks. In particular, it is of interest to provide an affective lexicon that improves attempts to solve the problems of a digital society, such as the detection of cyberbullying. In this case, by achieving greater precision in the detection of emotions, it is possible to detect the roles of participants in a situation of cyberbullying, for example, the bully and victim. Other problems in which the application of affective lexicons is of importance are the detection of aggressiveness against women or gender violence or the detection of depressive states in young people and children.

Social implications

This work is interested in providing an affective lexicon that improves attempts to solve the problems of a digital society, such as the detection of cyberbullying. In this case, by achieving greater precision in the detection of emotions, it is possible to detect the roles of participants in a situation of cyber bullying, for example, the bully and victim. Other problems in which the application of affective lexicons is of importance are the detection of aggressiveness against women or gender violence or the detection of depressive states in young people and children.

Originality/value

The originality of the research lies in the proposed method for automatically labelling the words of an affective lexicon with intensity values by using WS. To date, a lexicon labelled with intensity values has been constructed using the opinions of experts, but that method is more expensive and requires more time than other existing methods. On the other hand, the new method developed herein is applicable to larger lexicons, requires less time and facilitates automatic updating.

Details

The Electronic Library, vol. 37 no. 6

Type: Research Article

ISSN: 0264-0473

Article

Publication date: 21 April 2023

Cyberbullying in higher education: a review of the literature based on bibliometric analysis

Muhammad Ashraf Fauzi

Abstract

Purpose

The purpose of this study is to review cyberbullying incidents among students in higher education institutions (HEIs). Cyberbullying has become a threat to students' wellbeing as it penetrates one's life due to the pervasive availability of digital technologies.

Design/methodology/approach

Through a bibliometric analysis, this study analyzes 361 journal publications from the Web of Science (WoS) based on bibliographic coupling and co-word analysis.
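The two measures named above can be sketched on toy data: bibliographic coupling counts the references two papers share, and co-word analysis counts how often two keywords appear on the same paper. The records below are hypothetical and not taken from the study, and the function names are illustrative; the clustering step that would normally follow is omitted.

```python
from collections import Counter
from itertools import combinations

def bibliographic_coupling(references):
    """Coupling strength of two papers = number of references they share."""
    strength = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strength[(a, b)] = shared
    return strength

def co_word_counts(keywords):
    """Count how often each pair of keywords appears on the same paper."""
    counts = Counter()
    for kws in keywords.values():
        for pair in combinations(sorted(set(kws)), 2):
            counts[pair] += 1
    return counts

# Hypothetical toy records, not data from the study.
references = {
    "P1": {"R1", "R2", "R3"},
    "P2": {"R2", "R3", "R4"},
    "P3": {"R5"},
}
keywords = {
    "P1": ["cyberbullying", "students"],
    "P2": ["cyberbullying", "students", "wellbeing"],
    "P3": ["wellbeing"],
}
coupling = bibliographic_coupling(references)
co_words = co_word_counts(keywords)
```

Both results are weighted pair lists, so either can be fed to a clustering or network-mapping tool to obtain the kind of clusters the abstract reports.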

Findings

Significant themes were found related to cyberbullying in HEIs, particularly related to the impact and determinants of cyberbullying on students. Bibliographic coupling produces three clusters on the current research fronts, while co-word analysis produces four clusters on the prediction of future trends. Implications of this phenomenon warrant comprehensive intervention by the HEIs management to dampen its impact on students' wellbeing. Findings would enhance the fundamental understanding through science mapping on the prevalent and potential incidence of cyberbullying.

Practical implications

Crucial insights will benefit the government, HEIs’ management, educators, scholars, policymakers and parents to overcome this dreadful phenomenon of cyberbullying. Several managerial interventions and mitigation strategies are proposed to reduce and control the occurrence of cyberbullying.

Originality/value

This study presents a bibliometric review to uncover the knowledge structure of cyberbullying studies in HEIs.

Details

Kybernetes, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 0368-492X

Article

Publication date: 8 February 2021

A novel approach to the creation of a labelling lexicon for improving emotion analysis in text

Alejandra Segura Navarrete, Claudia Martinez-Araneda, Christian Vidal-Castro and Clemente Rubio-Manzano

Abstract

Purpose

This paper aims to describe the process used to create an emotion lexicon enriched with the emotional intensity of words and focuses on improving the emotion analysis process in texts.

Design/methodology/approach

The process includes setting, preparation and labelling stages. In the first stage, a lexicon is selected. It must include a translation to the target language and labelling according to Plutchik’s eight emotions. The second stage starts with the validation of the translations. Then, it is expanded with the synonyms of the emotion synsets of each word. In the labelling stage, the similarity of words is calculated and displayed using WordNet similarity.

Findings

The authors’ approach shows better performance in identifying the predominant emotion for the selected corpus. Most notably, the hybrid approach improves the results of the emotion analysis compared with the purist approach.

Research limitations/implications

The proposed lexicon can still be enriched by incorporating elements such as emojis, idioms and colloquial expressions.

Practical implications

This work is part of a research project that aids in solving problems in a digital society, such as detecting cyberbullying, abusive language and gender violence in texts or exercising parental control. Detection of depressive states in young people and children is added.

Originality/value

This semi-automatic process can be applied to any language to generate an emotion lexicon. This resource will be available in a software tool that implements a crowdsourcing strategy allowing the intensity to be re-labelled and new words to be automatically incorporated into the lexicon.

Details

The Electronic Library, vol. 39 no. 1

Type: Research Article

ISSN: 0264-0473

Article

Publication date: 9 June 2023

Perceived challenges affecting user engagement in online community: an analysis of interrelationships and interaction

Anuradha Yadav, Rajesh Kumar Singh, Ruchi Mishra and Surajit Bag

Abstract

Purpose

As online communities gain popularity, their number is increasing. This is leading to an overflow of data and information. As a result, users face challenges such as cyber fraud and cyberbullying while engaging with online communities. In addition, anonymity of the participants, stress and racism are big challenges in online communities' interaction. Online harassers' attack tactics have changed over time. Further challenges, such as the quality of discussion and inequality in user participation, may push online communities towards incitement and activism. Therefore, this study analyses these challenges for the overall benefit of society.

Design/methodology/approach

The underlying fuzzy set theory is employed to handle the fuzziness of users' perceptions since the attributes are expressed in linguistic preferences. Through exhaustive literature review, the authors have identified 15 challenges. These challenges are further categorised as cause and effect by using DEMATEL (Decision-Making Trial and Evaluation Laboratory) approach.
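The cause and effect split produced by DEMATEL can be sketched as follows. This is a simplified crisp version under stated assumptions: the study works with fuzzy linguistic ratings, whereas the sketch starts from already defuzzified scores, and the 3 x 3 influence matrix is invented for illustration. The total-relation matrix T = N(I - N)^-1 is approximated by summing powers of the normalized matrix N.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dematel(direct, terms=200):
    """Return (prominence, relation) = (D + R, D - R) for each factor."""
    n = len(direct)
    row_max = max(sum(row) for row in direct)
    col_max = max(sum(direct[i][j] for i in range(n)) for j in range(n))
    scale = max(row_max, col_max)
    norm = [[v / scale for v in row] for row in direct]
    # T = N + N^2 + N^3 + ..., which equals N (I - N)^-1 when the series converges.
    total = [row[:] for row in norm]
    power = [row[:] for row in norm]
    for _ in range(terms - 1):
        power = matmul(power, norm)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    d = [sum(row) for row in total]                             # influence given
    r = [sum(total[i][j] for i in range(n)) for j in range(n)]  # influence received
    return [(d[i] + r[i], d[i] - r[i]) for i in range(n)]

# Hypothetical influence scores: entry [i][j] is how strongly challenge i affects j.
direct = [
    [0, 3, 2],
    [1, 0, 1],
    [1, 2, 0],
]
scores = dematel(direct)
causes = [i for i, (_, relation) in enumerate(scores) if relation > 1e-9]
effects = [i for i, (_, relation) in enumerate(scores) if relation < -1e-9]
```

A positive D - R places a challenge in the cause group and a negative value in the effect group; in this toy matrix the first challenge is a cause and the second an effect.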

Findings

Lack of strategic planning and uninspired discussions between users have emerged as major challenges in the cause category. This study further demonstrates how each individual challenge can be managed so that online communities maintain a healthy environment in society.

Research limitations/implications

Results are based on a limited dataset. Therefore, the findings cannot be generalised to all online communities.

Originality/value

The research findings offer a suitable direction to policymakers to formulate and design policies, laws and regulations to increase user engagement in the online community. The study is beneficial to firms and researchers in understanding the factors influencing effective management of online communities.

Details

Benchmarking: An International Journal, vol. ahead-of-print no. ahead-of-print

Type: Research Article

ISSN: 1463-5771

Article

Publication date: 30 October 2018

A comparative study of the effectiveness of sentiment tools and human coding in sarcasm detection

Phoey Lee Teh, Pei Boon Ooi, Nee Nee Chan and Yee Kang Chuah

Abstract

Purpose

Sarcasm is often used in everyday speech and writing and is prevalent in online contexts. The purpose of this paper is to investigate the analogy between sarcasm comments from sentiment tools and the human coder.

Design/methodology/approach

Using the Verbal Irony Procedure, eight human coders were engaged to analyse comments collected from an online commercial page, and a dissimilarity analysis was conducted with sentiment tools. Three constants were tested, namely, polarity from sentiment tools, polarity ratings by human coders and sarcasm-level ratings by human coders.

Findings

Results found an inconsistent ratio between these three constants. Sentiment tools used did not have the capability or reliability to detect the subtle, contextualized meanings of sarcasm statements that human coders could detect. Further research is required to refine the sentiment tools to enhance their sensitivity and capability.

Practical implications

With these findings, it is recommended that further research and commercialization efforts be directed at improving current sentiment tools – for example, to incorporate sophisticated human sarcasm texts in their analytical systems. Sarcasm exists frequently in media, politics and human forms of communications in society. Therefore, more highly sophisticated sentiment tools with the abilities to detect human sarcasm would be vital in research and industry.

Social implications

The findings suggest that presently, of the sentiment tools investigated, most are still unable to pick up subtle contexts within the text which can reverse or change the message that the writer intends to send to his/her receiver. Hence, the use of relevant hashtags (e.g. #sarcasm; #irony) is of fundamental importance in detection tools. This would aid the evaluation of product reviews online for commercial usage.

Originality/value

The value of this study lies in its original, empirical findings on the inconsistencies between sentiment tools and human coders in sarcasm detection. The current study proves these inconsistencies are detected between human and sentiment tools in social media texts and points to the inadequacies of current sentiment tools. With these findings, it is recommended that further research and commercialization efforts be directed at improving current sentiment tools – to incorporate sophisticated human sarcasm texts in their analytical systems. The system can then be used as a reference for psychologists, media analysts, researchers and speech writers to detect cues in the inconsistencies in behaviour and language.

Details

Journal of Systems and Information Technology, vol. 20 no. 3

Type: Research Article

ISSN: 1328-7265
