Search results

1 – 10 of 128
Article
Publication date: 4 August 2020

Imane Guellil, Ahsan Adeel, Faical Azouaou, Sara Chennoufi, Hanene Maafi and Thinhinane Hamitouche

Abstract

Purpose

This paper proposes an approach for detecting hate speech against politicians in the Arabic-speaking community on social media (e.g. YouTube). In the literature, similar works have been presented for other languages such as English. However, to the best of the authors’ knowledge, not much work has been conducted in the Arabic language.

Design/methodology/approach

This approach uses both classical classification algorithms and deep learning algorithms. For the classical algorithms, the authors use Gaussian NB (GNB), Logistic Regression (LR), Random Forest (RF), SGD Classifier (SGD) and Linear SVC (LSVC). For the deep learning classification, four different algorithms are applied: convolutional neural network (CNN), multilayer perceptron (MLP), long short-term memory (LSTM) and bi-directional long short-term memory (Bi-LSTM). For feature extraction, the authors use both Word2vec and FastText with their two implementations, namely Skip-gram (SG) and Continuous Bag of Words (CBOW).

Findings

Simulation results show that LSVC, Bi-LSTM and MLP perform best, achieving an accuracy of up to 91% when combined with the SG model. The results also show that classification on a balanced corpus is more accurate than classification on an unbalanced corpus.
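The abstract does not say how the balanced corpus was obtained. One common way to balance a labelled corpus is random undersampling of the majority class. The sketch below illustrates that technique only, assuming a list of (text, label) pairs; the function name `undersample` and the toy data are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict


def undersample(samples, seed=0):
    """Randomly drop examples so every label keeps as many items as the rarest label."""
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append((text, label))

    # Size of the smallest class decides how many items each class keeps.
    target = min(len(items) for items in by_label.values())

    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, target))
    rng.shuffle(balanced)
    return balanced


# Toy corpus (hypothetical): 5 "normal" comments and 2 "hate" comments.
corpus = [(f"comment {i}", "normal") for i in range(5)]
corpus += [(f"comment h{i}", "hate") for i in range(2)]
balanced = undersample(corpus)
```

Undersampling discards data, so on a small corpus oversampling or class weighting may be preferable; which option suits a given corpus is an empirical question.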

Originality/value

The principal originality of this paper is the construction of a new hate speech corpus (Arabic_fr_en), which was annotated by three different annotators. This corpus contains the three languages used by Arabic speakers: Arabic, French and English. For Arabic, the corpus contains both Arabic script and Arabizi (i.e. Arabic words written with Latin letters). Another original aspect is the reliance on both shallow and deep learning classification, using different feature extraction models, namely Word2vec and FastText with their two implementations, SG and CBOW.

Details

International Journal of Web Information Systems, vol. 16 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 22 March 2022

Djamila Mohdeb, Meriem Laifa, Fayssal Zerargui and Omar Benzaoui

Abstract

Purpose

The present study was designed to investigate eight research questions related to the analysis and detection of dialectal Arabic hate speech targeting African refugees and illegal migrants in the Algerian YouTube space.

Design/methodology/approach

The transfer learning approach, which currently represents the state of the art in natural language processing tasks, was used to classify and detect hate speech in Algerian dialectal Arabic. In addition, a descriptive analysis was conducted to answer the analytical research questions, which aim to measure and evaluate the presence of anti-refugee/migrant discourse on the YouTube platform.

Findings

Data analysis revealed that there has been a gradual modest increase in the number of anti-refugee/migrant hateful comments on YouTube since 2014, a sharp rise in 2017 and a sharp decline in later years until 2021. Furthermore, our findings stemming from classifying hate content using multilingual and monolingual pre-trained language transformers demonstrate a good performance of the AraBERT monolingual transformer in comparison with the monodialectal transformer DziriBERT and the cross-lingual transformers mBERT and XLM-R.

Originality/value

Automatic hate speech detection in languages other than English is a challenging task that the literature has tried to address with various machine learning approaches. Although the recent approach of cross-lingual transfer learning offers a promising solution, tackling this problem in the context of the Arabic language, particularly dialectal Arabic, makes it even more challenging. Our results cast new light on the actual ability of the transfer learning approach to deal with low-resource languages that differ widely from high-resource languages as well as from other Latin-based, low-resource languages.

Details

Aslib Journal of Information Management, vol. 74 no. 6
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 3 November 2020

Femi Emmanuel Ayo, Olusegun Folorunso, Friday Thomas Ibharalu and Idowu Ademola Osinuga

Abstract

Purpose

Hate speech is an expression of intense hatred. Twitter has become a popular analytical tool for the prediction and monitoring of abusive behaviors. Hate speech detection with social media data has received special research attention in recent studies; hence the need to design a generic metadata architecture and an efficient feature extraction technique to enhance hate speech detection.

Design/methodology/approach

This study proposes hybrid embeddings enhanced with a topic inference method and an improved cuckoo search neural network for hate speech detection in Twitter data. The proposed method uses a hybrid embeddings technique that includes Term Frequency-Inverse Document Frequency (TF-IDF) for word-level feature extraction and Long Short-Term Memory (LSTM), a variant of the recurrent neural network architecture, for sentence-level feature extraction. The features extracted from the hybrid embeddings then serve as input to the improved cuckoo search neural network, which predicts whether a tweet is hate speech, offensive language or neither.
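As an illustration of the word-level TF-IDF step only, the following minimal sketch computes TF-IDF weights in plain Python. It assumes one common variant (term frequency normalised by document length, and an unsmoothed logarithmic inverse document frequency); the abstract does not state which weighting the authors used, and the function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenised document.

    tf  = term count / document length
    idf = log(N / number of documents containing the term)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in docs for term in set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy input (hypothetical): two already-tokenised tweets.
docs = [["good", "day"], ["bad", "day"]]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "day" here, receives a weight of zero, which is why TF-IDF highlights the terms that distinguish one text from the others.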

Findings

The proposed method showed better results when tested on the collected Twitter datasets compared to other related methods. In order to validate the performances of the proposed method, t-test and post hoc multiple comparisons were used to compare the significance and means of the proposed method with other related methods for hate speech detection. Furthermore, Paired Sample t-Test was also conducted to validate the performances of the proposed method with other related methods.

Research limitations/implications

Finally, the evaluation results showed that the proposed method outperforms other related methods, with a mean F1-score of 91.3.

Originality/value

The main novelty of this study is the use of an automatic topic spotting measure based on a naïve Bayes model to improve feature representation.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 13 no. 4
Type: Research Article
ISSN: 1756-378X

Abstract

Details

Learning and Teaching in Higher Education: Gulf Perspectives, vol. 8 no. 2
Type: Research Article
ISSN: 2077-5504

Article
Publication date: 6 January 2022

Hanan Alghamdi and Ali Selamat

Abstract

Purpose

With the proliferation of terrorist/extremist websites on the World Wide Web, it has become progressively more crucial to detect and analyze the content on these websites. Accordingly, the volume of previous research focused on identifying the techniques and activities of terrorist/extremist groups, as revealed by their sites on the so-called dark web, has also grown.

Design/methodology/approach

This study presents a review of the techniques used to detect and process the content of terrorist/extremist sites on the dark web. Forty of the most relevant data sources were examined, and various techniques were identified among them.

Findings

Based on this review, it was found that methods of feature selection and feature extraction can be used as topic modeling with content analysis and text clustering.

Originality/value

The review concludes by presenting the current state of the art and certain open issues associated with Arabic dark web content analysis.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 27 February 2023

Meriem Laifa and Djamila Mohdeb

Abstract

Purpose

This study provides an overview of the application of sentiment analysis (SA) in exploring social movements (SMs). It also compares different models for an SA task on Algerian Arabic tweets related to the early days of the Algerian SM, called Hirak.

Design/methodology/approach

Related tweets were retrieved using relevant hashtags and then passed through multiple data cleaning procedures. Foundational machine learning methods such as Naive Bayes, Support Vector Machine, Logistic Regression (LR) and Decision Tree were implemented. For each classifier, two feature extraction techniques were used and compared, namely Bag of Words and Term Frequency–Inverse Document Frequency. Moreover, three fine-tuned pretrained transformers (AraBERT, DziriBERT and the multilingual transformer XLM-R) were used for the comparison.

Findings

The findings of this paper emphasize the vital role social media played during the Hirak. Results revealed that most individuals had a positive attitude toward the Hirak. Moreover, the presented experiments provided important insights into the possible use of both basic machine learning and transfer learning models to perform SA on Algerian text datasets. When comparing machine learning models with transformers in terms of accuracy, precision, recall and F1-score, the results are fairly similar, with LR outperforming all models with a 68 per cent accuracy rate.
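For reference, the four metrics named above can be computed from predictions as in the following minimal sketch. It assumes a binary task with a single positive class; the abstract does not say how many sentiment classes were used or how per-class scores were averaged, so the function name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (hypothetical): 1 = positive sentiment, 0 = other.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = binary_metrics(y_true, y_pred)
```

In this toy case accuracy is 0.75 while precision, recall and F1 are each 2/3, which shows why accuracy alone can look better than the per-class scores when classes are imbalanced.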

Originality/value

At the time of writing, the Algerian SM was not thoroughly investigated or discussed in the Computer Science literature. This analysis makes a limited but unique contribution to understanding the Algerian Hirak using artificial intelligence. This study proposes what it considers to be a unique basis for comprehending this event with the goal of generating a foundation for future studies by comparing different SA techniques on a low-resource language.

Details

Data Technologies and Applications, vol. 57 no. 5
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 1 March 2003

W.A.C Adie MA

Abstract

Roots of global Terrorism are in ‘failed’ states carved out of multiracial empires after World Wars I and II in the name of ‘national self‐determination’. Both sides in the Cold War competed to exploit the process of disintegration with armed and covert interventions. In effect, they were colluding at the expense of the ‘liberated’ peoples. The ‘Vietnam Trauma’ prevented effective action against the resulting terrorist buildup and blowback until 9/11. As those vultures come home to roost, the war broadens to envision overdue but coercive reforms to the postwar system of nation states, first in the Middle East. Mirages of Vietnam blur the vision; can the sole Superpower finish the job before fiscal and/or imperial overstretch implode it?

Details

International Journal of Commerce and Management, vol. 13 no. 3/4
Type: Research Article
ISSN: 1056-9219

Open Access
Article
Publication date: 29 June 2022

Ibtissam Touahri

Abstract

Purpose

This paper proposes a multi-facet sentiment analysis system.

Design/methodology/approach

This paper uses multidomain resources to build a sentiment analysis system. The manual lexicon-based features extracted from the resources are fed into a machine learning classifier, and their performance is then compared. The manual lexicon is replaced with a custom bag of words (BOW) to avoid its time-consuming construction. To help the system run faster and make the model interpretable, features are reduced using different existing and custom approaches such as term occurrence, information gain, principal component analysis, semantic clustering and POS tagging filters.

Findings

The proposed system, which features automated lexicon extraction and feature size optimization, proved efficient when applied to multidomain and benchmark datasets, reaching 93.59% accuracy, which makes it competitive with state-of-the-art systems.

Originality/value

The originality lies in the construction of a custom BOW and in the optimization of features based on existing and custom feature selection and clustering approaches.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Open Access
Article
Publication date: 14 February 2022

Mohammad Fraiwan

Abstract

Purpose

Social networks (SNs) have recently evolved from a means of connecting people to becoming a tool for social engineering, radicalization, dissemination of propaganda and recruitment of terrorists. It is no secret that the majority of the Islamic State in Iraq and Syria (ISIS) members are Arabic speakers, and even the non-Arabs adopt Arabic nicknames. However, the majority of the literature researching the subject deals with non-Arabic languages. Moreover, the features involved in identifying radical Islamic content are shallow and the search or classification terms are common in daily chatter among people of the region. The authors aim at distinguishing normal conversation, influenced by the role religion plays in daily life, from terror-related content.

Design/methodology/approach

This article presents the authors' experience and the results of collecting, analyzing and classifying Twitter data from affiliated members of ISIS, as well as sympathizers. The authors used artificial intelligence (AI) and machine learning classification algorithms to categorize the tweets as terror-related, generic religious or unrelated.

Findings

The authors report the classification accuracy of the K-nearest neighbor (KNN), Bernoulli Naive Bayes (BNN) and support vector machine (SVM) [one-against-all (OAA) and all-against-all (AAA)] algorithms. The authors achieved a high classification F1 score of 83%. The work in this paper will hopefully aid more accurate classification of radical content.

Originality/value

In this paper, the authors have collected and analyzed thousands of tweets advocating and promoting ISIS. The authors have identified many common markers and keywords characteristic of ISIS rhetoric. Moreover, the authors have applied text processing and AI machine learning techniques to classify the tweets into one of three categories: terror-related, non-terror political chatter and news and unrelated data-polluting tweets.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 1 February 2000

Muin‐ud‐Din Ahmad Khan

Abstract

Politics has three contingent aspects, viz. a role‐play, a Majestic Art and a human social science. We may talk of the first aspect as just politicking. The second aspect was delineated by Aristotle as an art of doing vis‐a‐vis the art of making: He meant by the former a behavioural art and by the latter a productive art; and the third aspect represents a comprehensive analytical study of human social behaviour.

Details

Humanomics, vol. 16 no. 2
Type: Research Article
ISSN: 0828-8666
