Search results

1 – 10 of over 1000
Article
Publication date: 16 August 2021

Nael Alqtati, Jonathan A.J. Wilson and Varuna De Silva

Abstract

Purpose

This paper aims to equip professionals and researchers in the fields of advertising, branding, public relations, marketing communications, social media analytics and marketing with a simple, effective and dynamic means of evaluating consumer behavioural sentiments and engagement through Arabic language and script, in vivo.

Design/methodology/approach

Using quantitative and qualitative situational linguistic analyses of Classical Arabic, found in Quranic and religious texts; Modern Standard Arabic, which is commonly used in formal Arabic channels; and dialectal Arabic, which varies hugely from one Arab country to another, this study analyses rich marketing and consumer messages (tweets) as a basis for developing an Arabic-language social media methodological tool.

Findings

Despite the popularity of Arabic language communication on social media platforms across geographies, currently, comprehensive language processing toolkits for analysing Arabic social media conversations have limitations and require further development. Furthermore, due to its unique morphology, developing text understanding capabilities specific to the Arabic language poses challenges.

Practical implications

This study demonstrates the application and effectiveness of the proposed methodology on a random sample of Twitter data from Arabic-speaking regions. Furthermore, as Arabic is the language of Islam, the study is of particular importance to Islamic and Muslim geographies, markets and marketing.

Social implications

The findings suggest that the proposed methodology has wider potential beyond the data set and health-care sector analysed, and can therefore be applied to further markets, social media platforms and consumer segments.

Originality/value

To remedy these gaps, this study presents a new methodology and analytical approach to investigating Arabic language social media conversations, which brings together a multidisciplinary knowledge of technology, data science and marketing communications.

Article
Publication date: 22 March 2022

Djamila Mohdeb, Meriem Laifa, Fayssal Zerargui and Omar Benzaoui

Abstract

Purpose

The present study was designed to investigate eight research questions related to the analysis and detection of dialectal Arabic hate speech targeting African refugees and illegal migrants in the Algerian YouTube space.

Design/methodology/approach

The transfer learning approach, which currently represents the state of the art in natural language processing tasks, has been exploited to classify and detect hate speech in Algerian dialectal Arabic. In addition, a descriptive analysis has been conducted to answer the analytical research questions, which aim at measuring and evaluating the presence of anti-refugee/migrant discourse on the YouTube social platform.

Findings

Data analysis revealed that there has been a gradual modest increase in the number of anti-refugee/migrant hateful comments on YouTube since 2014, a sharp rise in 2017 and a sharp decline in later years until 2021. Furthermore, our findings stemming from classifying hate content using multilingual and monolingual pre-trained language transformers demonstrate a good performance of the AraBERT monolingual transformer in comparison with the monodialectal transformer DziriBERT and the cross-lingual transformers mBERT and XLM-R.

Originality/value

Automatic hate speech detection in languages other than English is quite a challenging task that the literature has tried to address by various approaches of machine learning. Although the recent approach of cross-lingual transfer learning offers a promising solution, tackling this problem in the context of the Arabic language, particularly dialectal Arabic makes it even more challenging. Our results cast a new light on the actual ability of the transfer learning approach to deal with low-resource languages that widely differ from high-resource languages as well as other Latin-based, low-resource languages.

Details

Aslib Journal of Information Management, vol. 74 no. 6
Type: Research Article
ISSN: 2050-3806

Keywords

Open Access
Article
Publication date: 23 November 2023

Reema Khaled AlRowais and Duaa Alsaeed

Abstract

Purpose

Automatically extracting stance information from natural language texts is a significant research problem with various applications, particularly after the recent explosion of data on the internet via platforms like social media sites. A stance detection system helps determine whether the author agrees with, is against or holds a neutral opinion towards a given target. Most research on stance detection focuses on the English language, while little research has been conducted on Arabic.

Design/methodology/approach

This paper aims to address stance detection on Arabic tweets by building and comparing different stance detection models using four transformers, namely Araelectra, MARBERT, AraBERT and Qarib. Using different weights for these transformers, the authors performed extensive experiments fine-tuning them on the task of stance detection for Arabic tweets.

Findings

The results showed that the AraBERT model learned better than the other three models with a 70% F1 score followed by the Qarib model with a 68% F1 score.
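F1 comparisons like the one above are typically macro-averaged over the stance labels. A minimal standard-library sketch of that computation (the label set and the gold/predicted sequences below are hypothetical, not the paper's data):

```python
def macro_f1(gold, pred, labels=("FAVOR", "AGAINST", "NONE")):
    """Macro-averaged F1: per-label F1 scores, averaged with equal weight."""
    scores = []
    for lab in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == lab and p == lab)
        fp = sum(1 for g, p in zip(gold, pred) if g != lab and p == lab)
        fn = sum(1 for g, p in zip(gold, pred) if g == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(scores) / len(scores)

# Hypothetical gold labels and model predictions for a handful of tweets.
gold = ["FAVOR", "AGAINST", "NONE", "FAVOR", "AGAINST"]
pred = ["FAVOR", "AGAINST", "FAVOR", "FAVOR", "NONE"]
print(round(macro_f1(gold, pred), 3))  # -> 0.489
```

Macro averaging treats rare stance labels as equally important as frequent ones, which matters given the imbalanced dataset noted in the limitations below.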

Research limitations/implications

A limitation of this study is the imbalanced dataset and the limited availability of annotated stance detection datasets in Arabic.

Originality/value

The study provides a comprehensive overview of the current resources for stance detection in the literature, including the datasets and machine learning methods used. The authors then examined the models to analyse and interpret the obtained findings and to recommend the best-performing models for the stance detection task.

Details

Arab Gulf Journal of Scientific Research, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1985-9899

Keywords

Article
Publication date: 25 February 2022

Souheila Ben Guirat, Ibrahim Bounhas and Yahya Slimani

Abstract

Purpose

The semantic relations between Arabic word representations were recognized and widely studied in theoretical linguistics many centuries ago. Nonetheless, most previous research in automatic information retrieval (IR) focused on stem- or root-based indexing, while lemmas and patterns are under-exploited. The authors believe that each of the four morphological levels encapsulates a part of the meaning of words. The purpose is therefore to aggregate these levels using more sophisticated approaches to reach the optimal combination that enhances IR.

Design/methodology/approach

The authors first compare the state-of-the-art Arabic natural language processing (NLP) tools in IR. This allows them to select the most accurate tool at each representation level, i.e. to develop four basic IR systems. Then, the authors compare two rank aggregation approaches that combine the results of these systems. The first approach is based on linear combination, while the second exploits classification-based meta-search.
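The linear-combination baseline can be sketched as a weighted sum of per-system document scores followed by re-ranking. The systems, weights and scores below are hypothetical stand-ins for the four morphological-level systems, not values from the paper:

```python
def linear_combine(runs, weights):
    """Fuse document scores from several IR systems by weighted sum.

    runs: list of dicts {doc_id: score}, one per indexing level
    weights: one weight per run
    Returns doc ids sorted by fused score, best first.
    """
    fused = {}
    for run, w in zip(runs, weights):
        for doc, score in run.items():
            fused[doc] = fused.get(doc, 0.0) + w * score
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical scores from root-, stem-, lemma- and pattern-based systems.
root  = {"d1": 0.9, "d2": 0.4}
stem  = {"d1": 0.2, "d3": 0.8}
lemma = {"d2": 0.7, "d3": 0.1}
patt  = {"d1": 0.1, "d2": 0.2}
ranking = linear_combine([root, stem, lemma, patt], [0.4, 0.3, 0.2, 0.1])
print(ranking)  # -> ['d1', 'd2', 'd3']
```

The classification-based meta-search the paper favours replaces these fixed weights with a learned decision over the systems' outputs.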

Findings

Combining different word representation levels consistently and significantly enhances IR results. The proposed classification-based approach outperforms linear combination and all the basic systems.

Research limitations/implications

The work rests on a standard experimental comparative study that assesses several NLP tools and combination approaches on different test collections and IR models. Thus, it may help future research choose the most suitable tools and develop more sophisticated methods for handling the complexity of the Arabic language.

Originality/value

The originality of the idea is to consider the richness of Arabic as an exploitable characteristic rather than a challenging limitation. Thus, the authors combine four different morphological levels for the first time in Arabic IR. This approach widely outperformed previous research results.

Peer review

The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-11-2020-0515

Details

Online Information Review, vol. 46 no. 7
Type: Research Article
ISSN: 1468-4527

Keywords

Article
Publication date: 27 June 2023

Syihabuddin Syihabuddin, Nurul Murtadho, Yusring Sanusi Baso, Hikmah Maulani and Shofa Musthofa Khalid

Abstract

Purpose

Assessing whether a book is relevant or suitable for use as teaching material is neither an easy nor a haphazard matter; various methods and theories have been offered by researchers studying this question. Taking up the study of textbooks in context, the researchers found that textbooks are a foundation for education, socialization and the transmission and construction of knowledge. The researchers offer another approach, namely using praxeology as a study tool, so that the intended goals of the textbooks are fulfilled.

Design/methodology/approach

The researchers use a qualitative approach through grounded theory. Grounded theory procedures are designed to develop a well-integrated set of concepts that provide a thorough theoretical explanation of the social phenomena under study. A grounded theory must explain as well as describe, and it may also implicitly provide some degree of predictability, but only with respect to certain conditions (Corbin and Strauss, 1990). The authors employed document analysis in conducting this study: a systematic procedure for reviewing or evaluating documents, both printed and electronic.

Findings

Two issues regarding gender acquisition have been investigated in L2 Arabic acquisition studies: the order in which L2 Arabic learners acquire certain grammatical features of the gender system, and the effect of L1 on the acquisition of some grammatical features of L2 grammatical gender. Arabic has a two-gender system that classifies all nouns, animate and inanimate, as masculine or feminine. Verbs, nouns, adjectives and personal, demonstrative and relative pronouns related to nouns in the syntactic structure of sentences show gender agreement.

Research limitations/implications

In practice, as a book intended for non-native speakers, the book is presented using a general view of linguistic theory. In relation to gender agreement, the book's presentation begins with, and is interspersed with, the concepts of nouns and verbs. Returning to the praxeology context: first, know-how (praxis) explains practice, i.e. the tasks performed and the techniques used; second, know-why, or knowledge (logos), explains and justifies practice from a technological and theoretical point of view. Answering the first concept, the exercises presented in the book follow a concept with three clusters explained at the beginning of the discussion. The second concept is addressed with a task-design approach that includes word categorization, separating masculine and feminine word forms.

Practical implications

Practically, this research obtains perspectives from a textbook: Arabic gender agreement is presented with various examples of noun contexts; textbook authors present book concepts in a particular way with regard to curriculum features; this task design affects student performance; and the study asks which approach is more effective for developing student understanding. Empirically, the material is in line with the formulation of competency standards for non-Arabic speakers in Indonesia.

Originality/value

With this computational search, the researchers found a novelty considered accurate by taking the praxeology context as a lens for analysing textbooks of Arabic for non-native speakers; in particular, up to 2022 (last data collection in September) there had been no study in this context. The researchers also find that praxeology can examine more broadly the tasks in a book's contents through relevant linguistic theories.

Details

Journal of Applied Research in Higher Education, vol. 16 no. 4
Type: Research Article
ISSN: 2050-7003

Keywords

Article
Publication date: 18 April 2017

Mahmoud Al-Ayyoub, Ahmed Alwajeeh and Ismail Hmeidi

Abstract

Purpose

The authorship authentication (AA) problem is concerned with correctly attributing a text document to its author. Historically, this problem has been the focus of various studies built on the intuitive idea that each author has a unique style that can be captured using stylometric features (SF). Another approach to this problem, known as the bag-of-words (BOW) approach, uses keyword occurrences/frequencies in each document to identify its author. Unlike the first, this approach is more language-independent. This paper aims to study and compare both approaches, focusing on the Arabic language, which is still largely understudied despite its importance.

Design/methodology/approach

As this is a supervised learning problem, the authors start by collecting a very large dataset of Arabic documents to be used for training and testing. For the SF approach, they compute hundreds of SF, whereas for the BOW approach, the popular term frequency-inverse document frequency (TF-IDF) technique is used. Both approaches are compared under various settings.
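The TF-IDF weighting at the heart of the BOW approach can be sketched in a few lines of standard-library Python. The toy corpus below is illustrative only, not the authors' dataset, and real systems would add Arabic-aware tokenization and normalization:

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: tf-idf weight} dict per document.

    tf is the relative frequency within the document; idf is
    log(N / document frequency), so terms in every document get weight 0.
    """
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

# Toy corpus standing in for documents by different authors.
docs = ["the desert wind", "the old market", "wind over the market"]
vecs = tfidf(docs)
```

Terms that appear in every document (here "the") receive weight zero, which is why TF-IDF de-emphasizes function words that carry no authorial signal in the BOW view, while SF deliberately measure exactly such style markers.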

Findings

The results show that the SF approach, which is much cheaper to train, can generate more accurate results under most settings.

Practical implications

Efficiently solving the AA problem has numerous advantages in different fields of academia as well as industry, including literature, security, forensics, and electronic markets and trading. Another practical implication of this work is the public release of its sources. Specifically, some of the SF can be very useful for other problems such as sentiment analysis.

Originality/value

This is the first study of its kind to compare the SF and BOW approaches for authorship analysis of Arabic articles. Moreover, many of the computed SF are novel, while other features are inspired by the literature. As SF are language-dependent and most existing papers focus on English, extra effort must be invested to adapt such features to Arabic text.

Details

International Journal of Web Information Systems, vol. 13 no. 1
Type: Research Article
ISSN: 1744-0084

Keywords

Article
Publication date: 2 September 2019

Guellil Imane, Darwish Kareem and Azouaou Faical

Abstract

Purpose

This paper aims to propose an approach to automatically annotate a large corpus in Arabic dialect, used to analyse the sentiments of Arabic users on social media. It focuses on the Algerian dialect, a sub-dialect of Maghrebi Arabic. Although Algerian is spoken by roughly 40 million speakers, few studies address its automated processing in general, or sentiment analysis in particular.

Design/methodology/approach

The approach is based on the construction and use of a sentiment lexicon to automatically annotate a large corpus of Algerian text extracted from Facebook. This approach allows the authors to significantly increase the size of the training corpus without resorting to manual annotation. The annotated corpus is then vectorized using document embeddings (doc2vec), an extension of word embeddings (word2vec). For sentiment classification, the authors used different classifiers such as support vector machines (SVM), Naive Bayes (NB) and logistic regression (LR).
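The lexicon-based auto-annotation step can be sketched as scoring each message against a polarity lexicon and keeping only confidently scored messages for the training set. The lexicon entries below are invented Algerian-dialect stand-ins, not the authors' actual resource, though the findings below do single out a selection threshold of 0.6:

```python
# Hypothetical mini sentiment lexicon (token -> polarity weight).
LEXICON = {"mlih": 1.0, "chbab": 0.8, "khayeb": -1.0, "maandek": -0.5}

def auto_annotate(message, threshold=0.6):
    """Label a message from lexicon scores; return None when no lexicon
    token is found or the mean score is too weak for the training set."""
    tokens = message.lower().split()
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    if not hits:
        return None
    score = sum(hits) / len(hits)
    if abs(score) < threshold:
        return None          # too ambiguous to trust as a training label
    return "positive" if score > 0 else "negative"

print(auto_annotate("rak chbab mlih"))   # -> positive
print(auto_annotate("maandek walou"))    # -> None (weak signal)
```

Raising the threshold trades training-set size for label quality, which is why the choice of threshold visibly affects precision and recall in the findings.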

Findings

The results suggest that the NB and SVM classifiers generally led to the best results and MLP generally had the worst. Further, the threshold the authors use in selecting messages for the training set had a noticeable impact on recall and precision, with a threshold of 0.6 producing the best results. Using PV-DBOW led to slightly higher results than using PV-DM, and combining PV-DBOW and PV-DM representations led to slightly lower results than using PV-DBOW alone. The best results were obtained by the NB classifier, with an F1 of up to 86.9 per cent.

Originality/value

The principal originality of this paper is to determine the right parameters for automatically annotating an Algerian dialect corpus. This annotation is based on a sentiment lexicon that was also constructed automatically.

Details

International Journal of Web Information Systems, vol. 15 no. 5
Type: Research Article
ISSN: 1744-0084

Keywords

Article
Publication date: 4 August 2020

Imane Guellil, Ahsan Adeel, Faical Azouaou, Sara Chennoufi, Hanene Maafi and Thinhinane Hamitouche

Abstract

Purpose

This paper aims to propose an approach for hate speech detection against politicians in the Arabic community on social media (e.g. YouTube). In the literature, similar works have been presented for other languages such as English; however, to the best of the authors' knowledge, little work has been conducted for the Arabic language.

Design/methodology/approach

This approach uses both classical classification algorithms and deep learning algorithms. For the classical algorithms, the authors use Gaussian NB (GNB), logistic regression (LR), random forest (RF), SGD classifier (SGD) and linear SVC (LSVC). For deep learning classification, four different algorithms are applied: convolutional neural network (CNN), multilayer perceptron (MLP), long short-term memory (LSTM) and bi-directional long short-term memory (Bi-LSTM). For extracting features, the authors use both Word2vec and FastText with their two implementations, namely skip-gram (SG) and continuous bag of words (CBOW).
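The SG and CBOW variants differ only in how training examples are framed: SG predicts each context word from the centre word, while CBOW predicts the centre word from its whole context. A minimal sketch of the example construction (the embedding network's training itself is omitted; the sample tokens are invented):

```python
def training_pairs(tokens, window=2, mode="sg"):
    """Build (input, target) training examples as in word2vec.

    "sg"   -> one (centre, context_word) pair per context word
    "cbow" -> one (context_tuple, centre) example per position
    """
    pairs = []
    for i, centre in enumerate(tokens):
        context = [tokens[j] for j in range(max(0, i - window),
                                            min(len(tokens), i + window + 1))
                   if j != i]
        if mode == "sg":
            pairs.extend((centre, c) for c in context)
        else:  # cbow
            pairs.append((tuple(context), centre))
    return pairs

sent = ["ana", "nheb", "bladi"]
print(training_pairs(sent, mode="sg"))    # 6 (centre, context) pairs
print(training_pairs(sent, mode="cbow"))  # 3 (context, centre) examples
```

SG generates many more examples per sentence and tends to handle rare words better, which is one reason embedding papers report the two variants separately, as this one does.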

Findings

Simulation results demonstrate that LSVC, Bi-LSTM and MLP perform best, achieving an accuracy of up to 91% when associated with the SG model. The results also show that classification performed on a balanced corpus is more accurate than on an unbalanced corpus.

Originality/value

The principal originality of this paper is the construction of a new hate speech corpus (Arabic_fr_en), annotated by three different annotators. This corpus contains the three languages used by Arabic speakers: Arabic, French and English. For Arabic, the corpus contains both Arabic script and Arabizi (i.e. Arabic words written with Latin letters). Another originality is relying on both shallow and deep learning classification, using different models for feature extraction such as Word2vec and FastText with their two implementations, SG and CBOW.

Details

International Journal of Web Information Systems, vol. 16 no. 3
Type: Research Article
ISSN: 1744-0084

Keywords

Open Access
Article
Publication date: 4 August 2020

Mohamed Boudchiche and Azzeddine Mazroui

Abstract

In this paper, we have developed a hybrid morphological disambiguation system for the Arabic language that identifies the stem, lemma and root of the words of a given sentence. Following an out-of-context analysis performed by the morphological analyser Alkhalil Morpho Sys, the system first identifies all the potential tags of each word of the sentence. Then, a disambiguation phase is carried out to choose for each word the right solution among those obtained during the first phase. This problem has been solved by casting the disambiguation issue as a surface optimization problem over spline functions. Tests have shown the value of this approach and the superiority of its performance compared to the state of the art.
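The two-phase pipeline (out-of-context analysis producing candidate solutions, then disambiguation selecting one per word) can be sketched as follows. Note the scoring rule here is a simple lemma-frequency heuristic standing in for the paper's spline-based optimisation, and the candidate analyses and counts are invented:

```python
# Hypothetical out-of-context analyses: each surface form maps to candidate
# (stem, lemma, root) solutions, as an analyser like Alkhalil might return.
CANDIDATES = {
    "ktb": [("katab", "kataba", "ktb"), ("kutub", "kitAb", "ktb")],
    "drs": [("daras", "darasa", "drs")],
}

# Toy corpus frequencies standing in for a trained disambiguation model.
LEMMA_FREQ = {"kataba": 120, "kitAb": 80, "darasa": 60}

def disambiguate(words):
    """Phase two: for each word, keep the candidate whose lemma is most
    frequent (a crude stand-in for the paper's optimisation step)."""
    return [max(CANDIDATES[w], key=lambda sol: LEMMA_FREQ.get(sol[1], 0))
            for w in words]

print(disambiguate(["ktb", "drs"]))
```

The interesting part of the paper is precisely that it replaces such local, word-by-word heuristics with a global optimisation over the whole sentence.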

Details

Applied Computing and Informatics, vol. 20 no. 3/4
Type: Research Article
ISSN: 2634-1964

Keywords

Article
Publication date: 13 July 2020

Issam Tlemsani, Farhi Marir and Munir Majdalawieh

Abstract

Purpose

This paper revolves around the use of data analytics on the Qur'an and Hadith through a new text mining technique to answer the main research question of whether the activities and data flows of the Murabaha financing contract are compatible with Sharia law. The purpose of this paper is to provide a thorough and comprehensive database that will be used to examine existing practices in Islamic banks and improve compliance with Islamic financial law (Sharia).

Design/methodology/approach

To design a Sharia-compliant Murabaha business process grounded in text mining, the authors start by identifying the factors deemed necessary in their text mining of both texts, using a four-step strategy to analyse those text mining analytics. They then list the three basic approaches in text mining used for knowledge discovery in databases: the co-occurrence approach, based on the recursive co-occurrence algorithm; the machine learning or statistical approach; and the knowledge-based approach. Finally, they identify any variation and association between the Murabaha business process produced using text mining and the one developed through data collection.
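The co-occurrence approach starts from plain windowed co-occurrence counts over the source texts, which the recursive algorithm then expands through synonym chains. A minimal standard-library sketch of the counting step (toy English phrases standing in for the actual Arabic source passages):

```python
from collections import Counter

def cooccurrences(sentences, window=3):
    """Count how often two terms co-occur, pairing each token with the
    next (window - 1) tokens in the same sentence."""
    counts = Counter()
    for sent in sentences:
        toks = sent.lower().split()
        for i, a in enumerate(toks):
            for b in toks[i + 1:i + window]:
                counts[tuple(sorted((a, b)))] += 1
    return counts

# Toy phrases standing in for passages on trade and consent.
corpus = ["trade by mutual consent", "consent in trade is required"]
pairs = cooccurrences(corpus)
```

High-count pairs seed the next recursion level, where synonyms of each member are searched for in turn; that expansion, and the mapping of the resulting terms onto Murabaha process activities, is the paper's contribution beyond this counting step.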

Findings

The main finding of this paper is confirmation of the compatibility of all activities and data flows in the Murabaha financing contract produced using data analytics of the Qur'an and Hadith texts with the Murabaha business process developed from data collection. Another key finding is the revelation of some shortcomings in Islamic banks' business process compliance with Sharia law.

Practical implications

Given that Murabaha is the most popular mode of Islamic financing, at more than 75% of total transactions, this research touches on an area of interest to the vast majority of those dealing with Islamic finance instruments. By reaching findings that could improve the existing Islamic Murabaha business process and concluding on the Sharia compliance of that process, this research is highly relevant and could be used in practice as well as to influence public policy. In fact, Islamic Sharia law experts, Islamic finance professionals and Islamic banks may find the results of this study very useful in improving at least one aspect of Islamic finance transactions.

Originality/value

By using novel text mining methods built on the recursive co-occurrence of synonym words from the Qur'an and Hadith to enrich Islamic finance, this research can claim to be the first of its kind to use machine learning to mine the Qur'an and Hadith, extracting valuable knowledge to support and consolidate Islamic financial business processes and make them more compliant with Sharia law.

Details

Journal of Islamic Accounting and Business Research, vol. 11 no. 10
Type: Research Article
ISSN: 1759-0817

Keywords
