Search results

1 – 10 of 404
Open Access
Article
Publication date: 17 July 2020

Imad Zeroual and Abdelhak Lakhouaja

Abstract

Recently, more data-driven approaches are demanding multilingual parallel resources, primarily in cross-language studies. To meet these demands, building multilingual parallel corpora is becoming the focus of many Natural Language Processing (NLP) scientific groups. Unlike monolingual corpora, the number of available multilingual parallel corpora is limited. In this paper, MulTed, a corpus of subtitles extracted from TEDx talks, is introduced. It is multilingual, Part of Speech (PoS) tagged, and bilingually sentence-aligned with English as a pivot language. This corpus is designed for the many NLP applications in which sentence alignment, PoS tagging, and corpus size are influential, such as statistical machine translation, language recognition, and bilingual dictionary generation. Currently, the corpus has subtitles that cover 1100 talks available in over 100 languages. The subtitles are classified by topic, such as Business, Education, and Sport. For the PoS tagging, the Treetagger, a language-independent PoS tagger, is used; then, to make the PoS tagging maximally useful, a mapping process to a universal common tagset is performed. Finally, we believe that making the MulTed corpus available for public use can be a significant contribution to the literature of NLP and corpus linguistics, especially for under-resourced languages.
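The mapping step mentioned in the abstract can be illustrated with a minimal sketch. The tag table below is a hypothetical, partial mapping from Penn-Treebank-style tags to coarse universal tags; it is an assumption for illustration only and is not the mapping used for MulTed. The names `TAG_MAP` and `to_universal` are likewise invented here.

```python
# Hypothetical, partial mapping from language-specific tags to coarse
# universal tags. Illustrative only; not the MulTed mapping.
TAG_MAP = {
    "NN": "NOUN", "NNS": "NOUN",
    "JJ": "ADJ",
    "DT": "DET",
    "VB": "VERB", "VBD": "VERB",
    "RB": "ADV",
    "IN": "ADP",
}


def to_universal(tagged_tokens, tag_map=TAG_MAP, unknown="X"):
    """Replace each language-specific tag with its universal counterpart.

    tagged_tokens: list of (word, tag) pairs.
    Tags missing from tag_map are mapped to the `unknown` label.
    """
    return [(word, tag_map.get(tag, unknown)) for word, tag in tagged_tokens]


if __name__ == "__main__":
    print(to_universal([("the", "DT"), ("talks", "NNS"), ("inspired", "VBD")]))
```

Mapping every language's tagset onto one shared inventory is what would make tags comparable across the aligned sentence pairs.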

Details

Applied Computing and Informatics, vol. 18 no. 1/2
Type: Research Article
ISSN: 2210-8327

Open Access
Article
Publication date: 13 July 2020

Dalia Hamed

Abstract

Purpose

The purpose of this study is to apply a corpus-assisted analysis of keywords and their collocations in the US presidential discourse from Clinton to Trump to discover the meanings of these words and the collocates they have. Keywords are salient words in a corpus whose frequency is unusually high (positive keywords) or low (negative keywords) in comparison with a reference corpus. Collocation is the co-occurrence of words.
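As a minimal sketch of these two definitions, the code below scores keyness with the log-likelihood statistic, one common choice for comparing a target corpus with a reference corpus, and counts collocates within a fixed window. The abstract does not say which statistic or window size the study used, so both are assumptions, and the function names are illustrative.

```python
import math
from collections import Counter


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Signed log-likelihood keyness of one word.

    Positive: relatively more frequent in the target corpus.
    Negative: relatively less frequent than in the reference corpus.
    Both corpus sizes must be greater than zero.
    """
    combined = freq_target + freq_ref
    total = size_target + size_ref
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    if freq_target > 0:
        g2 += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    g2 *= 2
    if freq_target / size_target < freq_ref / size_ref:
        return -g2
    return g2


def keyness(target_tokens, ref_tokens):
    """Score every word found in either (non-empty) token list."""
    target, ref = Counter(target_tokens), Counter(ref_tokens)
    n_target, n_ref = len(target_tokens), len(ref_tokens)
    return {
        word: log_likelihood(target[word], n_target, ref[word], n_ref)
        for word in set(target) | set(ref)
    }


def collocates(tokens, node, window=2):
    """Count words occurring within `window` tokens of each `node` hit."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            start = max(0, i - window)
            counts.update(tokens[start:i] + tokens[i + 1:i + 1 + window])
    return counts
```

Sorting the `keyness` scores in descending order would list positive keywords first and negative keywords last.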

Design/methodology/approach

To achieve this purpose, keywords and collocations are generated with AntConc, a corpus-processing tool.

Findings

This analysis sheds light on the similarities and differences amongst the past four American presidents concerning their key topics. Keyword analysis through keyness makes it evident that Clinton and Obama, being Democrats, demonstrate a clear tendency to improve Americans’ lives within their social sphere. Obama surpasses Clinton as regards foreign affairs. Clinton and Obama’s infrequent subjects have to do with terrorism and immigration. This is consistent with their condensed focus on social and economic improvements. Bush, a Republican, concentrates only on external issues. This is shown by his keywords signifying war against terrorism. Bush’s negative use of words marking cooperative actions conforms to his positive use of words indicating external war. Trump’s positive keywords are about exaggerated descriptions without a defined target. He also shows an unusual frequency in referring to his name and position. His words used with negative keyness refer to reforming programs and external issues. Collocations around each top content keyword clarify the word and harmonize with the presidential orientation negotiated by the keywords.

Research limitations/implications

Limitations have to do with the issue of the accurate representation of the samples.

Originality/value

This research is original in its methodology of applying corpus linguistics tools in the analysis of presidential discourses.

Details

Journal of Humanities and Applied Social Sciences, vol. 3 no. 2
Type: Research Article
ISSN: 2632-279X

Open Access
Article
Publication date: 30 May 2022

Amani Mejri

Abstract

Purpose

This corpus-based study provides a descriptive account of the distribution of the polysemous noun nafs in two Arabic varieties, Modern Standard Arabic (MSA) and Classical Arabic (CA). The research objective is to survey the use of nafs as a reflexive marker in local binding domains and as a self-intensifier in NP-adjoined positions.

Design/methodology/approach

The consulted corpora are the Timestamped JSI Web corpus for MSA and the Quran corpus for CA. Allowing for differences in corpus size, MSA and CA exhibit a pattern of difference and similarity in the distribution of nafs.

Findings

In the modern variety, nafs is pervasively used as a reflexive marker in canonical binding domains, along with a less frequent, yet notable, intensifier use, and these uses are partially and cautiously attributed to the specific genre in which they occur. In CA, nafs mainly recurs as a polysemous noun, along with extensive use as a reflexive marker in local binding settings. As an intensifier, nafs is entirely absent from the CA corpus, just as it is absent from VP-constituent extraction in MSA.

Originality/value

This study examines whether nafs, as a reflexive marker, deviates from canonical binding in Arabic the way English reflexive pronouns do. Building a general account of this distribution is relevant to understanding the explicit (syntactic) and implicit (discourse-based) dimensions of reflexive marker and self-intensifier processing and interpretation in Arabic as a first and second language.

Details

Saudi Journal of Language Studies, vol. 2 no. 2
Type: Research Article
ISSN: 2634-243X

Open Access
Article
Publication date: 7 June 2021

Azniza Hartini Azrai Azaimi Ambrose and Fadhilah Abdullah Asuhaimi

Abstract

Purpose

The purpose of this paper is to comprehensively discuss the issue of risk vis-à-vis the perpetuity restriction principle inherent in waqf (Islamic endowment). Specifically, it attempts to consolidate the axioms in both conventional and Islamic finance, such as the risk-return trade-off and al-ghunm bi al-ghurm (liability accompanies gain), with the perpetual nature of waqf. Overall, this paper attempts to find a resolution to the dilemma of perpetuity restriction inherent in cash waqf against the natural occurrence of the risk.

Design/methodology/approach

This paper is based on the secondary research methodology; past literature encompassing journal articles, books, relevant financial axioms, fatwas (Islamic rulings) and state enactments is critically reviewed to present its case. In regard to state enactments, only Malaysian state enactments have been used, thus restricting the study to the Malaysian case only.

Findings

This study contends that the dilemma of the perpetuity restriction and the natural occurrence of risk can be resolved through the integration of waqf risk management, especially concerning cash waqf, with the Islamic spiritual approach. By implementing standard operating procedures that inculcate awareness on waqf risk management and Islamic spirituality in waqf stakeholders (wāqif (donor), trustee and beneficiaries), the stakeholders may accept the reality of risk that is inevitable even after all efforts have been exhausted. In other words, the violation of perpetuity is exonerated given that mental faculties aligned with revealed texts have been exhaustively used beforehand.

Practical implications

Findings from this study may broaden the choice of investment avenues for waqf trustees while adhering to the perpetual restriction of waqf. More importantly, waqf trustees will not be forced to invest in interest-bearing securities or be involved in any usurious transactions just to obtain guaranteed returns and preserve the corpus of waqf.

Originality/value

This study offers a unique perspective on cash waqf risk management by re-analyzing the axioms and concepts of finance and waqf while observing the welfare of the beneficiaries.

Details

ISRA International Journal of Islamic Finance, vol. 13 no. 2
Type: Research Article
ISSN: 0128-1976

Open Access
Article
Publication date: 30 June 2023

Carmel Bond, Gemma Stacey, Greta Westwood and Louisa Long

Abstract

Purpose

The purpose of this paper is to evaluate the impact of leadership development programmes, underpinned by Transformational Learning Theory (TLT).

Design/methodology/approach

A corpus-informed analysis was conducted using survey data from 690 participants. Data were collected from participants’ responses to the question “please tell us about the impact of your overall experience”, which culminated in a combined corpus of 75,053 words.

Findings

Findings identified patterns of language clustered around the following frequently used word types, namely, confidence; influence; self-awareness; insight; and impact.

Research limitations/implications

This in-depth qualitative evaluation of participants’ feedback has provided insight into how TLT can be applied to develop future health-care leaders. The extent to which learning has had a transformational impact at the individual level, in relation to their perceived ability to influence, holds promise for the wider impact of this group in relation to policy, practice and the promotion of clinical excellence in the future. However, the latter can only be ascertained by undertaking further realist evaluation and longitudinal study to understand the mechanisms by which transformational learning occurs and is successfully translated to influence in practice.

Originality/value

Previous research has expounded traditional leadership theories to guide the practice of health-care leadership development. The paper goes some way to demonstrate the impact of using the principles of TLT within health-care leadership development programmes. The approach taken by The Florence Nightingale Foundation has the potential to generate confident leaders who may be instrumental in creating positive changes across various clinical environments.

Details

Leadership in Health Services, vol. 37 no. 5
Type: Research Article
ISSN: 1751-1879

Open Access
Article
Publication date: 1 September 2021

Rosita Belinda Maglie and Laura Centonze

Abstract

Purpose

The purpose of this paper is to explore two channels of communication (i.e. texts and images) from a non-governmental organization website called #DisruptAging with the aim of finding how multimodal knowledge dissemination contributes to dismantling misconceptions about the aging process.

Design/methodology/approach

This analysis is based on an integrated approach that combines corpus-assisted discourse analysis (cf. Semino and Short, 2004; Baker et al., 2008; Baker, 2010) and multimodal critical discourse analysis (Machin and Mayr, 2012) via the American Medical Association format (2007) and the suite of FrameWorks tools (2015, 2017), which are applied to the collection of texts and images taken from #DisruptAging.

Findings

A total of 69 stories corresponding with 218 images of older adults have been shown to be powerful textual and semiotic resources, designed for both educational and awareness-raising purposes, to promote the so-called “aging well discourse” (cf. Loos et al., 2017).

Social implications

This discursive approach to the textual and visual material found in #DisruptAging hopes to influence the governing institutions that we construct, and the people who are given power to run them, with the goal of fostering fair treatment of older people within society.

Originality/value

There is a lack of studies investigating counter-discourse forms available online, which use textual and visual language to change the way society conceives the idea of aging.

Open Access
Article
Publication date: 8 December 2020

Matjaž Kragelj and Mirjana Kljajić Borštnar

Abstract

Purpose

The purpose of this study is to develop a model for automated classification of old digitised texts to the Universal Decimal Classification (UDC), using machine-learning methods.

Design/methodology/approach

The general research approach is inherent to design science research, in which the problem of UDC assignment of the old, digitised texts is addressed by developing a machine-learning classification model. A corpus of 70,000 scholarly texts, fully bibliographically processed by librarians, was used to train and test the model, which was used for classification of old texts on a corpus of 200,000 items. Human experts evaluated the performance of the model.

Findings

Results suggest that machine-learning models can correctly assign the UDC at some level for almost any scholarly text. Furthermore, the model can be recommended for the UDC assignment of older texts. Ten librarians corroborated this on 150 randomly selected texts.

Research limitations/implications

The main limitations of this study were unavailability of labelled older texts and the limited availability of librarians.

Practical implications

The classification model can provide a recommendation to the librarians during their classification work; furthermore, it can be implemented as an add-on to full-text search in the library databases.

Social implications

The proposed methodology supports librarians by recommending UDC classifiers, thus saving time in their daily work. By automatically classifying older texts, digital libraries can provide a better user experience by enabling structured searches. These contribute to making knowledge more widely available and useable.

Originality/value

These findings contribute to the field of automated classification of bibliographical information with the usage of full texts, especially in cases in which the texts are old, unstructured and in which archaic language and vocabulary are used.

Details

Journal of Documentation, vol. 77 no. 3
Type: Research Article
ISSN: 0022-0418

Open Access
Article
Publication date: 28 April 2020

Constance Mambet Doue, Oscar Navarro Carrascal, Diego Restrepo, Nathalie Krien, Delphine Rommel, Colin Lemee, Marie Coquet, Denis Mercier and Ghozlane Fleury-Bahi

Abstract

Purpose

Based on social representation theory, this study aims to evaluate and analyze the similarities and differences between social representations of climate change held by people living in two territories, which have in common that they are exposed to coastal risks but have different socio-cultural contexts: on the one hand, Cartagena (Colombia) and on the other, Guadeloupe (French overseas department, France).

Design/methodology/approach

A dual approach to social representation theory, both quantitative and qualitative, was adopted. The data collection was undertaken in two phases. First, the content and organization of the social representation of climate change (SRCC) were examined with a quantitative study of 946 participants for both countries, followed by a qualitative study of 63 participants, also for both countries.

Findings

The study finds unicity in the SRCC for the quantitative study. In contrast, the qualitative study highlights differences at the level of the institutional anchoring of the climate change phenomenon in these two different socioeconomic and political contexts.

Practical implications

These results are relevant for a reflection in terms of public policies for the prevention and management of collective natural risks, as well as for the promotion of ecological behavior adapted to political and ideological contexts.

Originality/value

The use of a multi-methodological approach (quantitative and qualitative) in the same research is valuable to confirm the importance of an in-depth study of the social representations of climate change because of the complexity of the phenomenon.

Details

International Journal of Climate Change Strategies and Management, vol. 12 no. 3
Type: Research Article
ISSN: 1756-8692

Open Access
Article
Publication date: 18 November 2021

Shin'ichiro Ishikawa

Abstract

Purpose

Using a newly compiled corpus module consisting of utterances from Asian learners during L2 English interviews, this study examined how Asian EFL learners' L1s (Chinese, Indonesian, Japanese, Korean, Taiwanese and Thai), their L2 proficiency levels (A2, B1 low, B1 upper and B2+) and speech task types (picture descriptions, roleplays and QA-based conversations) affected four aspects of vocabulary usage (number of tokens, standardized type/token ratio, mean word length and mean sentence length).

Design/methodology/approach

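The four aspects of vocabulary usage (number of tokens, standardized type/token ratio, mean word length and mean sentence length) concern speech fluency, lexical richness, lexical complexity and structural complexity, respectively.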
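The four measures can be computed along the following lines. This is a minimal sketch under stated assumptions: the input is already split into sentences and tokens, the standardized type/token ratio is taken as the mean type/token ratio over consecutive fixed-size chunks, and the default chunk size of 1000 is a conventional value rather than one reported in the study. The function names are illustrative.

```python
def standardized_ttr(tokens, chunk_size=1000):
    """Mean type/token ratio over consecutive full chunks of tokens.

    Falls back to the plain type/token ratio when the text is shorter
    than one chunk, and to 0.0 for an empty text.
    """
    if not tokens:
        return 0.0
    chunks = [
        tokens[i:i + chunk_size]
        for i in range(0, len(tokens) - chunk_size + 1, chunk_size)
    ]
    if not chunks:
        return len(set(tokens)) / len(tokens)
    return sum(len(set(chunk)) / chunk_size for chunk in chunks) / len(chunks)


def vocabulary_profile(sentences, chunk_size=1000):
    """Compute the four measures for a list of tokenized sentences."""
    tokens = [token for sentence in sentences for token in sentence]
    n_tokens = len(tokens)
    return {
        "tokens": n_tokens,
        "sttr": standardized_ttr(tokens, chunk_size),
        "mean_word_length": (
            sum(len(token) for token in tokens) / n_tokens if n_tokens else 0.0
        ),
        "mean_sentence_length": (
            n_tokens / len(sentences) if sentences else 0.0
        ),
    }


if __name__ == "__main__":
    sample = [["i", "like", "music"], ["i", "like", "playing", "the", "piano"]]
    print(vocabulary_profile(sample, chunk_size=4))
```

Averaging over equal-sized chunks keeps the ratio comparable between speakers who produce different amounts of speech, which a raw type/token ratio does not.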

Findings

Subsequent corpus-based quantitative data analyses revealed that (1) learner/native speaker differences existed during the conversation and roleplay tasks in terms of the number of tokens, type/token ratio and sentence length; (2) an L1 group effect existed in all three task types in terms of the number of tokens and sentence length; (3) an L2 proficiency effect existed in all three task types in terms of the number of tokens, type-token ratio and sentence length; and (4) the usage of high-frequency vocabulary was influenced more strongly by the task type and it was classified into four types: Type A vocabulary for grammar control, Type B vocabulary for speech maintenance, Type C vocabulary for negotiation and persuasion and Type D vocabulary for novice learners.

Originality/value

These findings provide clues for better understanding L2 English vocabulary usage among Asian learners during speech.

Details

PSU Research Review, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2399-1747

Open Access
Article
Publication date: 4 August 2020

Mohamed Boudchiche and Azzeddine Mazroui

Abstract

In this paper, we develop a hybrid morphological disambiguation system for the Arabic language that identifies the stem, lemma and root of the words in a given sentence. Following an out-of-context analysis performed by the morphological analyser Alkhalil Morpho Sys, the system first identifies all the potential tags of each word of the sentence. Then, a disambiguation phase is carried out to choose, for each word, the right solution among those obtained during the first phase. This problem has been solved by equating the disambiguation issue with a surface optimization problem of spline functions. Tests have shown the value of this approach and its superior performance compared with the state of the art.

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964
