Search results

1 – 10 of 529
Article
Publication date: 7 January 2021

Makoto Nakayama and Yun Wan

Abstract

Purpose

The purpose of this paper is to call researchers’ attention to cross-cultural research using online consumer reviews and multilingual textual analysis.

Design/methodology/approach

The authors present a selective literature review and highlight four studies that show cross-cultural differences in online reviews of ethnic restaurants.

Findings

Applying multilingual textual analysis could open new avenues to verify and expand future cross-cultural research in tourism and hospitality.

Originality/value

The paper introduces examples of multilingual textual analysis used for cross-cultural studies.

Details

International Journal of Culture, Tourism and Hospitality Research, vol. 15 no. 2
Type: Research Article
ISSN: 1750-6182

Article
Publication date: 11 March 2014

Elaine Ménard and Margaret Smithglass

Abstract

Purpose

The purpose of this paper is to present the results of the first phase of a research project that aims to develop a bilingual interface for the retrieval of digital images. The main objective of this extensive exploration was to identify the characteristics and functionalities of existing search interfaces and similar tools available for image retrieval.

Design/methodology/approach

An examination of 159 resources that offer image retrieval was carried out. First, general search functionalities offered by content-based image retrieval systems and text-based systems are described. Second, image retrieval in a multilingual context is explored. Finally, the search functionalities provided by four types of organisations (libraries, museums, image search engines and stock photography databases) are investigated.

Findings

The analysis of functionalities offered by online image resources revealed a very high degree of consistency within the types of resources examined. The resources found to be the most navigable and interesting to use were those built with standardised vocabularies combined with a clear, compact and efficient user interface. The analysis also highlights that many search engines are equipped with multiple language support features. A translation device, however, is implemented in only a few search engines.

Originality/value

The examination of best practices for image retrieval and the analysis of the real users' expectations, which will be obtained in the next phase of the research project, constitute the foundation upon which the search interface model that the authors propose to develop is based. It also provides valuable suggestions and guidelines for search engine researchers, designers and developers.

Article
Publication date: 29 December 2022

Thea Williamson and Aris Clemons

Abstract

Purpose

Little research has been done exploring the nature of multilingual students who are not categorized as English language learners (ELLs) in English language arts (ELA) classes. This study about a group of multilingual girls in an ELA class led by a monolingual white teacher aims to show how, when a teacher makes space for translanguaging practices in ELA, multilingual students disrupt norms of English only.

Design/methodology/approach

The authors use reconstructive discourse analysis to understand translanguaging across a variety of linguistic productions for a group of four focal students. Data sources include fieldnotes from 29 classroom observations, writing samples and process documents and 8.5 h of recorded classroom discourse.

Findings

Students used multilingualism across a variety of discourse modes, frequently in spoken language and rarely in written work. Translanguaging was most present in small-group peer talk structures, where students did relationship building, generated ideas for writing and managed their writing agendas, including feelings about writing. In addition, Spanish served as “elevated vocabulary” in writing. Across discourse modes, translanguaging served to develop academic proficiency in writing.

Originality/value

The authors propose a more expansive approach to data analysis in English-mostly cases – i.e. environments shaped by multilingual students in monolingual school contexts – to argue for anti-deficit approaches to literacy development for multilingual students. Analyzing classroom talk alongside literacy allows for a more nuanced understanding of translanguaging practices in academic writing. They also show how even monolingual teachers can disrupt monolingual hegemony in ELA classrooms with high populations of multilingual students.

Details

English Teaching: Practice & Critique, vol. 22 no. 1
Type: Research Article
ISSN: 1175-8708

Article
Publication date: 8 July 2010

Elaine Ménard

Abstract

Purpose

This paper seeks to examine image retrieval within two different contexts: a monolingual context where the language of the query is the same as the indexing language and a multilingual context where the language of the query is different from the indexing language. The study also aims to compare two different approaches for the indexing of ordinary images representing common objects: traditional image indexing with the use of a controlled vocabulary and free image indexing using uncontrolled vocabulary.

Design/methodology/approach

This research uses three data collection methods. An analysis of the indexing terms was employed in order to examine the multiplicity of term types assigned to images. A simulation of the retrieval process involving a set of 30 images was performed with 60 participants. The quantification of the retrieval performance of each indexing approach was based on the usability measures, that is, effectiveness, efficiency and satisfaction of the user. Finally, a questionnaire was used to gather information on searcher satisfaction during and after the retrieval process.

Findings

The results of this research are twofold. The analysis of indexing terms associated with all 3,950 images provides a comprehensive description of the characteristics of the four non‐combined indexing forms used for the study. Also, the retrieval simulation results offer information about the relative performance of the six indexing forms (combined and non‐combined) in terms of their effectiveness, efficiency (temporal and human) and the image searcher's satisfaction.

Originality/value

The findings of the study suggest that, in the near future, information systems could benefit from allowing an increased coexistence of controlled and uncontrolled vocabularies, the latter resulting from collaborative image tagging, for example, and from giving users the possibility to participate dynamically in the image‐indexing process in a more user‐centred way.

Details

Aslib Proceedings, vol. 62 no. 4/5
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 6 January 2022

Hanan Alghamdi and Ali Selamat

Abstract

Purpose

With the proliferation of terrorist/extremist websites on the World Wide Web, it has become progressively more crucial to detect and analyze the content on these websites. Accordingly, the volume of previous research focused on identifying the techniques and activities of terrorist/extremist groups, as revealed by their sites on the so-called dark web, has also grown.

Design/methodology/approach

This study presents a review of the techniques used to detect and process the content of terrorist/extremist sites on the dark web. Forty of the most relevant data sources were examined, and various techniques were identified among them.

Findings

Based on this review, it was found that methods of feature selection and feature extraction can be used as topic modeling with content analysis and text clustering.

Originality/value

The review concludes by presenting the current state of the art and certain open issues associated with Arabic dark web content analysis.

Details

Data Technologies and Applications, vol. 56 no. 4
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 1 November 2005

Mohamed Hammami, Youssef Chahir and Liming Chen

Abstract

Along with the ever-growing Web comes the proliferation of objectionable content, such as sex, violence and racism. We need efficient tools for classifying and filtering undesirable web content. In this paper, we investigate this problem through WebGuard, our automatic machine learning-based pornographic website classification and filtering system. As the Internet becomes increasingly visual and multimedia-oriented, as exemplified by pornographic websites, we focus our attention here on the use of skin color-related visual content-based analysis, along with textual and structural content-based analysis, to improve pornographic website filtering. While most commercial filtering products on the market rely mainly on textual content-based analysis, such as indicative keyword detection or checking against manually collected blacklists, the originality of our work lies in adding structural and visual content-based analysis to the classical textual content-based analysis, along with several major data mining techniques for learning and classifying. In experiments on a testbed of 400 websites, comprising 200 adult sites and 200 non-pornographic ones, WebGuard, our Web filtering engine, scored a 96.1% classification accuracy rate when only textual and structural content-based analysis was used, and a 97.4% classification accuracy rate when skin color-related visual content-based analysis was added. Further experiments on a blacklist of 12,311 adult websites, manually collected and classified by the French Ministry of Education, showed that WebGuard scored an 87.82% classification accuracy rate when using only textual and structural content-based analysis, and a 95.62% classification accuracy rate when the visual content-based analysis was added. The basic framework of WebGuard can apply to other categorization problems involving websites that combine, as most of them do today, textual and visual content.

Details

International Journal of Web Information Systems, vol. 1 no. 4
Type: Research Article
ISSN: 1744-0084

Book part
Publication date: 4 December 2012

Matthias Görtz, Thomas Mandl, Katrin Werner and Christa Womser-Hacker

Abstract

Purpose – Global cooperation between and within organisations has become essential for successful businesses. For the information management within such an international and necessarily multilingual environment, new challenges arise due to the diversity of the stakeholders and participants as well as due to the heterogeneity of approaches and traditions of information handling.

Design/methodology/approach – Key technologies like search technologies need to be adapted to support content in multiple languages and efficient access to it. Information processes need to be analysed while bearing in mind that problems may arise due to cross-cultural misunderstandings. The diversity requires appropriate treatment and appropriate methods in information systems in order to improve international information flows.

Findings – This chapter identifies some of these challenges and shows how they can be approached from an information science perspective. User-oriented research at the University of Hildesheim in the areas of information retrieval, information seeking and human–computer interaction is presented.

Originality/value – Global enterprises and organisations may use this chapter to identify challenges and solutions for adapting their information technology to an international scale. Researchers who work on multilingual information access and intercultural aspects of information systems get an overview of some current research.

Details

Library and Information Science Trends and Research: Europe
Type: Book
ISBN: 978-1-78052-714-7

Article
Publication date: 1 May 1993

Edmond Lassalle

Abstract

The use of an information retrieval (IR) system would be easier if natural language processing were applied. There are essentially two different ways to use NLP techniques: as a user interface coupled with a factual database, or as an integrated part of a system which deals with a textual database. In this paper, two approaches are presented, that of MGS, a commercialized system in use in France Télécom, and that of Telmi, a France Télécom research system. Telmi is an information retrieval system designed for use with medium sized databases of short text. The characteristics of the system include fine‐grained NLP, an open domain and large scale knowledge base, automated indexing based on conceptual representation of texts, and reusability of the NLP tools. The knowledge base is (semi) automatically extracted from a monolingual machine‐readable dictionary (MRD). Telmi is integrated into a production‐scale prototype which implements a Minitel Information Service (IS) for the use of the general public. France Télécom Minitel and its problems are described, along with the solutions Telmi offers. The paper then goes on to describe how France Télécom intends to reuse, in a continuation of the present project, the Telmi tools in a multilingual system, particularly in (semi)automatic data acquisition from multilingual MRDs.

Details

Aslib Proceedings, vol. 45 no. 5
Type: Research Article
ISSN: 0001-253X

Article
Publication date: 25 October 2022

Victor Diogho Heuer de Carvalho and Ana Paula Cabral Seixas Costa

Abstract

Purpose

This article presents two Brazilian Portuguese corpora collected from different media concerning public security issues in a specific location. The primary motivation is supporting analyses, so security authorities can make appropriate decisions about their actions.

Design/methodology/approach

The corpora were obtained through web scraping from a newspaper's website and tweets from a Brazilian metropolitan region. Natural language processing was then applied, comprising text cleaning, lemmatization, summarization, part-of-speech tagging and dependency parsing, named entity recognition, and topic modeling.

Findings

The methodology produced several results, including an example of automated summarization; dependency parsing; the most common topics in each corpus; and the extraction of forty named entities and the most common slogans, highlighting those linked to public security.

Research limitations/implications

Several critical tasks related to the applied methodology were identified for future research: treating noise that arises when collecting news from source websites, as well as textual elements common in social network posts, such as abbreviations, emojis/emoticons and even writing errors; treating subjectivity, to eliminate noise from irony and sarcasm; and searching for authentic news on issues within the target domain. All these tasks aim to improve the process so that interested authorities can perform accurate analyses.

Practical implications

The corpora dedicated to the public security domain enable several analyses, such as mining public opinion on security actions in a given location; understanding criminals' behaviors reported in the news or even on social networks and drawing a timeline of their attitudes; detecting, through texts from social networks, movements that may cause damage to public property and people's welfare; and extracting the history and repercussions of police actions by cross-referencing news with records on social networks, among many other possibilities.

Originality/value

The work on the corpora reported in this text represents one of the first initiatives to create textual bases in Portuguese dedicated to Brazil's specific public security domain.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 22 March 2022

Djamila Mohdeb, Meriem Laifa, Fayssal Zerargui and Omar Benzaoui

Abstract

Purpose

The present study was designed to investigate eight research questions related to the analysis and detection of dialectal Arabic hate speech targeting African refugees and illegal migrants in the Algerian YouTube space.

Design/methodology/approach

The transfer learning approach, which currently represents the state of the art in natural language processing tasks, was used to classify and detect hate speech in Algerian dialectal Arabic. In addition, a descriptive analysis was conducted to answer the analytical research questions, which aim to measure and evaluate the presence of anti-refugee/migrant discourse on the YouTube social platform.

Findings

Data analysis revealed that there has been a gradual modest increase in the number of anti-refugee/migrant hateful comments on YouTube since 2014, a sharp rise in 2017 and a sharp decline in later years until 2021. Furthermore, our findings stemming from classifying hate content using multilingual and monolingual pre-trained language transformers demonstrate a good performance of the AraBERT monolingual transformer in comparison with the monodialectal transformer DziriBERT and the cross-lingual transformers mBERT and XLM-R.

Originality/value

Automatic hate speech detection in languages other than English is quite a challenging task that the literature has tried to address by various approaches of machine learning. Although the recent approach of cross-lingual transfer learning offers a promising solution, tackling this problem in the context of the Arabic language, particularly dialectal Arabic makes it even more challenging. Our results cast a new light on the actual ability of the transfer learning approach to deal with low-resource languages that widely differ from high-resource languages as well as other Latin-based, low-resource languages.

Details

Aslib Journal of Information Management, vol. 74 no. 6
Type: Research Article
ISSN: 2050-3806
