Search results

Publication date: 6 April 2012

Analysing user's queries for cross‐language image retrieval from digital library collections

Downloads

1316

Abstract

Purpose

This paper aims to describe a study of the queries generated from a user experiment for cross‐language information retrieval (CLIR) from a historic image archive.

Design/methodology/approach

A controlled lab‐based user study was carried out using a prototype Italian‐English image retrieval system. Participants were asked to search for 16 images provided to them (a known‐item search task), and the Italian‐speaking users generated 618 queries across these tasks. Users' interactions with the system were recorded, and the queries were analysed manually, both quantitatively and qualitatively. The results were used to suggest recommendations for the future development of cross‐language retrieval systems for digital image libraries.

Findings

Results highlight the diversity in requests for similar visual content and the weaknesses of machine translation for query translation. Through the manual translation of queries, the authors show the benefits of using high‐quality translation resources. The results show the individual characteristics of users while performing known‐item searches and the overlap obtained between query terms and structured image captions, highlighting the use of users' search terms for objects within the foreground of an image.

Research limitations/implications

This research looks in depth into one case of interaction and one image repository. Despite this limitation, the discussed results are likely to be valid across other languages and image repositories.

Practical implications

Developing effective systems requires studying users' search behaviours, particularly in digital image libraries.

Originality/value

The growing quantity of digital visual material in digital libraries offers the potential to apply techniques from CLIR to provide cross‐language information access services. The value of this paper is in the provision of empirical evidence to support recommendations for effective cross‐language image retrieval system design.

Details

The Electronic Library, vol. 30 no. 2

Type: Research Article

ISSN: 0264-0473

Publication date: 18 July 2016

A study of user profile representation for personalized cross-language information retrieval

Dong Zhou, Séamus Lawless, Xuan Wu, Wenyu Zhao and Jianxun Liu

Downloads

1161

Abstract

Purpose

With an increase in the amount of multilingual content on the World Wide Web, users are often striving to access information provided in a language of which they are non-native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and investigate their use in personalized cross-language information retrieval (CLIR) systems through the means of personalized query expansion.

Design/methodology/approach

The user profiles consist of weighted terms computed by using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. This paper also proposes an automatic evaluation method for comparing various user profile generation techniques and query expansion methods.

Findings

Experimental results suggest that latent semantic-weighted user profile representation techniques are superior to frequency-based methods, and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained on a cross-lingual level gained better performance than the models trained on a monolingual level.

Originality/value

Previous studies on personalized information retrieval systems have primarily investigated user profiles and personalization strategies on a monolingual level. The effect of utilizing such monolingual profiles for personalized CLIR remains unclear. The current study fills this gap with a comprehensive study of user profile representation for personalized CLIR and a novel personalized CLIR evaluation methodology that ensures repeatable and controlled experiments can be conducted.

Details

Aslib Journal of Information Management, vol. 68 no. 4

Type: Research Article

ISSN: 2050-3806

Publication date: 31 August 2012

A study of relevance feedback techniques in interactive multilingual information access

Dan Wu, Daqing He and Xiaomei Xu

Downloads

796

Abstract

Purpose

With the vast amount of multilingual information available online, it becomes increasingly critical for libraries to use various multilingual information access techniques in order to effectively support patrons' online information requests. However, this is still a relatively under‐explored area. This paper aims to study the effectiveness and the adoptability of query expansion and translation enhancement in the context of interactive multilingual information access.

Design/methodology/approach

Relying on an interactive multilingual information access system called ICE‐TEA, the authors conducted a controlled experiment (English‐to‐Chinese translation) involving human subjects to assess the retrieval effectiveness, analyzed the collected search logs to examine users' behavior, and employed pre‐ and post‐questionnaires to obtain users' opinions about the system.

Findings

The results confirm that significant improvement in retrieval effectiveness can be achieved by combining query expansion with translation enhancement (as compared to a case when there is no relevance feedback). However, users' ability to understand, interact with and even perceive the complex process of searches involving the combination of query expansion and translation enhancement may greatly impact the effectiveness of the techniques. The results also confirm that human‐generated queries were short queries, which calls for careful consideration of how longer queries perform in real search because many search engines rely on longer and more complex queries.

Originality/value

This study examines two important relevance feedback techniques in the context of human‐involved multilingual information access. This study is a valuable addition to the information seeking behaviour literature.

Details

Library Hi Tech, vol. 30 no. 3

Type: Research Article

ISSN: 0737-8831

Publication date: 6 April 2012

Multilinguality in the digital library: A review

Anne R. Diekema

Downloads

2648

Abstract

Purpose

Together, increasing globalization and the internet created fertile grounds for the establishment of multilingual digital libraries. Providing cross‐lingual access to materials is of particular interest to political entities such as the European Union, which currently has 23 official languages, but also to multinational companies and countries that have different languages represented among their citizens. The main objective of this paper is to review the literature on multilingual digital libraries and provide an overview of this area.

Design/methodology/approach

Based on a thorough literature search in four different databases, a core set of literature on multilingual digital libraries was retrieved. Literature on various aspects of this topic was reviewed. The paper is organized based on emerging themes directly drawn from the literature. Where warranted, additional literature is brought in to provide necessary background information or clarification.

Findings

Creating a multilingual digital library is a highly complex undertaking and typically requires a collaborative effort between different organizations and people with different areas of expertise. Enabling users to search across languages requires translation resources to cross the language barrier, which can be challenging depending on the language and resource availability. Additional challenges were found to be in data management (localization and language processing), representation (dealing with different fonts and character codes), development (creating international software, cross‐cultural collaboration), and interoperability (system architecture and data sharing). Research in multilingual digital libraries was mostly system based involving experimental systems or system prototypes.

Research limitations/implications

Most likely the literature review does not include all possible journal articles on multilingual digital libraries even though the literature searches done to obtain these articles were thorough and deliberate. Journal articles without the descriptors used in this search and those articles not indexed in the four different databases used in the search will not be included here. The review excludes cross‐language information retrieval research unless it is directly related to existing multilingual digital libraries, or a connection to digital libraries in general is made in the paper itself.

Originality/value

This paper provides the first literature review on the topic of multilingual digital libraries and provides a concise overview of relevant aspects in this area. The number of multilingual digital libraries is growing, as is the interest from the research community in these libraries to apply their research findings from cross‐language information retrieval. This review article provides a valuable entry point to the field of multilingual digital libraries for researchers, practitioners, and other interested parties.

Details

The Electronic Library, vol. 30 no. 2

Type: Research Article

ISSN: 0264-0473

Publication date: 5 September 2008

Who benefits from CLIR in web retrieval?

Eija Airio

Downloads

686

Abstract

Purpose

The aim of the current paper is to test whether query translation is beneficial in web retrieval.

Design/methodology/approach

The language pairs were Finnish‐Swedish, English‐German and Finnish‐French. A total of 12‐18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary‐based system. For English‐German, machine translation was also utilized. The author used Google as the search engine.

Findings

The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query‐translation were better than in the traditional laboratory tests.

Originality/value

This research shows that query translation on the web is beneficial, especially for users with moderate and non‐active language skills. This is valuable information for developers of cross‐language information retrieval systems.

Details

Journal of Documentation, vol. 64 no. 5

Type: Research Article

ISSN: 0022-0418

Publication date: 3 January 2020

Information-seeking in multilingual digital libraries: Comparative case studies of five university students

Hany M. Alsalmi

Downloads

1127

Abstract

Purpose

Less attention has been paid to users’ interactions and behavior in studying multilingual search. Although digital library researchers have yet to assess user interaction and behavior in multilingual search, they have concurred that there is a need for user studies that document the extent to which information retrieval systems meet multilingual users’ needs and expectations. The paper aims to discuss these issues.

Design/methodology/approach

This study is composed of five individual cases. The case study participants were Saudi students enrolled at either a large state university or a Historically Black College or University located in the same community. The research questions are: what do Saudi Digital Library (SDL) users experience when searching within the SDL in Arabic and English, and what strategies do they use if they fail to find resources? Data were collected via a qualitative method called video-stimulated recall.

Findings

In the Arabic search tasks, participants realized that finding resources is not easy. Participants expressed their concerns about the lack of relevance and accuracy of results returned by the search system, indicating weak trust and confidence in the search system. In the English search task, by contrast, participants felt more satisfied and more confident that they could trust the results returned by the search system. Participants expressed their satisfaction with the search experience, as it provided them with accurate and varied resources. Overall, participants had more difficulty finding Arabic resources than English resources in the SDL.

Originality/value

This study is considered one of the earliest works on the information-seeking behavior of users of multilingual digital libraries in the Arabic language. Its value lies in being the first study to investigate and report the information-seeking behavior of SDL users.

Details

Library Hi Tech, vol. 39 no. 1

Type: Research Article

ISSN: 0737-8831

Publication date: 21 September 2012

Exploring the further integration of machine translation in English‐Chinese cross language information access

Dan Wu and Daqing He

Downloads

1079

Abstract

Purpose

This paper seeks to examine the further integration of machine translation technologies with cross language information access in providing web users the capabilities of accessing information beyond language barriers. Machine translation and cross language information access are related technologies, and yet they have their own unique contributions in handling information in multiple languages. This paper aims to demonstrate that there are many opportunities to further integrate machine translation with cross language information access, and the combination can greatly empower web users in their information access.

Design/methodology/approach

Using English and Chinese as the language pair, this paper examines machine translation in query translation‐based cross language information access from several important aspects: query translation, relevance feedback, interactive cross language information access, out‐of‐vocabulary term translation, and data fusion. The goal is to gain more insight into the wide range of uses of machine translation in cross language information access, and to help the community identify promising future directions for both machine translation and cross language access.

Findings

Machine translation can be applied effectively in many places in the whole cross language information access process. Queries translated by a machine translation system are of high quality and are more robust in handling potential untranslated terms. Translation enhancement, a relevance feedback method that uses machine‐translated versions of the returned documents, is not only a valid technique by itself, but also helps to generate more robust cross language information access performance when combined with other relevance feedback techniques. Machine translation is also found to play a significant role in resolving untranslated terms and in data fusion.

Originality/value

This set of comparative empirical studies on integrating machine translation and cross language information access was performed on a common evaluation framework, and examined integration at multiple points of the cross language access process. The experimental results demonstrate the value of further integrating machine translation in cross language information access, and identify interesting future directions for both machine translation and cross language information access research.

Details

Program, vol. 46 no. 4

Type: Research Article

ISSN: 0033-0337

Publication date: 1 May 2006

A study on automatic creation of a comparable document collection in cross‐language information retrieval

Tuomas Talvensaari, Jorma Laurikkala, Kalervo Järvelin and Martti Juhola

Downloads

735

Abstract

Purpose

To present a method for creating a comparable document collection from two document collections in different languages.

Design/methodology/approach

The best query keys were extracted from a Finnish source collection (articles of the newspaper Aamulehti) with the relative average term frequency formula. The keys were translated into English with a dictionary‐based query translation program. The resulting lists of words were used as queries that were run against the target collection (Los Angeles Times articles) with the nearest neighbor method. The documents were aligned with unrestricted and date‐restricted alignment schemes, which were also combined.

Findings

The combined alignment scheme was found to be the best when the relatedness of the document pairs was assessed with a five‐degree relevance scale. Of the 400 document pairs, roughly 40 percent were highly or fairly related and 75 percent included at least lexical similarity.

Research limitations/implications

The number of alignment pairs was small due to the short common time period of the two collections, and their geographical (and thus topical) remoteness. In future, our aim is to build larger comparable corpora in various languages and use them as a source of translation knowledge for the purposes of cross‐language information retrieval (CLIR).

Practical implications

Readily available parallel corpora are scarce. With this method, two unrelated document collections can relatively easily be aligned to create a CLIR resource.

Originality/value

The method can be applied to weakly linked collections and morphologically complex languages, such as Finnish.

Details

Journal of Documentation, vol. 62 no. 3

Type: Research Article

ISSN: 0022-0418

Publication date: 1 June 2001

Morphological typology of languages for IR

Ari Pirkola

Downloads

1144

Abstract

This paper presents a morphological classification of languages from the IR perspective. Linguistic typology research has shown that the morphological complexity of every language in the world can be described by two variables, index of synthesis and index of fusion. These variables provide a theoretical basis for IR research handling morphological issues. A common theoretical framework is needed in particular because of the increasing significance of cross‐language retrieval research and CLIR systems processing different languages. The paper elaborates the linguistic morphological typology for the purposes of IR research. It studies how the indexes of synthesis and fusion could be used as practical tools in mono‐ and cross‐lingual IR research. The need for semantic and syntactic typologies is discussed. The paper also reviews studies made in different languages on the effects of morphology and stemming in IR.

Details

Journal of Documentation, vol. 57 no. 3

Type: Research Article

ISSN: 0022-0418

Publication date: 19 October 2018

Mixed language queries in online searches: A study of intra-sentential code-switching from a qualitative perspective

Hengyi Fu

Downloads

3038

Abstract

Purpose

With the increasing number of online multilingual resources, cross-language information retrieval (CLIR) has drawn much attention from the information retrieval (IR) research community. However, few studies have examined how and why multilingual searchers seek information in two or more languages, specifically how they switch and mix language in queries to get satisfying results. The purpose of this paper is to focus on Chinese–English bilinguals’ intra-sentential code-switching behaviors in online searches. The scenarios and reasons of code-switching, factors that may affect code-switching, the patterns of mixed language query formulation and reformulation and how current IR systems and other search tools can facilitate such information needs were examined.

Design/methodology/approach

In-depth semi-structured interviews were used as the research method. In total, 30 participants were recruited based on their English proficiency, location and profession, using a purposive sampling method.

Findings

Four scenarios and four reasons for using Chinese–English mixed language queries to cover information needs were identified, and results suggest that linguistic and cultural/social factors are of equivalent importance in code-switching behaviors. English terms and Chinese terms in queries play different roles in searches, and mixed language queries are irreplaceable by either single language queries or other search facilitating features. Findings also suggest current search engines and tools need greater emphasis in the user interface and more user education is required.

Originality/value

This study presents a qualitative analysis of bilinguals’ code-switching behaviors in online searches. Findings are expected to advance the theoretical understanding of bilingual users’ search strategies and interactions with IR systems, and provide insights for designing more effective IR systems and tools to discover multilingual online resources, including cross-language controlled vocabularies, personalized CLIR tools and mixed language query assistants.

Details

Aslib Journal of Information Management, vol. 71 no. 1

Type: Research Article

ISSN: 2050-3806