Search results

1 – 10 of 103
Article
Publication date: 21 September 2012

Dan Wu and Daqing He

Abstract

Purpose

This paper seeks to examine the further integration of machine translation technologies with cross language information access in providing web users the capabilities of accessing information beyond language barriers. Machine translation and cross language information access are related technologies, and yet they have their own unique contributions in handling information in multiple languages. This paper aims to demonstrate that there are many opportunities to further integrate machine translation with cross language information access, and the combination can greatly empower web users in their information access.

Design/methodology/approach

Using English and Chinese as the language pair, this paper examines machine translation in query translation‐based cross language information access from several important aspects: query translation, relevance feedback, interactive cross language information access, out‐of‐vocabulary term translation, and data fusion. The goal is to gain more insight into the wide range of uses of machine translation in cross language information access, and to help the community identify promising future directions for both machine translation and cross language access.

Findings

Machine translation can be applied effectively at many points in the cross language information access process. Queries translated by a machine translation system are of high quality and are more robust in handling potentially untranslated terms. Translation enhancement, a relevance feedback method that uses returned documents generated by machine translation, is not only a valid technique by itself, but also helps to produce more robust cross language information access performance when combined with other relevance feedback techniques. Machine translation is also found to play a significant role in resolving untranslated terms and in data fusion.

Originality/value

This set of comparative empirical studies on integrating machine translation and cross language information access was performed on a common evaluation framework, and examined integration at multiple points of the cross language access process. The experimental results demonstrate the value of further integrating machine translation in cross language information access, and identify interesting future directions for both machine translation and cross language information access research.

Article
Publication date: 6 April 2012

Daniela Petrelli and Paul Clough

Abstract

Purpose

This paper aims to describe a study of the queries generated from a user experiment for cross‐language information retrieval (CLIR) from a historic image archive.

Design/methodology/approach

A controlled lab‐based user study was carried out using a prototype Italian‐English image retrieval system. Participants were asked to carry out searches for 16 images provided to them, a known‐item search task; the Italian‐speaking users generated 618 queries. Users' interactions with the system were recorded, and the queries were analysed manually, both quantitatively and qualitatively. The results were used to suggest recommendations for the future development of cross‐language retrieval systems for digital image libraries.

Findings

Results highlight the diversity in requests for similar visual content and the weaknesses of machine translation for query translation. Through the manual translation of queries the authors show the benefits of using high‐quality translation resources. The results show the individual characteristics of users while performing known‐item searches and the overlap obtained between query terms and structured image captions, highlighting that users' search terms tend to describe objects in the foreground of an image.

Research limitations/implications

This research looks in depth into one case of interaction and one image repository. Despite this limitation, the discussed results are likely to be valid across other languages and image repositories.

Practical implications

Developing effective systems requires studying users' search behaviours, particularly in digital image libraries.

Originality/value

The growing quantity of digital visual material in digital libraries offers the potential to apply techniques from CLIR to provide cross‐language information access services. The value of this paper is in the provision of empirical evidence to support recommendations for effective cross‐language image retrieval system design.

Article
Publication date: 31 August 2012

Dan Wu, Daqing He and Xiaomei Xu

Abstract

Purpose

With the vast amount of multilingual information available online, it becomes increasingly critical for libraries to use various multilingual information access techniques in order to effectively support patrons' online information requests. However, this is still a relatively under‐explored area. This paper aims to study the effectiveness and the adoptability of query expansion and translation enhancement in the context of interactive multilingual information access.

Design/methodology/approach

Relying on an interactive multilingual information access system called ICE‐TEA, the authors conducted a controlled experiment (English‐to‐Chinese translation) involving human subjects to assess the retrieval effectiveness, analyzed the collected search logs to examine users' behavior, and employed pre‐ and post‐questionnaires to obtain users' opinions about the system.

Findings

The results confirm that a significant improvement in retrieval effectiveness can be achieved by combining query expansion with translation enhancement (as compared to a case when there is no relevance feedback). However, users' ability to understand, interact with and even perceive the complex process of searches involving the combination of query expansion and translation enhancement may greatly impact the effectiveness of the techniques. The results also confirm that human‐generated queries were short, which calls for careful consideration of how longer queries perform in real search because many search engines rely on longer and more complex queries.

Originality/value

This study examines two important relevance feedback techniques in the context of human‐involved multilingual information access. This study is a valuable addition to the information seeking behaviour literature.

Article
Publication date: 18 July 2016

Dong Zhou, Séamus Lawless, Xuan Wu, Wenyu Zhao and Jianxun Liu

Abstract

Purpose

With an increase in the amount of multilingual content on the World Wide Web, users are often striving to access information provided in a language of which they are non-native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and investigate their use in personalized cross-language information retrieval (CLIR) systems through the means of personalized query expansion.

Design/methodology/approach

The user profiles consist of weighted terms computed by using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. This paper also proposes an automatic evaluation method for comparing various user profile generation techniques and query expansion methods.
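As a rough illustration of the frequency-based end of this design, the sketch below builds a tf-idf weighted term profile from a user's history and uses it to expand a query. It is not the authors' implementation: the function names `tfidf_profile` and `expand_query`, the smoothed idf formula and the toy data are all assumptions made for the example, and BM25 or the latent semantic models would replace the weighting step.

```python
import math
from collections import Counter


def tfidf_profile(user_docs, background_docs, top_k=5):
    """Return the top_k (term, weight) pairs for a user's history.

    user_docs and background_docs are lists of token lists. The weighting
    (raw term frequency times a smoothed idf) is an illustrative choice,
    not the one used in the paper.
    """
    n_docs = len(background_docs)
    # Document frequency: in how many background documents each term occurs.
    doc_freq = Counter(term for doc in background_docs for term in set(doc))
    # Term frequency over the user's whole history.
    term_freq = Counter(term for doc in user_docs for term in doc)
    weights = {
        term: tf * math.log((n_docs + 1) / (doc_freq[term] + 1))
        for term, tf in term_freq.items()
    }
    # Sort by weight (descending), then alphabetically for a stable order.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


def expand_query(query_terms, profile):
    """Append profile terms that are not already in the query."""
    extra = [term for term, _ in profile if term not in query_terms]
    return list(query_terms) + extra


# Hypothetical toy data: a background collection and one user's history.
background = [["the", "cat"], ["the", "dog"], ["the", "translation"]]
history = [["the", "translation", "translation"]]
profile = tfidf_profile(history, background, top_k=1)
expanded = expand_query(["clir"], profile)
```

Here a term that appears in every background document ("the") receives zero weight, so only the distinctive history term is added to the query.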

Findings

Experimental results suggest that latent semantic-weighted user profile representation techniques are superior to frequency-based methods, and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained on a cross-lingual level gained better performance than the models trained on a monolingual level.

Originality/value

Previous studies on personalized information retrieval systems have primarily investigated user profiles and personalization strategies on a monolingual level. The effect of utilizing such monolingual profiles for personalized CLIR remains unclear. The current study fills the gap with a comprehensive study of user profile representation for personalized CLIR and a novel personalized CLIR evaluation methodology that allows repeatable and controlled experiments to be conducted.

Details

Aslib Journal of Information Management, vol. 68 no. 4
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 5 June 2017

Li Si, Qiuyu Pan and Xiaozhe Zhuang

Abstract

Purpose

This paper aims to understand user information behaviours when they perform multilingual information retrieval. It also offers reference for the development of multilingual information retrieval systems and relevant service platforms.

Design/methodology/approach

The authors designed an experiment on multilingual information retrieval with WorldWideScience, used Camtasia Studio 7 (a screen capturing and recording tool) to record the subjects’ overall operational processes and collected participants’ thought processes with think-aloud protocols. Meanwhile, a questionnaire survey and interviews were used to examine the subjects’ background information, their feelings about the experiment and their ideas about the experimental platform, respectively. Thirty-two valid data points were obtained from 41 subjects.

Findings

The users preferred their own language for retrieval. Most users from the social sciences chose general search or advanced search freely according to the tasks. The majority of the participants selected key words directly from the tasks as search terms. Doctoral candidates were more likely to construct a search query with logic symbols. Translation tools were used to assist retrieval and to resolve doubts about translation. When facing obstacles, users most often stayed on the original web page and continued exploring; the next most common response was to return to the homepage.

Originality/value

This paper provides a study of user behaviour through investigating how users behave on the whole process of retrieving multilingual information. The findings offer advice for optimizing the function of multilingual information retrieval systems and service platforms.

Details

The Electronic Library, vol. 35 no. 3
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 3 January 2020

Hany M. Alsalmi

Abstract

Purpose

Less attention has been paid to users’ interactions and behavior in studying multilingual search. Although digital library researchers have yet to assess user interaction and behavior in multilingual search, they have concurred that there is a need for user studies that document the extent to which information retrieval systems meet multilingual users’ needs and expectations. The paper aims to discuss these issues.

Design/methodology/approach

This study is composed of five individual cases. The case study participants were Saudi students enrolled at either a large state university or a Historically Black College and University located in the same community. The research questions are: what do Saudi Digital Library (SDL) users experience when searching within the SDL in Arabic and English, and what strategies do they use if they fail to find resources? Data for this study were collected via a qualitative method called video-stimulated recall.

Findings

In the Arabic search tasks, participants realized that finding resources is not easy. Participants expressed their concerns about the lack of relevance and accuracy of results returned by the search system, indicating weak trust and confidence in the search system. In the English search task, by contrast, participants felt more satisfied and more confident in their ability to trust the results returned by the search system. Participants expressed their satisfaction with the search experience as it provided them with accurate and varied resources. Overall, participants had more difficulty finding Arabic resources than English resources in the SDL.

Originality/value

This study is considered one of the earliest works on information-seeking behavior in multilingual digital libraries in the Arabic language. Its value lies in being the first study to investigate and report the information-seeking behavior of SDL users.

Article
Publication date: 1 April 2022

Dan Wu, Shu Fan, Shengyi Yao and Shuang Xu

Abstract

Purpose

Ethnic minorities (EMs), who make up a sizable proportion of multilingual users, are more likely to browse and search in their native language. Identifying multilingual users' information needs helps in providing public digital cultural services (PDCS) that improve their lives.

Design/methodology/approach

The in-context interview is an efficient way to explore EMs' information needs and evoke their daily experience with PDCS. The material from 31 one-on-one interviews with EMs in China was recorded and analyzed using thematic analysis.

Findings

The findings reveal that language proficiency is a critical factor influencing multilingual information access (MLIA) and multilingual users' information needs. Moreover, language ability, digital literacy and cultural literacy are important components of multilingual information literacy (MLIL), which is helpful for EMs to access PDCS. In light of Kochen's theory, the information needs of PDCS can be classified into the aroused need of resources, the recognized need of functions and services and expressed need. For the expressed need, it is necessary to develop a one-stop convergence platform of PDCS to process various requests of resources, functions and services in the future.

Originality/value

The findings will be valuable for governments, public institutions and social organizations in identifying, addressing and resolving these issues about PDCS.

Details

Journal of Documentation, vol. 79 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 6 April 2012

Dan Wu, Daqing He and Bo Luo

Abstract

Purpose

This study aims to survey academic users in order to identify their needs and expectations about multilingual information processing when they interact with digital libraries. The study specifically aims to determine the disparities in needs and expectations when users speak different languages.

Design/methodology/approach

A survey was designed to fill in the gaps in the knowledge about academic users' multilingual needs and expectations for digital libraries. The survey questionnaire incorporates questions about different aspects of the participants' multilingual needs and expectations, covering multilingual needs, multilingual behavior, often‐used multilingual information resources, and desired functions for multilingual services, retrieval and interfaces in digital libraries. The results are obtained through statistical analyses and clustering methods.

Findings

Overall, participants exhibited many multilingual needs during their academic activities. They often require multilingual information when they access academic databases or web information. Frequently, participants use online translation resources and tools, but they are not satisfied with the translation quality. Participants want many multilingual capabilities in digital libraries; they also want more sophisticated multilingual search interfaces. However, participants from different countries or who speak different languages show significant differences in their multilingual needs and expectations of digital libraries. This study's three user groups demonstrated clear differences in all aspects of multilinguality examined, as did the three latent groups identified through the clustering methods.

Originality/value

Few studies have examined the multilingual information process in digital libraries from the point of view of academic users. This study draws its inputs directly from real academic users from different countries and provides insights into multilinguality in digital libraries.

Details

The Electronic Library, vol. 30 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 6 April 2012

Tina Budzise‐Weaver, Jiangping Chen and Mikhaela Mitchell

Abstract

Purpose

This study aims to understand key features of existing multilingual digital libraries and to suggest strategies for building and/or sustaining multilingual information access for digital libraries.

Design/methodology/approach

A case study approach was applied to examine four American multilingual digital libraries: Project Gutenberg, Meeting of Frontiers, The International Children's Digital Library, and the Latin American Open Archives Portal. This examination used a framework derived from digital library evaluation practice. The missions, goals, funding, partners, users, collections, services, and technologies of these digital libraries were analyzed to present their key multilingual features. The collaboration and crowdsourcing characteristics were highlighted and discussed.

Findings

These four multilingual libraries benefit substantially, both in the creation of the library and in its access, from collaboration among domestic and international groups with different language expertise. For building the multilingual collection and services, some libraries involved both staff and users. For multilingual access to the collection, however, none of the libraries used machine translation or cross‐language information retrieval technologies.

Research limitations/implications

The four cases are all publicly available digital libraries in the USA. Their features may not be applicable to digital libraries in other countries or to commercial digital information services.

Practical implications

With the advancement of machine translation technologies and the wide application of social media, multilingual digital libraries may have even better opportunities to sustain their multilingual capabilities through crowdsourcing and the application of new technologies.

Originality/value

This study summarizes the key features of four existing multilingual digital libraries. It provides insights into important factors for building successful multilingual digital libraries. The suggested strategies may help digital library developers to design appropriate multilingual information access services.

Article
Publication date: 5 September 2008

Eija Airio

Abstract

Purpose

The aim of the current paper is to test whether query translation is beneficial in web retrieval.

Design/methodology/approach

The language pairs were Finnish‐Swedish, English‐German and Finnish‐French. A total of 12‐18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language using a dictionary‐based system. For English‐German, machine translation was also used. The author used Google as the search engine.

Findings

The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query‐translation were better than in the traditional laboratory tests.

Originality/value

This research shows that query translation on the web is beneficial, especially for users with moderate and non‐active language skills. This is valuable information for developers of cross‐language information retrieval systems.

Details

Journal of Documentation, vol. 64 no. 5
Type: Research Article
ISSN: 0022-0418
