Search results

1 – 10 of over 2000
Article
Publication date: 21 September 2012

Dan Wu and Daqing He

Abstract

Purpose

This paper seeks to examine the further integration of machine translation technologies with cross language information access in providing web users the capabilities of accessing information beyond language barriers. Machine translation and cross language information access are related technologies, and yet they have their own unique contributions in handling information in multiple languages. This paper aims to demonstrate that there are many opportunities to further integrate machine translation with cross language information access, and the combination can greatly empower web users in their information access.

Design/methodology/approach

Using English and Chinese as the language pair for study, this paper looks at machine translation in query translation‐based cross language information access from multiple important aspects, which include query translation, relevance feedback, interactive cross language information access, out‐of‐vocabulary term translation, and data fusion. The goal is to obtain more insight into the wide range of uses of machine translation in cross language information access, and to help the community identify promising future directions for both machine translation and cross language access.

Findings

Machine translation can be applied effectively in many places in the whole cross language information access process. Queries translated by a machine translation system are of high quality and are more robust in handling potential untranslated terms. Translation enhancement, a relevance feedback method using machine translation generated returned documents, is not only a valid technique by itself, but also helps to generate more robust cross language information access performance when combined with other relevance feedback techniques. Machine translation is also found to play a significant role in resolving untranslated terms and in data fusion.

Originality/value

This set of comparative empirical studies on integrating machine translation and cross language information access was performed on a common evaluation framework, and examined integration at multiple points of the cross language access process. The experimental results demonstrate the value of further integrating machine translation in cross language information access, and identify interesting future directions for both machine translation and cross language information access research.

Article
Publication date: 6 April 2012

Daniela Petrelli and Paul Clough

Abstract

Purpose

This paper aims to describe a study of the queries generated from a user experiment for cross‐language information retrieval (CLIR) from a historic image archive.

Design/methodology/approach

A controlled lab‐based user study was carried out using a prototype Italian‐English image retrieval system. Participants were asked to carry out searches for 16 images provided to them, a known‐item search task. Italian‐speaking users generated 618 queries for this set of tasks. Users' interactions with the system were recorded, and the queries were analysed manually, both quantitatively and qualitatively. The results were used to suggest recommendations for the future development of cross‐language retrieval systems for digital image libraries.

Findings

Results highlight the diversity in requests for similar visual content and the weaknesses of machine translation for query translation. Through the manual translation of queries the authors show the benefits of using high‐quality translation resources. The results show the individual characteristics of users while performing known‐item searches and the overlap obtained between query terms and structured image captions, highlighting the use of users' search terms for objects within the foreground of an image.

Research limitations/implications

This research looks in depth into one case of interaction and one image repository. Despite this limitation, the discussed results are likely to be valid across other languages and image repositories.

Practical implications

Developing effective systems requires studying users' search behaviours, particularly in digital image libraries.

Originality/value

The growing quantity of digital visual material in digital libraries offers the potential to apply techniques from CLIR to provide cross‐language information access services. The value of this paper is in the provision of empirical evidence to support recommendations for effective cross‐language image retrieval system design.

Article
Publication date: 31 August 2012

Dan Wu, Daqing He and Xiaomei Xu

Abstract

Purpose

With the vast amount of multilingual information available online, it becomes increasingly critical for libraries to use various multilingual information access techniques in order to effectively support patrons' online information requests. However, this is still a relatively under‐explored area. This paper aims to study the effectiveness and the adoptability of query expansion and translation enhancement in the context of interactive multilingual information access.

Design/methodology/approach

Relying on an interactive multilingual information access system called ICE‐TEA, the authors conducted a controlled experiment (English‐to‐Chinese translation) involving human subjects to assess the retrieval effectiveness, analyzed the collected search logs to examine users' behavior, and employed pre‐ and post‐questionnaires to obtain users' opinions about the system.

Findings

The results confirm that significant improvement in retrieval effectiveness can be achieved by combining query expansion with translation enhancement (as compared to a case when there is no relevance feedback). However, users' ability to understand, interact with and even perceive the complex process of searches involving the combination of query expansion and translation enhancement may greatly impact the effectiveness of the techniques. The results also confirm that human‐generated queries were short queries, which calls for careful consideration of how longer queries perform in real search because many search engines rely on longer and more complex queries.

Originality/value

This study examines two important relevance feedback techniques in the context of human‐involved multilingual information access. This study is a valuable addition to the information seeking behaviour literature.

Article
Publication date: 20 August 2018

Christian Olalla-Soler

Abstract

Purpose

The purpose of this paper is to investigate the use of electronic information resources to solve cultural translation problems at different stages of acquisition of the translator’s cultural competence.

Design/methodology/approach

A process and product-oriented, cross-sectional, quasi-experimental study was conducted with 38 students with German as a second foreign language from the four years of the Bachelor’s degree in Translation and Interpreting at Universitat Autònoma de Barcelona, and ten professional translators.

Findings

Translation students use a wider variety of resources, perform more queries and spend more time on queries than translators when solving cultural translation problems. The students’ information-seeking process is generally less efficient than that of the translators. Training has little impact on the students’ use of electronic information resources for this specific purpose, since all students use them similarly regardless of the year they are in.

Research limitations/implications

The study has been conducted with a small sample and only one language pair from a single pedagogical context. The tendencies observed cannot be generalised to the whole population of translation students.

Practical implications

This paper has implications for translator training, as it encourages the development of efficient information-seeking processes for the resolution of cultural translation problems.

Originality/value

Unlike other studies, this paper focusses on a specific translation problem type. It provides information related to the students’ information-seeking strategies for the resolution of cultural translation problems, which can be useful for translation training.

Article
Publication date: 5 September 2008

Eija Airio

Abstract

Purpose

The aim of the current paper is to test whether query translation is beneficial in web retrieval.

Design/methodology/approach

The language pairs were Finnish‐Swedish, English‐German and Finnish‐French. A total of 12‐18 participants were recruited for each language pair. Each participant performed four retrieval tasks. The author's aim was to compare the performance of the translated queries with that of the target language queries. Thus, the author asked participants to formulate a source language query and a target language query for each task. The source language queries were translated into the target language utilizing a dictionary‐based system. For English‐German, machine translation was also utilized. The author used Google as the search engine.

Findings

The results differed depending on the language pair. The author concluded that the dictionary coverage had an effect on the results. On average, the results of query translation were better than in the traditional laboratory tests.

Originality/value

This research shows that query translation on the web is beneficial, especially for users with moderate and non‐active language skills. This is valuable information for developers of cross‐language information retrieval systems.

Details

Journal of Documentation, vol. 64 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 May 2006

Tuomas Talvensaari, Jorma Laurikkala, Kalervo Järvelin and Martti Juhola

Abstract

Purpose

To present a method for creating a comparable document collection from two document collections in different languages.

Design/methodology/approach

The best query keys were extracted from a Finnish source collection (articles of the newspaper Aamulehti) with the relative average term frequency formula. The keys were translated into English with a dictionary‐based query translation program. The resulting lists of words were used as queries that were run against the target collection (Los Angeles Times articles) with the nearest neighbor method. The documents were aligned with unrestricted and date‐restricted alignment schemes, which were also combined.
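The alignment step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the key-extraction formula is omitted, cosine similarity over term counts stands in for the unspecified nearest neighbor method, and the function names, the toy dictionary and the `max_days` parameter are all hypothetical.

```python
import math
from collections import Counter
from datetime import date


def translate_keys(keys, dictionary):
    """Replace each source-language key with all of its dictionary translations."""
    return [t for k in keys for t in dictionary.get(k, [])]


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def align(source_keys, source_date, targets, dictionary, max_days=None):
    """Return the id of the most similar target document, or None.

    targets maps a document id to (publication date, token list).
    With max_days set, only targets published within that many days of the
    source document are considered (an assumed form of date restriction).
    """
    query = translate_keys(source_keys, dictionary)
    best_id, best_score = None, 0.0
    for doc_id, (doc_date, tokens) in targets.items():
        if max_days is not None and abs((doc_date - source_date).days) > max_days:
            continue
        score = cosine(query, tokens)
        if score > best_score:
            best_id, best_score = doc_id, score
    return best_id
```

Calling `align` with `max_days=None` corresponds to an unrestricted alignment, while a small `max_days` value restricts candidates to documents published close in time to the source document.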

Findings

The combined alignment scheme was found to be the best when the relatedness of the document pairs was assessed with a five‐degree relevance scale. Of the 400 document pairs, roughly 40 percent were highly or fairly related and 75 percent included at least lexical similarity.

Research limitations/implications

The number of alignment pairs was small due to the short common time period of the two collections, and their geographical (and thus, topical) remoteness. In future, our aim is to build larger comparable corpora in various languages and use them as a source of translation knowledge for the purposes of cross‐language information retrieval (CLIR).

Practical implications

Readily available parallel corpora are scarce. With this method, two unrelated document collections can relatively easily be aligned to create a CLIR resource.

Originality/value

The method can be applied to weakly linked collections and morphologically complex languages, such as Finnish.

Details

Journal of Documentation, vol. 62 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 14 August 2007

Jin Zhang and Suyu Lin

Abstract

Purpose

This paper aims to investigate the multiple language support features in internet search engines. The diversity of the internet is reflected not only in its users, information formats and information content, but also in the languages used. As more and more information becomes available in different languages, multiple language support in a search engine becomes more important.

Design/methodology/approach

The first step of this study is to conduct a survey about existing search engines and to identify search engines with multiple language support features. The second step is to analyse, compare, and characterise the multiple language support features in the selected search engines against the proposed five basic evaluation criteria after they are classified into three categories. Finally, the strengths and weaknesses of the multiple language support features in the selected search engines are discussed in detail.

Findings

The findings reveal that Google, EZ2Find, and Onlinelink respectively are the search engines with the best multiple language support features in their categories. Although many search engines are equipped with multiple language support features, an indispensable translation feature is implemented in only a few search engines. Multiple language support features in search engines remain at the lexical level.

Originality/value

The findings of the study will facilitate understanding of the current status of multiple language support in search engines, help users to effectively utilise multiple language support features in a search engine, and provide useful advice and suggestions for search engine researchers, designers and developers.

Details

Online Information Review, vol. 31 no. 4
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 27 April 2010

María‐Dolores Olvera‐Lobo and Lola García‐Santiago

Abstract

Purpose

This study aims to focus on the evaluation of systems for the automatic translation of questions destined to translingual question‐answer (QA) systems. The efficacy of online translators when performing as tools in QA systems is analysed using a collection of documents in the Spanish language.

Design/methodology/approach

Automatic translation is evaluated in terms of the functionality of actual translations produced by three online translators (Google Translator, Promt Translator, and Worldlingo) by means of objective and subjective evaluation measures, and the typology of errors produced was identified. For this purpose, a comparative study of the quality of the translation of factual questions of the CLEF collection of queries was carried out, from German and French to Spanish.

Findings

It was observed that the rates of error for the three systems evaluated here are greater in the translations pertaining to the language pair German‐Spanish. Promt was identified as the most reliable translator of the three (on average) for the two linguistic combinations evaluated. However, for the Spanish‐German pair, a good assessment of the Google online translator was obtained as well. Most errors (46.38 percent) tended to be of a lexical nature, followed by those due to a poor translation of the interrogative particle of the query (31.16 percent).

Originality/value

The evaluation methodology applied focuses above all on the finality of the translation. That is, does the resulting question serve as effective input into a translingual QA system? Thus, instead of searching for “perfection”, the functionality of the question and its capacity to lead one to an adequate response are appraised. The results obtained contribute to the development of improved translingual QA systems.

Details

Journal of Documentation, vol. 66 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 18 July 2016

Dong Zhou, Séamus Lawless, Xuan Wu, Wenyu Zhao and Jianxun Liu

Abstract

Purpose

With an increase in the amount of multilingual content on the World Wide Web, users are often striving to access information provided in a language of which they are non-native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and investigate their use in personalized cross-language information retrieval (CLIR) systems through the means of personalized query expansion.

Design/methodology/approach

The user profiles consist of weighted terms computed by using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. This paper also proposes an automatic evaluation method for comparing various user profile generation techniques and query expansion methods.
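As an illustration of a frequency-based profile of weighted terms, the following minimal sketch builds a tf-idf profile from a user's documents and uses it to expand a query. It is a sketch under stated assumptions, not the paper's method: the smoothed idf variant, the function names `tfidf_profile` and `expand_query`, and the parameters `top_k` and `n_extra` are all hypothetical, and BM25 and the latent semantic models are not covered.

```python
import math
from collections import Counter


def tfidf_profile(user_docs, background_docs, top_k=5):
    """Rank the terms in a user's documents by tf * idf.

    user_docs and background_docs are lists of token lists. Document
    frequencies come from background_docs; the idf is smoothed so that
    terms unseen in the background collection still get a finite weight.
    """
    n_docs = len(background_docs)
    df = Counter()
    for doc in background_docs:
        df.update(set(doc))  # count each term once per document
    tf = Counter()
    for doc in user_docs:
        tf.update(doc)
    weights = {
        term: count * math.log((n_docs + 1) / (df[term] + 1))
        for term, count in tf.items()
    }
    # Highest weight first; ties broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


def expand_query(query_terms, profile, n_extra=2):
    """Append the top profile terms that are not already in the query."""
    extra = [t for t, _ in profile if t not in query_terms][:n_extra]
    return list(query_terms) + extra
```

In a cross-language setting the expanded query would still need to be translated, or the profile built from target-language documents; that step is outside this sketch.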

Findings

Experimental results suggest that latent semantic-weighted user profile representation techniques are superior to frequency-based methods, and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained on a cross-lingual level gained better performance than the models trained on a monolingual level.

Originality/value

Previous studies on personalized information retrieval systems have primarily investigated user profiles and personalization strategies on a monolingual level. The effect of utilizing such monolingual profiles for personalized CLIR remains unclear. The current study fills the gap with a comprehensive study of user profile representation for personalized CLIR and a novel personalized CLIR evaluation methodology that ensures repeatable and controlled experiments can be conducted.

Details

Aslib Journal of Information Management, vol. 68 no. 4
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 10 August 2015

Tomasz Neugebauer and Elaine Menard

Abstract

Purpose

This paper aims to present the third stage of a research project that aims to develop a bilingual interface for the retrieval of digital images. The requirements and implementation of the search engine are described. Image search engines attempt to give access to a range of online images available on the web.

Design/methodology/approach

Open-source software components were used as much as possible because of the advantages of this approach: low initial cost and the freedom to evaluate and develop enhancements independently, driven by research objectives rather than financial viability.

Findings

Open-source software components can be used to develop the interface. The implementation of the image search engine and its indexes uses: Apache Solr, AJAX-Solr, jsTree and jQuery. Microsoft Translator web service was integrated into the interface to provide the optional user query translation.

Originality/value

The search interface is intended to be an innovative tool for image searchers who are looking for digital images. The search interface gives the image searchers the opportunity to easily access a variety of visual resources and facilitates searching for images in two different languages (English and French).

Details

OCLC Systems & Services: International digital library perspectives, vol. 31 no. 3
Type: Research Article
ISSN: 1065-075X
