Search results

1 – 10 of over 12000
Article
Publication date: 18 July 2016

Dong Zhou, Séamus Lawless, Xuan Wu, Wenyu Zhao and Jianxun Liu

Abstract

Purpose

With an increase in the amount of multilingual content on the World Wide Web, users are often striving to access information provided in a language of which they are non-native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and investigate their use in personalized cross-language information retrieval (CLIR) systems through the means of personalized query expansion.

Design/methodology/approach

The user profiles consist of weighted terms computed by using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. This paper also proposes an automatic evaluation method for comparing various user profile generation techniques and query expansion methods.
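As a rough illustration of a frequency-based profile used for query expansion, the sketch below weights the terms in a user's history with a smoothed tf-idf and appends the top-weighted terms to a query. It is not the authors' implementation: the function names, the smoothing formula, and the token-list inputs are assumptions made for this example, and the latent semantic and cross-lingual models described in the abstract are not covered.

```python
import math
from collections import Counter


def tfidf_profile(user_docs, background_docs):
    """Return {term: weight} for the terms in a user's documents.

    Both arguments are lists of token lists (hypothetical inputs).
    """
    n_docs = len(background_docs)
    # Document frequency of each term in the background collection.
    df = Counter(term for doc in background_docs for term in set(doc))
    # Term frequency aggregated over the user's whole history.
    tf = Counter(term for doc in user_docs for term in doc)
    profile = {}
    for term, freq in tf.items():
        # Smoothed idf (an assumption) so unseen terms never divide by zero.
        idf = math.log((n_docs + 1) / (df[term] + 1)) + 1.0
        profile[term] = freq * idf
    return profile


def expand_query(query_terms, profile, k=2):
    """Append the k highest-weighted profile terms not already in the query."""
    ranked = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
    extra = [term for term, _ in ranked if term not in query_terms][:k]
    return list(query_terms) + extra
```

Terms that are frequent in the user's history but rare in the background collection receive the highest weights, so they are the first to be added to the query.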

Findings

Experimental results suggest that latent semantic-weighted user profile representation techniques are superior to frequency-based methods, and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained on a cross-lingual level gained better performance than the models trained on a monolingual level.

Originality/value

Previous studies on personalized information retrieval systems have primarily investigated user profiles and personalization strategies on a monolingual level. The effect of utilizing such monolingual profiles for personalized CLIR remains unclear. The current study fills the gap by a comprehensive study of user profile representation for personalized CLIR and a novel personalized CLIR evaluation methodology to ensure repeatable and controlled experiments can be conducted.

Details

Aslib Journal of Information Management, vol. 68 no. 4
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 28 September 2010

Veronica Maidel, Peretz Shoval, Bracha Shapira and Meirav Taieb‐Maimon

Abstract

Purpose

The purpose of this paper is to describe a new ontological content‐based filtering method for ranking the relevance of items for readers of news items, and its evaluation. The method has been implemented in ePaper, a personalised electronic newspaper prototype system. The method utilises a hierarchical ontology of news; it considers common and related concepts appearing in a user's profile on the one hand, and in a news item's profile on the other hand, and measures the “hierarchical distances” between these concepts. On that basis it computes the similarity between item and user profiles and rank‐orders the news items according to their relevance to each user.
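The idea of scoring related concepts by their hierarchical distance can be sketched as follows. This is a minimal illustration, not the ePaper algorithm: the child-to-parent dictionary, the `decay` parameter, and the averaging rule are assumptions, and the paper's own similarity weights are not reproduced here. The ontology is assumed to be a tree (no cycles).

```python
def ancestors(concept, parent):
    """Return the chain [concept, parent, grandparent, ...] up to the root."""
    chain = [concept]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def match_score(a, b, parent, decay=0.5):
    """1.0 for identical concepts, decay**d if one lies d levels above the other, else 0.0."""
    up_a, up_b = ancestors(a, parent), ancestors(b, parent)
    if b in up_a:
        return decay ** up_a.index(b)
    if a in up_b:
        return decay ** up_b.index(a)
    return 0.0


def item_relevance(user_profile, item_concepts, parent, decay=0.5):
    """Average, over the item's concepts, of the best weighted match in the user profile."""
    if not item_concepts:
        return 0.0
    total = 0.0
    for concept in item_concepts:
        total += max(
            (weight * match_score(u, concept, parent, decay)
             for u, weight in user_profile.items()),
            default=0.0,
        )
    return total / len(item_concepts)
```

Sorting news items by `item_relevance` in descending order gives a ranked list for each user; a filter that considers only common concepts corresponds to setting `decay` to 0 for every non-identical pair.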

Design/methodology/approach

The paper evaluates the performance of the filtering method in an experimental setting. Each participant read news items obtained from an electronic newspaper and rated their relevance. Independently, the filtering method was applied to the same items and generated, for each participant, a list of news items ranked according to relevance.

Findings

The results of the evaluations revealed that the filtering algorithm, which takes into consideration hierarchically related concepts, yielded significantly better results than a filtering method that takes only common concepts into consideration. The paper determined the best set of values (weights) for the hierarchical similarity parameters. It also found that the quality of filtering improves as the number of items used for implicit updates of the profile increases, and that even with implicitly updated profiles, it is better to start with user‐defined profiles.

Originality/value

The proposed content‐based filtering method can be used for filtering not only news items but items from any domain, and not only with a three‐level hierarchical ontology but any‐level ontology, in any language.

Details

Online Information Review, vol. 34 no. 5
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 1 November 2006

Yuval Elovici, Bracha Shapira and Adlay Meshiach

Abstract

Purpose

The purpose of this paper is to prove the ability of PRivAte Web (PRAW) – a system for private web browsing – to withstand possible attacks.

Design/methodology/approach

Attacks on the system were simulated by manipulating system variables. A privacy measure was defined to evaluate the capability of the system to withstand the attacks, and the results were analysed.

Findings

It was shown that, even if the attack is optimised to provide the attacker's highest utility, the similarity between the user profile and the approximated profile is low and does not enable the eavesdropper to derive an accurate estimation of the user profile.
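One way to picture a similarity between a real profile and an eavesdropper's approximation is cosine similarity over term-frequency vectors. The sketch below is only an illustration under that assumption: the abstract does not state which privacy measure PRAW uses, and the function names and the transaction format (lists of terms) are hypothetical.

```python
import math
from collections import Counter


def term_profile(transactions):
    """Build a term-frequency profile from a list of transactions (token lists)."""
    return Counter(term for transaction in transactions for term in transaction)


def cosine(p, q):
    """Cosine similarity between two term-frequency profiles; 0.0 if either is empty."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0
```

In this picture, the eavesdropper's approximated profile is built from everything observed on the wire, real and fake transactions alike, so adding fake transactions on unrelated topics pulls its similarity to the real profile below 1.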

Research limitations/implications

One limitation is the “cold start” problem – in the current version, an observer might detect the first transaction, which is always a real user transaction. As a remedy for this problem, the first transaction will be randomly delayed and a random number of fake transactions played before the real one (according to Tr). Another limitation is that PRAW supports only link browsing originating in search engine interactions (since it is the most common interaction on the web). It should be extended to conceal browsing to links originating in the “Favourites” list, which users tend to browse regularly (even a few times a day) for professional or personal reasons.

Practical implications

PRAW is feasible and preserves the privacy of web browsers. It is now undergoing commercialisation to become a shelf tool for privacy preservation.

Originality/value

The paper presents a practical statistical method for privacy preservation and shows that it withstands possible attacks. Methods usually proposed for this problem are not statistical but cryptography‐oriented, and are too expensive in processing time to be practical.

Details

Online Information Review, vol. 30 no. 6
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 7 August 2009

Shihchieh Chou and Weiping Chang

Abstract

Purpose

The purpose of this paper is to identify distinguishing term characteristics from among the information of term appearance situations (tas) residing in the relevant/irrelevant documents retrieved for use. Terms with specific characteristics could be used in the distinguishing of user profiles, documents, pages or concepts to assist in information retrieval.

Design/methodology/approach

First, a method to apply the potential term characteristics in the distinguishing of user profiles in the information retrieval environment is designed. Then, an information retrieval system is developed to demonstrate the realisation and sustain the study of the method. Formal tests are conducted to examine the distinguishing capability of the potential term characteristics proposed in the method.

Findings

The results of the tests show that the potential term characteristics proposed in this study are successfully applied in the distinguishing of user profiles in the information retrieval environment.

Originality/value

Identification of distinguishing term characteristics would expand the ground for the IR community in the design of feature‐extraction algorithms or systems that try to cull information from structured or unstructured documents.

Details

Online Information Review, vol. 33 no. 4
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 30 May 2013

Yuangen Lai and Jianxun Zeng

Abstract

Purpose

The purpose of this paper is to develop a cross‐language personalized recommendation model based on web log mining, which can recommend academic articles, in different languages, to users according to their demands.

Design/methodology/approach

The proposed model takes advantage of web log data archived in digital libraries and learns user profiles by means of integration analysis of a user's multiple online behaviors. Moreover, keyword translation was carried out to eliminate language dissimilarity between user and item profiles. Finally, article recommendation can be achieved using various existing algorithms.

Findings

The proposed model can recommend articles in different languages to users according to their demands, and the integration analysis of multiple online behaviors can help to better understand a user's interests.

Practical implications

This study has significant implications for digital libraries in non‐English‐speaking countries, since English is the most popular language in current academic articles and users in these countries commonly obtain literature presented in more than one language. Furthermore, this approach is also useful for other text‐based item recommendation systems.

Originality/value

A lot of research work has been done in the personalized recommendation area, but few works have discussed the recommendation problem under multiple linguistic circumstances. This paper deals with cross‐language recommendation and, moreover, the proposed model puts forward an integration analysis method based on multiple online behaviors to understand users' interests, which can provide references for other recommendation systems in the digital age.

Article
Publication date: 7 August 2017

Qiangbing Wang, Shutian Ma and Chengzhi Zhang

Abstract

Purpose

Based on user-generated content from a Chinese social media platform, this paper aims to investigate multiple methods of constructing user profiles and their effectiveness in predicting their gender, age and geographic location.

Design/methodology/approach

This investigation collected 331,634 posts from 4,440 users of Sina Weibo. The data were divided into two parts, for training and testing. First, a vector space model and topic models were applied to construct user profiles. A classification model was then learned by a support vector machine according to the training data set. Finally, we used the classification model to predict users’ gender, age and geographic location in the testing data set.

Findings

The results revealed that in constructing user profiles, latent semantic analysis performed better on the task of predicting gender and age. By contrast, the method based on a traditional vector space model worked better in making predictions regarding the geographic location. In the process of applying a topic model to construct user profiles, the authors found that different prediction tasks should use different numbers of topics.

Originality/value

This study explores different user profile construction methods to predict Chinese social media network users’ gender, age and geographic location. The results of this paper will help to improve the quality of personal information gathered from social media platforms, and thereby improve personalized recommendation systems and personalized marketing.

Details

The Electronic Library, vol. 35 no. 4
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 10 October 2007

Alexander Smirnov, Tatiana Levashova, Mikhail Pashkin, Nikolai Shilov and Anna Komarova

Abstract

Purpose

This paper aims to present an approach to decision‐making in disaster response operations. The approach is based on ontology‐driven knowledge sharing and application of well‐developed tasks from the area of production network management, which in turn enables the use of existing problem‐solving methods and tools.

Design/methodology/approach

The approach applies the decision‐making tasks used in production network management to solving the above‐mentioned problem.

Findings

It is shown that there exist many common features and requirements for decision‐making in industrial environment and in disaster relief operations. They both require applying such technologies as ontology and context management, constraint satisfaction and profiling. Sample tasks used in the considered problem domains are presented.

Originality/value

The described research is a step forward in extension of integrating relatively well‐developed technologies implemented in production networks to the quite new areas of disaster relief and humanitarian logistics.

Details

Management Research News, vol. 30 no. 11
Type: Research Article
ISSN: 0140-9174

Article
Publication date: 19 June 2017

Wondwossen Mulualem Beyene

Abstract

Purpose

Accessibility metadata has been a recurring theme in recent efforts aimed at promoting accessibility of information and communication technology solutions to all, regardless of their disabilities, cultural differences, language, etc. The purpose of this paper is to explore the potential of accessibility metadata in improving knowledge discovery and access in digital library environments, discuss developments in creating accessibility terms for resource description, and attempt to relate those developments to the overall purpose of universal design to finally recommend points for improvement.

Design/methodology/approach

This is an exploratory study based on a review of selected literature and documentation made available by metadata projects. Related literature was searched for in Google Scholar, EBSCO, and Web of Science databases using terms and combinations of terms such as “universal design and metadata,” “accessibility metadata,” “inclusive design,” and “metadata and digital libraries.” Some documentation on metadata projects was obtained through e-mail correspondence.

Findings

The overall discussion shows that accessibility metadata can be instrumental in exposing accessible resources to search engines and in augmenting library resource discovery tools for the benefit of users with disabilities. Accessibility metadata would help users to quickly discover materials that fit their needs. However, the notion of indexing resources by their accessibility attributes remains an area that needs further exploration.

Originality/value

The paper gives emphasis to the importance of metadata research in universal design endeavors. It also provides recommendations for practical applications that would improve accessibility in digital library environments.

Article
Publication date: 25 October 2022

Victor Diogho Heuer de Carvalho and Ana Paula Cabral Seixas Costa

Abstract

Purpose

This article presents two Brazilian Portuguese corpora collected from different media concerning public security issues in a specific location. The primary motivation is supporting analyses, so security authorities can make appropriate decisions about their actions.

Design/methodology/approach

The corpora were obtained through web scraping from a newspaper's website and tweets from a Brazilian metropolitan region. Natural language processing was applied considering: text cleaning, lemmatization, summarization, part-of-speech and dependencies parsing, named entities recognition, and topic modeling.

Findings

Several results were obtained with this methodology, including: an example of automated summarization; dependency parsing; the most common topics in each corpus; and the extraction of the forty named entities and the most common slogans, highlighting those linked to public security.

Research limitations/implications

Some critical tasks were identified for the research perspective, related to the applied methodology: the treatment of noise from obtaining news on their source websites, passing through textual elements quite present in social network posts such as abbreviations, emojis/emoticons, and even writing errors; the treatment of subjectivity, to eliminate noise from irony and sarcasm; the search for authentic news of issues within the target domain. All these tasks aim to improve the process to enable interested authorities to perform accurate analyses.

Practical implications

The corpora dedicated to the public security domain enable several analyses, such as mining public opinion on security actions in a given location; understanding criminals' behaviors reported in the news or even on social networks and drawing their attitudes timeline; detecting movements that may cause damage to public property and people welfare through texts from social networks; extracting the history and repercussions of police actions, crossing news with records on social networks; among many other possibilities.

Originality/value

The work on behalf of the corpora reported in this text represents one of the first initiatives to create textual bases in Portuguese, dedicated to Brazil's specific public security domain.

Details

Library Hi Tech, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 0737-8831

Details

Automated Information Retrieval: Theory and Methods
Type: Book
ISBN: 978-0-12266-170-9
