Search results

1–10 of over 21,000
Article
Publication date: 18 July 2016

Maayan Zhitomirsky-Geffet, Judit Bar-Ilan and Mark Levene

One of the under-explored aspects of user information seeking behaviour is the influence of time on relevance evaluation. It has been shown in previous studies…

Downloads
2066

Abstract

Purpose

One of the under-explored aspects of user information seeking behaviour is the influence of time on relevance evaluation. It has been shown in previous studies that individual users may change their assessment of search results over time. It is also known that the aggregated judgements of multiple individual users can lead to correct and reliable decisions; this phenomenon is known as the “wisdom of crowds”. The purpose of this paper is to examine whether aggregated judgements are more stable, and thus more reliable, over time than individual user judgements.

Design/methodology/approach

In this study two simple measures are proposed to calculate the aggregated judgements of search results and compare their reliability and stability to individual user judgements. In addition, the aggregated “wisdom of crowds” judgements were used as a means to compare the differences between human assessments of search results and search engine’s rankings. A large-scale user study was conducted with 87 participants who evaluated two different queries and four diverse result sets twice, with an interval of two months. Two types of judgements were considered in this study: relevance on a four-point scale, and ranking on a ten-point scale without ties.
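The two aggregation measures themselves are not spelled out in the abstract. A minimal sketch of one plausible reading — averaging individual relevance scores into a crowd judgement, then checking its stability across the two sessions with a simple Kendall rank correlation — might look like this (all scores and session data hypothetical):

```python
from itertools import combinations

def aggregate_judgements(ratings_per_user):
    """Aggregate per-user relevance ratings (one list per user,
    aligned by result position) into a mean score per result."""
    n_users = len(ratings_per_user)
    n_items = len(ratings_per_user[0])
    return [sum(user[i] for user in ratings_per_user) / n_users
            for i in range(n_items)]

def kendall_tau(a, b):
    """Simple Kendall tau-a over all index pairs; tied pairs
    count toward neither concordant nor discordant."""
    pairs = list(combinations(range(len(a)), 2))
    concordant = sum(1 for i, j in pairs
                     if (a[i] - a[j]) * (b[i] - b[j]) > 0)
    discordant = sum(1 for i, j in pairs
                     if (a[i] - a[j]) * (b[i] - b[j]) < 0)
    return (concordant - discordant) / len(pairs)

# Hypothetical 4-point relevance scores for 5 results from 3 users,
# collected twice with a two-month interval.
session1 = [[4, 3, 2, 1, 2], [3, 4, 2, 1, 1], [4, 3, 1, 2, 2]]
session2 = [[4, 2, 3, 1, 2], [3, 4, 1, 2, 1], [4, 3, 2, 1, 3]]

agg1 = aggregate_judgements(session1)
agg2 = aggregate_judgements(session2)
stability = kendall_tau(agg1, agg2)  # close to 1.0 when the crowd is stable
```

The same correlation can be computed per individual user across the two sessions, making the claimed contrast (crowd vs. individual stability) directly measurable.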

Findings

It was found that aggregated judgements are much more stable than individual user judgements, yet they are quite different from search engine rankings.

Practical implications

The proposed “wisdom of crowds”-based approach provides a reliable reference point for the evaluation of search engines. This is also important for exploring the need for personalisation and for adapting a search engine’s ranking over time to changes in users’ preferences.

Originality/value

This is the first study to apply the notion of the “wisdom of crowds” to the under-explored phenomenon of change over time in user evaluations of relevance.

Details

Aslib Journal of Information Management, vol. 68 no. 4
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 27 November 2007

David Bade

The purpose of this paper is to examine the significance of the differences between the actual technical principles determining relevance ranking, and how relevance ranking…

Downloads
1289

Abstract

Purpose

The purpose of this paper is to examine the significance of the differences between the actual technical principles determining relevance ranking, and how relevance ranking is understood, described and evaluated by the developers of relevance ranking algorithms and librarians.

Design/methodology/approach

The discussion uses descriptions by PLWeb Turbo and C2 of their relevance ranking products, and a librarian’s description on her blog together with the responses it drew, contrasting these with relevancy as indicated in studies of the ISI citation record reported by White.

Findings

The study finds that product descriptions and librarians consistently use the term “relevance ranking” to mean both the artificial relevance ranking by statistical methods using various surrogates assumed to reliably indicate relevance and the real relevance as determined by the searcher. The paper indicates the misunderstandings arising from this terminological confusion and its consequences in the context of the invalid user models and artificial searches which accompany discussions of “relevance ranking”.

Research limitations/implications

Evaluations of relevance ranking must be based on real users and real searches. Theorising relevance as a judgement about information rather than a property of information clarifies many issues.

Practical implications

The design of search engines and OPACs will benefit from incorporating metadata that contain indications of user‐determined relevance.

Originality/value

The activity of subject analysis and indexing by human beings is presented as an activity identical in kind to the real searcher's determination of relevance, a definite statement of relevancy arising from a real communication situation rather than a statistically indicated probability.

Details

Online Information Review, vol. 31 no. 6
Type: Research Article
ISSN: 1468-4527

Keywords

Article
Publication date: 20 April 2015

Roslina Othman and Ashraf Ali Salahuddin

The purposes of this study were to measure the relevance status of Index Islamicus, evaluate the semantic correlation between a query and documents, and investigate the basis of its ranking…

Abstract

Purpose

The purposes of this study were to measure the relevance status of Index Islamicus, evaluate the semantic correlation between a query and documents, and investigate the basis of its ranking. Sorting retrieved results from most relevant to least relevant is a common feature of an information retrieval system; this sorting mechanism, or relevance judgment, is computed by measuring the closeness of a query to its documents.

Design/methodology/approach

One hundred queries on Islamic History and Civilizations were formulated using two indexing elements (keyword and concept), and a laboratory experiment was conducted on the first ten ranked items returned for each query. Within an experimental research design, the relevance status value formula was used to measure the system-computed ranking and compare it with mean average precision.
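The relevance status value formula itself is not reproduced in the abstract, but the comparison baseline it names — mean average precision over a top-10 ranking — can be sketched as follows (relevance flags hypothetical):

```python
def average_precision(relevant_flags):
    """Average precision over a ranked list: the mean of precision@k
    taken at each rank k where a relevant document appears."""
    hits, precisions = 0, []
    for k, rel in enumerate(relevant_flags, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(runs):
    """MAP across queries, each given as a list of 0/1 relevance flags."""
    return sum(average_precision(r) for r in runs) / len(runs)

# Hypothetical top-10 relevance flags for three queries.
runs = [
    [1, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
map_score = mean_average_precision(runs)
```

Note that this variant averages precision over the relevant documents actually retrieved in the top 10; definitions that divide by the total number of relevant documents in the collection give lower scores when relevant documents are missed.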

Findings

The results showed that the average status value of Index Islamicus’s ranking on the relevance criterion was 18 per cent effective in retrieving precise documents. Although the study focused on only one subject domain and only 1,000 items were calculated, this small percentage indicates that the semantic correlation between queries and the subject domain did not reach a satisfactory level.

Research limitations/implications

The findings could serve as a guideline for further research on the ranking mechanisms of other search engines; the study is limited in that Index Islamicus was the only database examined.

Practical implications

Through this study, Index Islamicus stands to benefit from knowing the status of its ranking mechanism, and other databases can build on the approach to investigate their own ranking methods.

Social implications

Researchers and vendors of online databases can offer their users a trustworthy search platform with a properly ranked result list.

Originality/value

The study proposes a relevance status value model for Index Islamicus on Islamic History and Civilization that allows the system to rank documents according to the match between document and query, and suggests how a better index could be built. The model improves the system’s ranking mechanism and promotes the use of semantic relationships. This research promotes computing the relevance status value by domain, capturing subject-specific relevance criteria and semantic relationships.

Details

International Journal of Web Information Systems, vol. 11 no. 1
Type: Research Article
ISSN: 1744-0084

Keywords

Article
Publication date: 9 August 2011

Nadjla Hariri

The main purpose of this study is to evaluate the effectiveness of relevance ranking on Google by comparing the system's assessment of relevance with the users' views. The…

Downloads
3319

Abstract

Purpose

The main purpose of this study is to evaluate the effectiveness of relevance ranking on Google by comparing the system's assessment of relevance with the users' views. The research aims to find out whether the presumably objective relevance ranking of Google based on the PageRank and some other factors in fact matches users' subjective judgments of relevance.

Design/methodology/approach

This research investigated the relevance ranking of Google's retrieved results using 34 searches conducted by users in real search sessions. The results pages 1‐4 (i.e. the first 40 results) were examined by the users to identify relevant documents. Based on these data the frequency of relevant documents according to the appearance order of retrieved documents in the first four results pages was calculated. The four results pages were also compared in terms of precision.
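The page-level precision comparison described here — the fraction of user-judged relevant documents within each block of ten results — can be sketched as follows (judgments hypothetical):

```python
def precision_per_page(relevant_flags, page_size=10):
    """Split a ranked list of 0/1 relevance flags into result pages
    and return the precision (fraction relevant) of each page."""
    pages = [relevant_flags[i:i + page_size]
             for i in range(0, len(relevant_flags), page_size)]
    return [sum(page) / len(page) for page in pages]

# Hypothetical user judgments for the first 40 results of one search.
flags = [1]*6 + [0]*4 + [1]*4 + [0]*6 + [1]*3 + [0]*7 + [1]*2 + [0]*8
page_precision = precision_per_page(flags)  # one value per results page
```

Aggregating these per-page values over all 34 search sessions, and testing the differences for significance, would reproduce the kind of comparison reported in the findings below.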

Findings

In 50 per cent and 47.06 per cent of the searches, the documents ranked 5th and 1st respectively (i.e. on the first page of the retrieved results) were judged most relevant by the users. Yet even on the fourth results page there were three documents that were judged most relevant in more than 40 per cent of the searches. There were no significant differences between the precision of the four results pages, except between pages 1 and 3.

Practical implications

The results will help users of search engines, especially Google, to decide how many pages of the retrieved results to examine.

Originality/value

Search engine design will benefit from the results of this study as it experimentally evaluates the effectiveness of Google's relevance ranking.

Details

Online Information Review, vol. 35 no. 4
Type: Research Article
ISSN: 1468-4527

Keywords

Article
Publication date: 16 November 2015

Sri Devi Ravana, Prabha Rajagopal and Vimala Balakrishnan

In a system-based approach, replicating the web would require large test collections, and judging the relevancy of all documents per topic in creating relevance judgment…

Abstract

Purpose

In a system-based approach, replicating the web would require large test collections, and judging the relevance of every document per topic through human assessors to create relevance judgments is infeasible. Owing to the large number of documents requiring judgment, human assessors may also introduce errors through disagreements. The paper aims to discuss these issues.

Design/methodology/approach

This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods overcome the problem of judging large numbers of documents while avoiding human disagreement errors during the judgment process. This study utilizes two key factors to generate the alternative methods: the number of occurrences of each document per topic across all system runs; and document rankings.
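The abstract does not give the occurrence threshold used. The occurrence-count idea — marking a document as pseudo-relevant once enough system runs retrieve it for a topic — might be sketched as follows (run data and threshold hypothetical):

```python
from collections import Counter

def pseudo_qrels(runs_per_topic, min_occurrences):
    """Build pseudo relevance judgments per topic: a document is
    judged relevant when at least `min_occurrences` system runs
    retrieved it for that topic (each run is a list of doc ids)."""
    qrels = {}
    for topic, runs in runs_per_topic.items():
        counts = Counter(doc for run in runs for doc in set(run))
        qrels[topic] = {doc for doc, n in counts.items()
                        if n >= min_occurrences}
    return qrels

# Hypothetical pooled runs (pool depth truncated for brevity).
runs = {
    "topic1": [["d1", "d2", "d3"], ["d1", "d3", "d4"], ["d1", "d5", "d3"]],
}
judged = pseudo_qrels(runs, min_occurrences=2)  # {"topic1": {"d1", "d3"}}
```

Systems can then be scored with MAP against both these pseudo-qrels and the official TREC qrels, and the two resulting system orderings compared with a rank correlation coefficient, which is the evaluation the findings below describe.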

Findings

The effectiveness of the proposed method is evaluated using the correlation coefficient of ranked systems using mean average precision scores between the original Text REtrieval Conference (TREC) relevance judgments and pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative to reduce human effort and disagreement errors involved in generating TREC-like relevance judgments.

Originality/value

Simple methods proposed in this study show improvement in the correlation coefficient in generating alternate relevance judgment without human assessors while contributing to information retrieval evaluation.

Details

Aslib Journal of Information Management, vol. 67 no. 6
Type: Research Article
ISSN: 2050-3806

Keywords

Article
Publication date: 1 December 2005

Péter Jacsó

The purpose of this article is to look into relevance ranking and its importance in trying to bring some order to the deluge of results in response to a query.

Abstract

Purpose

The purpose of this article is to look into relevance ranking and its importance in trying to bring some order to the deluge of results in response to a query.

Design/methodology/approach

A large‐scale analysis of detailed web logs of various search engines was performed. Sample tests were made on five to eight versions of MEDLINE, ERIC, and PsycINFO on hosts which have comparable versions of the databases and offer relevance ranking.

Findings

It was found that, for fairness, it must be ensured that the implementations are identical, that they have the same retrospective coverage and the same MEDLINE/PubMed subsets, and that updates are (quasi-)identical.

Research limitations/implications

The tests were made in early September 2005. As databases are updated at different times, perfect synchronicity is not easy to achieve. When new records are added to a database, they may change the ranking of the test result set. Similarly, a small change in the fine-tuning of an algorithm may yield different rank-order positions for the same record the next time.

Originality/value

Brings together important research findings and suggests a topic for the next column.

Details

Online Information Review, vol. 29 no. 6
Type: Research Article
ISSN: 1468-4527

Keywords

Article
Publication date: 1 April 1999

Kendra Jones

Limitations of traditional Boolean searching are claimed to be overcome by two alternative search systems: DR‐LINK, a linguistic search system, and TARGET, a relevance…

Abstract

Limitations of traditional Boolean searching are claimed to be overcome by two alternative search systems: DR‐LINK, a linguistic search system, and TARGET, a relevance ranking system. This paper compares the system and search features of both and describes conceptual differences in system design. A series of test questions was developed to test the retrieval effectiveness of both search systems. A controlled dataset was used to measure the results. System features are compared and discussed. Relevance overlap and search capabilities are evaluated and results are presented.

Details

Online and CD-Rom Review, vol. 23 no. 2
Type: Research Article
ISSN: 1353-2642

Keywords

Article
Publication date: 18 May 2021

Shah Khalid, Shengli Wu and Fang Zhang

How to provide the most useful papers for searchers is a key issue for academic search engines. A lot of research has been carried out to address this problem. However…

Abstract

Purpose

How to provide the most useful papers for searchers is a key issue for academic search engines. A lot of research has been carried out to address this problem. However, when evaluating the effectiveness of an academic search engine, most of the previous investigations assume that the only concern of the user is the relevancy of the paper to the query. The authors believe that the usefulness of a paper is determined not only by its relevance to the query but also by other aspects including its publication age and impact in the research community. This is vital, especially when a large number of papers are relevant to the query.

Design/methodology/approach

This paper proposes a group of metrics to measure the usefulness of a ranked list of papers. When defining these metrics, three factors, including relevance, publication age and impact, are considered at the same time. To accommodate this, the authors propose a framework to rank papers by a combination of their relevance, publication age and impact scores.
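The paper's actual scoring function is not given in the abstract. One plausible sketch of combining the three factors — a weighted sum of normalised relevance, recency and citation impact, with all weights and bounds illustrative rather than taken from the paper — could be:

```python
def usefulness(relevance, age_years, citations,
               weights=(0.5, 0.2, 0.3), max_age=50, max_citations=1000):
    """Combine a paper's query relevance (0..1), publication age and
    citation impact into one score. Weights and normalisation bounds
    are illustrative assumptions, not taken from the paper."""
    w_rel, w_age, w_imp = weights
    recency = max(0.0, 1.0 - age_years / max_age)      # newer is better
    impact = min(citations, max_citations) / max_citations
    return w_rel * relevance + w_age * recency + w_imp * impact

# Hypothetical candidate papers for one query.
papers = [
    {"id": "p1", "relevance": 0.9, "age_years": 20, "citations": 50},
    {"id": "p2", "relevance": 0.8, "age_years": 2,  "citations": 400},
]
ranked = sorted(papers,
                key=lambda p: usefulness(p["relevance"], p["age_years"],
                                         p["citations"]),
                reverse=True)
```

Under these illustrative weights, the slightly less relevant but newer and more cited paper outranks the purely most relevant one, which is exactly the trade-off the findings below report.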

Findings

The framework is evaluated with the ACL (Association for Computational Linguistics) Anthology Network dataset. It demonstrates that the proposed ranking algorithm is effective in improving usefulness when two or three aspects of academic papers are considered at the same time, while the relevance of the retrieved papers is slightly lower than with relevance-only retrieval.

Originality/value

To the best of the authors’ knowledge, the proposed multi-objective academic search framework is the first of its kind that is proposed and evaluated with a group of new evaluation metrics.

Details

Data Technologies and Applications, vol. 55 no. 5
Type: Research Article
ISSN: 2514-9288

Keywords

Article
Publication date: 11 October 2018

Prabha Rajagopal, Sri Devi Ravana, Yun Sing Koh and Vimala Balakrishnan

Effort, in addition to relevance, is a major factor in the satisfaction and utility of a document to the actual user. The purpose of this paper is to propose a method in…

Downloads
2757

Abstract

Purpose

Effort, in addition to relevance, is a major factor in the satisfaction and utility of a document to the actual user. The purpose of this paper is to propose a method for generating relevance judgments that incorporate effort without involving human judges. The study then determines the variation in system rankings caused by low-effort relevance judgments when evaluating retrieval systems at different depths of evaluation.

Design/methodology/approach

Effort-based relevance judgments are generated using a proposed boxplot approach applied to simple document features, HTML features and readability features. The boxplot approach is a simple yet repeatable way of classifying documents’ effort while ensuring outlier scores do not skew the grading of the entire set of documents.
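The exact grading rule is not given in the abstract. A plausible sketch of the boxplot idea — grading effort scores by quartile, so that extreme outliers cannot stretch the grade boundaries — might be:

```python
def quartiles(values):
    """First and third quartiles by the simple median-split method."""
    s = sorted(values)
    mid = len(s) // 2
    lower, upper = s[:mid], s[mid + (len(s) % 2):]
    def median(xs):
        n = len(xs)
        return xs[n // 2] if n % 2 else (xs[n // 2 - 1] + xs[n // 2]) / 2
    return median(lower), median(upper)

def grade_effort(scores):
    """Grade documents' effort scores as low/medium/high by quartile.
    Because quartiles are rank-based, one extreme outlier does not
    shift the boundaries for the rest of the documents."""
    q1, q3 = quartiles(scores)
    return ["low" if s <= q1 else "high" if s >= q3 else "medium"
            for s in scores]

# Hypothetical readability-based effort scores, with one outlier (99).
grades = grade_effort([2, 3, 4, 5, 6, 7, 8, 99])
```

A min-max or mean-based grading would have let the outlier score of 99 compress every other document into the lowest grade; the quartile boundaries leave the grading of the non-outlier documents unchanged.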

Findings

The evaluation of retrieval systems using low-effort relevance judgments has a stronger influence at shallow evaluation depths than at deeper ones. It is shown that the difference in system rankings is due to low-effort documents and not to the number of relevant documents.

Originality/value

It is therefore crucial to evaluate retrieval systems at shallow depths using low-effort relevance judgments.

Details

Aslib Journal of Information Management, vol. 71 no. 1
Type: Research Article
ISSN: 2050-3806

Keywords

Book part
Publication date: 10 February 2012

Ben Carterette, Evangelos Kanoulas and Emine Yilmaz

Purpose — The overall quality of an information retrieval system depends on many different aspects of the system and its users' information seeking behaviour, such as the…

Abstract

Purpose — The overall quality of an information retrieval system depends on many different aspects of the system and its users' information seeking behaviour, such as the speed of the system, the user interface, the query language and the features provided by the engine. One of the most important aspects is the effectiveness of the retrieval system, i.e. its ability to retrieve items that are relevant to the information need of an end user. This chapter focuses on methods for measuring effectiveness, in particular focusing on recent work that more directly models the utility of an engine to its users.

Methodology/approach — We discuss traditional approaches to effectiveness evaluation based on test collections, then transition to approaches based on test collections along with explicit models of user interaction with search results. We contrast this with approaches for which the user is ‘in the loop’, such as user studies and online evaluations.
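One well-known example of a metric built on an explicit user model — cited here for illustration, not necessarily the one treated in this chapter — is rank-biased precision (Moffat and Zobel), which scores a ranking for a user who moves on to the next result with a fixed persistence probability:

```python
def rank_biased_precision(relevant_flags, persistence=0.8):
    """Rank-biased precision: expected utility for a user who, after
    inspecting result i, continues to result i+1 with probability
    `persistence`. Higher persistence models a more patient user."""
    return (1 - persistence) * sum(
        rel * persistence ** i for i, rel in enumerate(relevant_flags))

# A run with relevant documents at ranks 1 and 3 (hypothetical flags).
score = rank_biased_precision([1, 0, 1, 0, 0], persistence=0.5)
```

The persistence parameter is exactly the kind of user-behaviour assumption the chapter discusses: changing it changes which systems the metric prefers, which is why calibrating such models against real interaction data matters.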

Research limitations/implications — If it were possible to model users perfectly, we could directly estimate the utility of a search engine to its users; this would undoubtedly have a transformative effect on information retrieval and web search research. In practice, this goal will never be achievable because users exhibit far too much variability in how they approach the search engine, and furthermore provide valuable feedback that models and simulations cannot provide. Nevertheless, better models of user interaction will help develop better web search engines for a wider variety of tasks more rapidly.

Originality/value of paper — This is the first work that surveys recent work on user model-based evaluation and places it in a context with traditional evaluation based on the Cranfield paradigm.

Details

Web Search Engine Research
Type: Book
ISBN: 978-1-78052-636-2

Keywords
