Search results

1 – 10 of 296
Article
Publication date: 7 October 2014

Ian Ruthven

Abstract

Purpose

The purpose of this paper is to examine how various types of TREC data can be used to better understand relevance and serve as a test-bed for exploring relevance. The author proposes that many interesting studies can be performed on the TREC data collections that are not directly related to evaluating systems but to learning more about human judgements of information and relevance, and that these studies can provide useful research questions for other types of investigation.

Design/methodology/approach

Through several case studies the author shows how existing data from TREC can be used to learn more about the factors that may affect relevance judgements and interactive search decisions, and to answer new research questions for exploring relevance.

Findings

The paper uncovers factors, such as familiarity, interest and strictness of relevance criteria, that affect the nature of relevance assessments within TREC, contrasting these against findings from user studies of relevance.

Research limitations/implications

The research considers only certain uses of TREC data and assessments given by professional relevance assessors, but it motivates further exploration of the TREC data so that the research community can further exploit the effort involved in constructing TREC test collections.

Originality/value

The paper presents an original viewpoint on relevance investigations and TREC itself by motivating TREC as a source of inspiration on understanding relevance rather than purely as a source of evaluation material.

Details

Journal of Documentation, vol. 70 no. 6
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 16 November 2015

Sri Devi Ravana, Prabha Rajagopal and Vimala Balakrishnan

Abstract

Purpose

In a system-based approach, replicating the web would require large test collections, and judging the relevance of every document per topic with human assessors to create relevance judgments is infeasible. Given the large number of documents that require judgment, human assessors may also introduce errors through disagreement. The paper aims to discuss these issues.

Design/methodology/approach

This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods overcome the problem of judging large numbers of documents while avoiding human disagreement errors during the judgment process. The study utilizes two key factors to generate the alternate methods: the number of occurrences of each document per topic across all system runs, and document rankings.
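
A minimal sketch of how such occurrence-based pseudo judgments might be derived for a single topic, assuming each run is a ranked list of document identifiers (the function and parameter names, the pool depth and the majority-vote threshold are illustrative assumptions, not the authors' exact method):

from collections import Counter

def pseudo_qrels(runs, pool_depth=100, vote_threshold=None):
    # Count how many systems retrieved each document in their top-k pool.
    votes = Counter()
    for run in runs:                  # run = ranked list of doc ids
        for doc in run[:pool_depth]:
            votes[doc] += 1
    # Treat a document as relevant if enough systems retrieved it;
    # a simple majority is used here purely for illustration.
    if vote_threshold is None:
        vote_threshold = len(runs) // 2
    return {doc: int(n >= vote_threshold) for doc, n in votes.items()}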

Findings

The effectiveness of the proposed method is evaluated using the correlation between system rankings produced by mean average precision scores under the original Text REtrieval Conference (TREC) relevance judgments and under the pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative that reduces the human effort and disagreement errors involved in generating TREC-like relevance judgments.
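
The agreement between the two system rankings can be quantified with a rank correlation such as Kendall's tau, a common choice in this literature (the abstract does not name the specific coefficient, so that choice is an assumption). A self-contained sketch, given each system's MAP score under both judgment sets:

from itertools import combinations

def kendall_tau(scores_a, scores_b):
    # Compare every pair of systems: concordant if both score lists
    # order the pair the same way, discordant otherwise (ties are
    # ignored for brevity).
    n = len(scores_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)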

Originality/value

The simple methods proposed in this study improve the correlation coefficient while generating alternate relevance judgments without human assessors, contributing to information retrieval evaluation.

Details

Aslib Journal of Information Management, vol. 67 no. 6
Type: Research Article
ISSN: 2050-3806

Article
Publication date: 1 March 1997

S.E. Robertson, S. Walker and M. Beaulieu

Abstract

A brief review of the history of laboratory testing of information retrieval systems focuses on the idea of a general‐purpose test collection of documents, queries and relevance judgements. The TREC programme is introduced in this context, and an overview is given of the methods used in TREC. The Okapi team’s participation in TREC is then discussed. The team has made use of TREC to improve some of the automatic techniques used in Okapi, specifically the term weighting function and the algorithms for term selection for query expansion. The consequence of this process has been a very good showing for Okapi in terms of the TREC evaluation results. Some of the issues around the much more difficult problem of interactive evaluation in TREC are then discussed. Although some interesting interactive experiments have been performed at TREC, the problems of reconciling the requirements of the laboratory context with the concerns of interactive retrieval are still largely unresolved.
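
The term weighting function the Okapi team refined through TREC became the widely used BM25 scheme. A minimal sketch of the per-term weight (k1 and b are the usual tuning constants; variable names here are illustrative):

import math

def bm25_weight(tf, df, N, doc_len, avg_doc_len, k1=1.2, b=0.75):
    # Inverse document frequency component: rarer terms weigh more.
    idf = math.log((N - df + 0.5) / (df + 0.5))
    # Saturating term-frequency component, with document-length
    # normalisation controlled by b.
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_doc_len))
    return idf * norm

A document's score for a query is then the sum of these weights over the query terms it contains.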

Details

Journal of Documentation, vol. 53 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 29 November 2011

Na Dai and Brian D. Davison

Abstract

Purpose

This work aims to investigate the sensitivity of ranking performance with respect to the topic distribution of queries selected for ranking evaluation.

Design/methodology/approach

The authors reweight queries used in two TREC tasks to make them match three real background topic distributions, and show that the performance rankings of retrieval systems are quite different.
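
One plausible formalisation of this reweighting (the abstract gives no formula, so the names and details below are assumptions): score each system by a topic-weighted mean of its per-query average precision, with each topic's probability taken from the background distribution and shared evenly among that topic's queries.

from collections import Counter

def weighted_map(ap_by_query, topic_of, topic_dist):
    # ap_by_query: {query: average precision}
    # topic_of:    {query: topic label}
    # topic_dist:  {topic: probability in the background distribution}
    # Assumes topic_dist covers the topics present and sums to 1.
    n_per_topic = Counter(topic_of[q] for q in ap_by_query)
    return sum(ap * topic_dist.get(topic_of[q], 0.0) / n_per_topic[topic_of[q]]
               for q, ap in ap_by_query.items())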

Findings

It is found that search engines tend to perform similarly on queries about the same topic, and that search engine performance is sensitive to the topic distribution of the queries used in evaluation.

Originality/value

Using experiments with multiple real‐world query logs, the paper demonstrates weaknesses in the current evaluation model of retrieval systems.

Article
Publication date: 1 June 2000

Stephen Robertson and Stephen Walker

Abstract

A major problem in using current best‐match methods in a filtering task is that of setting appropriate thresholds, which are required in order to force a binary decision on notifying a user of a document. We discuss methods for setting such thresholds and adapting them as a result of feedback information on the performance of the profile. These methods fit within the probabilistic approach to retrieval, and are applied to a probabilistic system. Some experiments, within the framework of the TREC‐7 adaptive filtering track, are described.
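
A toy illustration of threshold adaptation in a filtering loop (this heuristic is only a sketch of the idea that feedback on delivered documents moves the notification threshold; it is not the paper's probabilistic method):

def update_threshold(threshold, was_relevant, step=0.02):
    # Lower the threshold after a relevant delivery (be more generous),
    # raise it after a non-relevant one (be stricter).
    return threshold - step if was_relevant else threshold + step

threshold = 3.0                        # initial score threshold (arbitrary)
stream = [(3.4, True), (3.1, False), (2.9, None), (3.3, True)]
for score, relevant in stream:         # (match score, user feedback)
    if score >= threshold:             # deliver only above-threshold docs
        threshold = update_threshold(threshold, relevant)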

Details

Journal of Documentation, vol. 56 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 31 July 2007

Ian Ruthven, Mark Baillie and David Elsweiler

Abstract

Purpose

The purpose of this paper is to examine how different aspects of an assessor's context, in particular their knowledge of a search topic, their interest in the search topic and their confidence in assessing relevance for a topic, affect the relevance judgements made and the assessor's ability to predict which documents they will assess as being relevant.

Design/methodology/approach

The study was conducted as part of the Text REtrieval Conference (TREC) HARD track. Using a specially constructed questionnaire, information was sought on TREC assessors' personal context, and the responses to the questionnaire questions were correlated with the final relevance decisions in the gathered TREC assessments.
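
A sketch of the kind of correlation such an analysis might compute, for example between assessors' self-reported topic knowledge and the number of documents each marked relevant (the data and the choice of Pearson's coefficient are hypothetical; the abstract does not specify the measure used):

def pearson(xs, ys):
    # Plain Pearson product-moment correlation, no external libraries.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

knowledge = [1, 3, 4, 2, 5]            # hypothetical questionnaire scale
n_relevant = [12, 30, 41, 18, 55]      # hypothetical counts per assessor
print(pearson(knowledge, n_relevant))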

Findings

This study found that each of the three factors (interest, knowledge and confidence) had an effect on how many documents were assessed as relevant and on the balance between how many documents were marked as marginally or highly relevant. These factors are also shown to affect an assessor's ability to predict what information they will finally mark as being relevant.

Research limitations/implications

The major limitation is that the research is conducted within the TREC initiative. This means that we can report on results but cannot report on discussions with the assessors. The research implications are numerous but mainly concern the effect of personal context on the outcomes of a user study.

Practical implications

One major consequence is that we should take more account of how we construct search tasks for interactive information retrieval (IIR) evaluation, creating tasks that are interesting and relevant to experimental subjects.

Originality/value

The paper examines different search variables within one study to compare the relative effects of these variables on search outcomes.

Details

Journal of Documentation, vol. 63 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 8 March 2011

Heting Chu

Abstract

Purpose

This study intends to identify factors that affect relevance judgment of retrieved information as part of the 2007 TREC Legal track interactive task.

Design/methodology/approach

Data were gathered and analyzed from the participants of the 2007 TREC Legal track interactive task using a questionnaire which included not only a list of 80 relevance factors identified in prior research, but also space for participants to express their thoughts on relevance judgment during the process.

Findings

This study finds that topicality remains a primary criterion, out of various options, for determining relevance, while specificity of the search request, task, or retrieved results also helps greatly in relevance judgment.

Research limitations/implications

Relevance research should focus on the topicality and specificity of what is being evaluated, and should be conducted in real environments.

Practical implications

If multiple relevance factors are presented to assessors, the total number in a list should be below ten to take account of the limited processing capacity of human beings' short‐term memory. Otherwise, the assessors might either completely ignore or inadequately consider some of the relevance factors when making judgment decisions.

Originality/value

This study presents a method for reducing the artificiality of relevance research design, an apparent limitation in many related studies. Specifically, relevance judgment was made in this research as part of the 2007 TREC Legal track interactive task rather than a study devised for the sake of it. The assessors also served as searchers so that their searching experience would facilitate their subsequent relevance judgments.

Details

Journal of Documentation, vol. 67 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 2 October 2007

Gobinda Chowdhury

Details

Online Information Review, vol. 31 no. 5
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 27 July 2010

A. MacFarlane, A. Secker, P. May and J. Timmis

Abstract

Purpose

The term selection problem for selecting query terms in information filtering and routing has been investigated using hill-climbers of various kinds, largely through the Okapi experiments in the TREC series of conferences. Although these are simple deterministic approaches, which examine the effect of changing the weight of one term at a time, they have been shown to improve the retrieval effectiveness of filtering queries in these TREC experiments. Hill-climbers are, however, likely to get trapped in local optima, and the use of more sophisticated local search techniques that attempt to break out of these optima is worth investigating. To this end, this paper aims to apply a genetic algorithm (GA) to the same problem.

Design/methodology/approach

A standard TREC test collection is used from the TREC-8 filtering track, recording mean average precision and recall measures to allow comparison between the hill-climber and GAs. The study also varies elements of the GA, such as the probability of a word being included, the probability of mutation, and the population size, in order to measure the effect of these variables. Different strategies such as elitist and non-elitist methods are used, as well as roulette wheel and rank selection GAs.
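
A minimal sketch of such a GA for term selection, using rank selection, one-point crossover and bit-flip mutation over term-inclusion masks (all parameter names and defaults are illustrative, not the authors' settings, and evaluate stands in for running the query against the test collection):

import random

def ga_term_selection(terms, evaluate, pop_size=50, generations=100,
                      p_include=0.5, p_mutate=0.01, elitist=True):
    # A candidate solution is a boolean mask saying which terms to keep.
    def fitness(mask):
        query = [t for t, keep in zip(terms, mask) if keep]
        return evaluate(query)                 # e.g. mean average precision
    pop = [[random.random() < p_include for _ in terms]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness)      # worst first, best last
        ranks = list(range(1, pop_size + 1))   # rank selection weights
        nxt = [scored[-1]] if elitist else []  # optionally keep the best
        while len(nxt) < pop_size:
            a, b = random.choices(scored, weights=ranks, k=2)
            cut = random.randrange(len(terms)) # one-point crossover
            child = a[:cut] + b[cut:]
            child = [not g if random.random() < p_mutate else g
                     for g in child]           # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)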

Findings

The results of the tests suggest that both techniques are, on average, better than the baseline, but the implemented GA does not match the overall performance of the hill-climber. The rank selection algorithm does better on average than the roulette wheel algorithm. There is no evidence in this study that varying the word inclusion probability, the mutation probability or the elitist method makes much difference to the overall results. Small population sizes do not appear to be as effective as larger ones.

Research limitations/implications

The evidence provided here suggests that being stuck in a local optimum for the term selection optimization problem does not appear to be detrimental to the overall success of the hill-climber. Term rank order appears to provide extra useful evidence, which hill-climbers can use efficiently and effectively to narrow the search space.

Originality/value

The paper represents the first attempt to compare hill‐climbers with GAs on a problem of this type.

Details

Journal of Documentation, vol. 66 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 2000

Pia Borlund

Abstract

This paper presents a set of basic components which constitutes the experimental setting intended for the evaluation of interactive information retrieval (IIR) systems, the aim of which is to facilitate evaluation of IIR systems in a way which is as close as possible to realistic IR processes. The experimental setting consists of three components: (1) the involvement of potential users as test persons; (2) the application of dynamic and individual information needs; and (3) the use of multidimensional and dynamic relevance judgements. Hidden under the information need component is the essential central sub-component, the simulated work task situation, the tool that triggers the (simulated) dynamic information needs. This paper also reports on the empirical findings of the meta-evaluation of the application of this sub-component, the purpose of which is to discover whether the application of simulated work task situations to future evaluation of IIR systems can be recommended. Investigations are carried out to determine whether any search behavioural differences exist between test persons' treatment of their own real information needs versus simulated information needs. The hypothesis is that if no difference exists one can correctly substitute real information needs with simulated information needs through the application of simulated work task situations. The empirical results of the meta-evaluation provide positive evidence for the application of simulated work task situations to the evaluation of IIR systems. The results also indicate that tailoring work task situations to the group of test persons is important in motivating them. Furthermore, the results of the evaluation show that different versions of semantic openness of the simulated situations make no difference to the test persons' search treatment.

Details

Journal of Documentation, vol. 56 no. 1
Type: Research Article
ISSN: 0022-0418
