Search results

1 – 10 of 220
Article
Publication date: 1 January 1995

F. CRESTANI and C.J. VAN RIJSBERGEN

Abstract

The evaluation of an implication by Imaging is a logical technique developed in the framework of modal logic. Its interpretation in the context of a ‘possible worlds’ semantics is very appealing for IR. In 1989, Van Rijsbergen suggested its use for solving one of the fundamental problems of logical models of IR: the evaluation of the implication d → q (where d and q are respectively a document and a query representation). Since then, others have tried to follow that suggestion, proposing models and applications, though without much success. Most of these approaches had as their basic assumption the consideration that ‘a document is a possible world’. We propose instead an approach based on a completely different assumption: ‘a term is a possible world’. This approach enables the exploitation of term‐term relationships which are estimated using an information theoretic measure.

Details

Journal of Documentation, vol. 51 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1984

ALAN GRIFFITHS, LESLEY A. ROBINSON and PETER WILLETT

Abstract

This paper considers the classifications produced by application of the single linkage, complete linkage, group average and Ward clustering methods to the Keen and Cranfield document test collections. Experiments were carried out to study the structure of the hierarchies produced by the different methods, the extent to which the methods distort the input similarity matrices during the generation of a classification, and the retrieval effectiveness obtainable in cluster based retrieval. The results would suggest that the single linkage method, which has been used extensively in previous work on document clustering, is not the most effective procedure of those tested, although it should be emphasized that the experiments have used only small document test collections.

Details

Journal of Documentation, vol. 40 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1978

D.J. HARPER and C.J. VAN RIJSBERGEN

Abstract

This paper reports experiments with a term weighting model incorporating relevance information in which it is assumed that index terms are distributed dependently. Initially this model was tested with complete relevance information against a similar model which assumes index terms are distributed independently. The experiments demonstrated conclusively that index terms are not independent for a number of diverse document collections. It was concluded that the use of relevance information together with dependence information could potentially improve retrieval effectiveness. As a result of further experiments the initial strict dependence model was modified and in particular a new relevance‐based term weight was developed. This modified dependence model was then used as the basis for relevance feedback, i.e. with partial relevance information only, and significant increases in retrieval effectiveness were achieved. The evaluation method used in the feedback experiments emphasized the effect of the feedback on documents which the potential user would not previously have seen. Finally the incorporation of relevance feedback in an operational system is considered and in particular it is argued that if high recall searches are required, relevance feedback based on the modified dependence model may be superior to the widely used Boolean search.

Details

Journal of Documentation, vol. 34 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 1977

S.E. ROBERTSON

Abstract

This paper is concerned with recent work in the theory of information retrieval. More particularly, it is concerned with theories which tackle the problem of retrieval performance, in a sense which will be explained. The aim is not an exhaustive survey of such work; rather it is an analysis and synthesis of those contributions which I feel to be important or find interesting.

Details

Journal of Documentation, vol. 33 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1979

JOHN E. BURNETT, DAVID COOPER, MICHAEL F. LYNCH, PETER WILLETT and MAUREEN WYCHERLEY

Abstract

A study has been made of the effect of controlled variations in indexing vocabulary size on retrieval performance using the Cranfield 200 and 1400 test collections. The vocabularies considered are sets of variable‐length character strings chosen from the fronts of document and query terms so as to occur with approximate equifrequency. Sets containing between 120 and 720 members were tested both using an application of the Cluster Hypothesis and in a series of linear associative retrieval experiments. The effectiveness of the smaller sets is low but the larger ones exhibit retrieval characteristics comparable to those of words.

Details

Journal of Documentation, vol. 35 no. 3
Type: Research Article
ISSN: 0022-0418


Details

Automated Information Retrieval: Theory and Methods
Type: Book
ISBN: 978-0-12266-170-9

Article
Publication date: 1 April 1974

KAREN SPARCK JONES

Abstract

This article reviews the state of the art in automatic indexing, that is, automatic techniques for analysing and characterising documents, for manipulating their descriptions in searching, and for generating the index language used for these purposes. It concentrates on the literature from 1968 to 1973. Section I defines the topic and its context. Sections II and III consider work in syntax and semantics respectively in detail. Section IV comments on ‘indirect’ indexing. Section V briefly surveys operating mechanized systems. In Section VI major experiments in automatic indexing are reviewed, and Section VII attempts an overall conclusion on the current state of automatic indexing techniques.

Details

Journal of Documentation, vol. 30 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 1976

C.J. VAN RIJSBERGEN

Abstract

Items of information that have been stored in a computer normally need to be accessed via their contents. In principle this is always possible by doing an exhaustive scan of the entire file of information, but to achieve the access efficiently we use some sort of organizing principle, a file organization or file structure, to reduce the amount of scanning. Typically the items retrieved are a response to a request which fully or partially specifies their contents. Often the file organization requires pre‐processing of the body of information so that a secondary body of information (an index or directory) may be created which in some sense reveals the contents of the file. So, ultimately file structures are time saving devices, where we pay for the time saved by extra storage. They enable us quickly to find items of information by completely or partially specifying their contents.

Details

Journal of Documentation, vol. 32 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1973

C.J. VAN RIJSBERGEN and K. SPARCK JONES

Abstract

Many retrieval experiments are intended to discover ways of improving performance, taking the results obtained with some particular technique as a baseline. The fact that substantial alterations to a system often have little or no effect on particular collections is puzzling. This may be due to the initially poor separation of relevant and non‐relevant documents. The paper presents a procedure for characterizing this separation for a collection, which can be used to show whether proposed modifications of the base system are likely to be useful.

Details

Journal of Documentation, vol. 29 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1994

SÁNDOR DOMINICH

Abstract

In existing information retrieval models there are three different ways documents are represented for retrieval purposes: vectors of weights, collections of sentences and artificial neurons. Accordingly, retrieval depends on a similarity function, or means an inference, or is a spreading of activation. Relevancy is considered to be a critical modelling parameter which is either a priori or it is not treated at all. Assuming that relevancy may equally be an emergent entity, thus not requiring any a priori modelling, the paper proposes the Interaction Information Retrieval model in which documents are interconnected, queries and documents are treated in the same way, and in which retrieval is the result of the interconnection between query and documents. Algorithms and experiences gained with practical applications are presented. A theoretical mathematical formulation of this type of retrieval is also given.

Details

Journal of Documentation, vol. 50 no. 3
Type: Research Article
ISSN: 0022-0418
