Search results

1 – 10 of 48
Article
Publication date: 11 September 2007

Peter Willett and Stephen Robertson

Details

Journal of Documentation, vol. 63 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 1972

B.C. BROOKES

Abstract

This note was evoked by the reference by Karen Sparck Jones to a paper by Zunde and Slamecka which has recently been reprinted in Introduction to Information Science, edited by Saracevic. Zunde and Slamecka purport to show that, for optimum performance of IR systems, the frequency distribution of descriptor terms should conform with a geometric progression. This result is at variance with the widely accepted result derived from the Shannon model which shows that optimum performance of an IR system occurs when the descriptor terms are equi‐probable, i.e. when their frequency distribution is uniform. The uncertainty arising from these two different solutions to the same problem clearly led Karen Sparck Jones to have some reservations about the theoretical justification for her interesting idea of weighting search terms to give them, in effect, the equal weights that the usual Shannon result demands for optimum performance. But Sparck Jones need have no such reservations. The result obtained by Zunde and Slamecka, though plausible because it has some fortuitous semblance to the distributions of terms found in real systems, is in fact erroneous.

Details

Journal of Documentation, vol. 28 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 1974

KAREN SPARCK JONES

Abstract

This article reviews the state of the art in automatic indexing, that is, automatic techniques for analysing and characterising documents, for manipulating their descriptions in searching, and for generating the index language used for these purposes. It concentrates on the literature from 1968 to 1973. Section I defines the topic and its context. Sections II and III consider work in syntax and semantics respectively in detail. Section IV comments on ‘indirect’ indexing. Section V briefly surveys operating mechanized systems. In Section VI major experiments in automatic indexing are reviewed, and Section VII attempts an overall conclusion on the current state of automatic indexing techniques.

Details

Journal of Documentation, vol. 30 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 October 2005

Birger Hjørland and Karsten Nissen Pedersen

Abstract

Purpose

To suggest that a theory of classification for information retrieval (IR), asked for by Spärck Jones in a 1970 paper, presupposes a full implementation of a pragmatic understanding. Part of the Journal of Documentation celebration, “60 years of the best in information research”.

Design/methodology/approach

Literature‐based conceptual analysis, taking Spärck Jones as its starting‐point. Analysis involves distinctions between “positivism” and “pragmatism” and “classical” versus Kuhnian understandings of concepts.

Findings

Classification, both manual and automatic, for retrieval benefits from drawing upon a combination of qualitative and quantitative techniques, a consideration of theories of meaning, and the adding of top‐down approaches to IR in which divisions of labour, domains, traditions, genres, document architectures etc. are included as analytical elements and in which specific IR algorithms are based on the examination of specific literatures. Introduces an example illustrating the consequences of a full implementation of a pragmatist understanding when handling homonyms.

Practical implications

Outlines how to classify from a pragmatic‐philosophical point of view.

Originality/value

Provides, emphasizing a pragmatic understanding, insights of importance to classification for retrieval, both manual and automatic.

Details

Journal of Documentation, vol. 61 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 October 2005

Abstract

This article has been withdrawn as it was published elsewhere and accidentally duplicated. The original article can be seen here: 10.1108/eb026488. When citing the article, please cite: KAREN SPARCK JONES (1970), “Some thoughts on classification for retrieval”, Journal of Documentation, Vol. 26 No. 2, pp. 89-101.

Details

Journal of Documentation, vol. 61 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 January 1979

KAREN SPARCK JONES

Abstract

Previous experiments demonstrated the value of relevance weighting for search terms, but relied on substantial relevance information for the terms. The present experiments were designed to study the effects of weights based on very limited relevance information, for example supplied by one or two relevant documents. The tests simulated iterative searching, as in an on‐line system, and show that even very little relevance information can be of considerable value.

Details

Journal of Documentation, vol. 35 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 October 2005

Karen Spärck Jones

Abstract

Purpose

This short note seeks to respond to Hjørland and Pedersen's paper “A substantive theory of classification for information retrieval”, which starts from Spärck Jones's “Some thoughts on classification for retrieval”, originally published in 1970.

Design/methodology/approach

The note comments on the context in which the 1970 paper was written, and on Hjørland and Pedersen's views, emphasising the need for well‐grounded classification theory and application.

Findings

The note maintains that text‐based, a posteriori, classification, as increasingly found in applications, is likely to be more useful, in general, than a priori classification.

Originality/value

The note elaborates on points made in a well‐received earlier paper.

Details

Journal of Documentation, vol. 61 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 1975

KAREN SPARCK JONES

Abstract

It would be very helpful in retrieval experiments if good retrieval performance for a test collection was known, so that performance for particular devices could be fully evaluated. This paper presents one performance yardstick, based on optimally weighted request terms, and illustrates its application to different test collections.

Details

Journal of Documentation, vol. 31 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 November 2006

David Bawden

Details

Journal of Documentation, vol. 62 no. 6
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 March 1978

D.J. HARPER and C.J. VAN RIJSBERGEN

Abstract

This paper reports experiments with a term weighting model incorporating relevance information in which it is assumed that index terms are distributed dependently. Initially this model was tested with complete relevance information against a similar model which assumes index terms are distributed independently. The experiments demonstrated conclusively that index terms are not independent for a number of diverse document collections. It was concluded that the use of relevance information together with dependence information could potentially improve retrieval effectiveness. As a result of further experiments the initial strict dependence model was modified and in particular a new relevance‐based term weight was developed. This modified dependence model was then used as the basis for relevance feedback, i.e. with partial relevance information only, and significant increases in retrieval effectiveness were achieved. The evaluation method used in the feedback experiments emphasized the effect of the feedback on documents which the potential user would not previously have seen. Finally the incorporation of relevance feedback in an operational system is considered and in particular it is argued that if high recall searches are required, relevance feedback based on the modified dependence model may be superior to the widely used Boolean search.

Details

Journal of Documentation, vol. 34 no. 3
Type: Research Article
ISSN: 0022-0418
