STOCHASTIC MODELS FOR THE DISTRIBUTION OF INDEX TERMS

MICHAEL J. NELSON (School of Library and Information Science, University of Western Ontario, London, Ontario, Canada, N6G 1H1)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 1 March 1989

Abstract

Distributions of index terms have been used in modelling information retrieval systems and databases. Most previous models used some form of the Zipf distribution. This work uses a probability model of the occurrence of index terms to derive discrete distributions which are mixtures of Poisson and negative binomial distributions. These distributions, the generalised inverse Gaussian-Poisson and the generalised Waring, give better fits than the simpler Zipf distribution, particularly in the tails of the distribution, where the high-frequency terms are found. They have the advantage of being more explanatory and can incorporate a time parameter if necessary.
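
As a rough illustration of why mixing matters in the tails, the sketch below compares a plain Poisson distribution with the simplest Poisson mixture, a gamma-mixed Poisson, which is the negative binomial. It is not one of the models fitted in the article (the generalised inverse Gaussian-Poisson and the generalised Waring use different mixing distributions), and the parameter values `shape = 0.5` and `scale = 4.0` are assumptions chosen only for the example.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def neg_binomial_pmf(k, shape, scale):
    """P(X = k) for a Poisson whose rate is gamma(shape, scale) distributed.

    Integrating the rate out gives a negative binomial distribution
    with success probability p = 1 / (1 + scale) and mean shape * scale.
    """
    p = 1.0 / (1.0 + scale)
    log_pmf = (
        math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
        + shape * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def tail_prob(pmf, threshold, *params):
    """P(X >= threshold), computed as one minus the lower partial sum."""
    return 1.0 - sum(pmf(k, *params) for k in range(threshold))


# Illustrative (assumed) parameters: both distributions have the same mean.
shape, scale = 0.5, 4.0
mean = shape * scale

poisson_tail = tail_prob(poisson_pmf, 10, mean)
mixture_tail = tail_prob(neg_binomial_pmf, 10, shape, scale)

print(f"P(X >= 10), Poisson with mean {mean}: {poisson_tail:.2e}")
print(f"P(X >= 10), gamma-mixed Poisson:     {mixture_tail:.2e}")
```

With equal means, the mixture places far more probability on large counts than the plain Poisson does, which is the kind of behaviour needed to accommodate a small number of very frequent index terms.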

Citation

NELSON, M.J. (1989), "STOCHASTIC MODELS FOR THE DISTRIBUTION OF INDEX TERMS", Journal of Documentation, Vol. 45 No. 3, pp. 227-237. https://doi.org/10.1108/eb026845

Publisher

MCB UP Ltd

Copyright © 1989, MCB UP Limited
