The suggestion that classifications for retrieval should be constructed automatically raises serious problems concerning the sorts of classification that are required and the way in which formal classification theories should be exploited, given that a retrieval classification is required for a purpose. These difficulties have not been sufficiently considered, and the paper therefore attempts an analysis of them, though no solutions of immediate application can be suggested. The paper starts with the illustrative proposition that a polythetic, multiple, unordered classification is required in automatic thesaurus construction, and considers this proposition in the context of classification in general, where eight sorts of classification can be distinguished, each covering a range of class definitions and class‐finding algorithms. The problem that follows is that, since there is generally no natural or best classification of a set of objects as such, the evaluation of alternative classifications requires either formal criteria of goodness of fit or, if a classification is required for a purpose, a precise statement of that purpose. In any case a substantive theory of classification is needed, and none exists; and since sufficiently precise specifications of retrieval requirements are also lacking, the only currently available approach to automatic classification experiments for information retrieval is to do enough of them.
MCB UP Ltd
Copyright © 1970, MCB UP Limited