The purpose of this paper is to propose a framework for describing and evaluating how well a small set of search results, extracted from the original results, represents those original results. Such representativeness is desirable for information retrieval in enterprise information systems.
The paper proposes a combined measure, namely RFβ, to evaluate the extracted small set in terms of the notions of coverage and redundancy. Experiments were conducted on three different extraction strategies to evaluate their representativeness, i.e. coverage and redundancy.
From both intuitive and experimental perspectives, the proposed coverage measure, redundancy measure and RFβ measure were found to evaluate representativeness effectively.
The search results, e.g. in the form of documents and texts, are modeled using a vector space model and cosine similarity. Semantic models and linguistic models could be further introduced into this research to improve the proposed measures.
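As an illustration of the modeling step only, the following minimal sketch represents two texts as term-frequency vectors and compares them with cosine similarity. The whitespace tokenization and the helper names `term_vector` and `cosine_similarity` are assumptions made for this example and are not taken from the paper, which may weight terms differently (for instance with TF-IDF).

```python
import math
from collections import Counter


def term_vector(text):
    """Build a bag-of-words term-frequency vector (assumed tokenization:
    lowercase, split on whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty document has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


doc1 = term_vector("enterprise search")
doc2 = term_vector("enterprise retrieval")
similarity = cosine_similarity(doc1, doc2)
```

Here the two texts share one of their two terms, so the similarity lies strictly between 0 (no shared terms) and 1 (identical term distributions).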
With the rapidly growing need for information retrieval in enterprise information systems, the representativeness of search results becomes more desirable and important for search engine users. Well‐designed representativeness measures will help them achieve satisfactory results.
The originality of the paper lies in the definition of representativeness of a small set of search results extracted from the original results. The definition focuses on two aspects, coverage rate and redundancy rate, from both intuitive and experimental perspectives.
Ma, B., Wei, Q. and Chen, G. (2011), "A combined measure for representative information retrieval in enterprise information systems", Journal of Enterprise Information Management, Vol. 24 No. 4, pp. 310-321. https://doi.org/10.1108/17410391111148567
Emerald Group Publishing Limited
Copyright © 2011, Emerald Group Publishing Limited