Multi-word terms selection for information retrieval

Chedi Bechikh Ali (Institut National des Sciences Appliquées et de Technologie (INSAT), LISI, University of Carthage, Tunis, Tunisia)
Hatem Haddad (iCompass, Tunis, Tunisia)
Yahya Slimani (Institut Supérieur des Arts Multimédia (ISAMM), University of Manouba, Manouba, Tunisia)

Information Discovery and Delivery

ISSN: 2398-6247

Article publication date: 28 June 2022

Issue publication date: 6 January 2023

Abstract

Purpose

Many approaches and algorithms have been proposed over the years as a basis for automatic indexing. Many of them suffer from poor precision at low recall levels. The choice of indexing units has a great impact on the effectiveness of a search system. The authors go beyond single-term indexing and propose a framework for filtering and indexing multi-word terms (MWT).

Design/methodology/approach

In this paper, the authors rank MWT and filter them, keeping only the most effective ones for the indexing process. The proposed model filters MWT according to their ability to capture the document topic and to distinguish between documents from the same collection. The authors rely on the hypothesis that the best MWT are those with the highest degree of association. The experiments are carried out on English and French data sets.
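
The ranking-and-filtering idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the association measures used, so pointwise mutual information (PMI) over adjacent word pairs is assumed here purely for illustration, and the function names `rank_bigrams_by_pmi` and `select_top_mwt`, the `min_count` threshold and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def rank_bigrams_by_pmi(docs, min_count=2):
    """Rank adjacent word pairs by PMI (an assumed association measure)."""
    unigrams = Counter()
    bigrams = Counter()
    for tokens in docs:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs give unreliable PMI estimates, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest association first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def select_top_mwt(docs, k=2, min_count=2):
    """Keep only the k best-ranked candidates as indexing units."""
    return [pair for pair, _ in rank_bigrams_by_pmi(docs, min_count)[:k]]


# Toy collection, already tokenized (hypothetical data).
docs = [
    "information retrieval systems rank documents".split(),
    "information retrieval uses term weighting".split(),
    "the systems rank the documents".split(),
]

for pair in select_top_mwt(docs, k=2):
    print(" ".join(pair))
```

In this sketch, only the candidates that survive the filter would be passed on to the indexing step; a different association measure could be swapped in by changing the scoring line.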

Findings

The results indicate that this approach improves precision at low recall and performs better than more advanced models based on term dependencies.

Originality/value

The originality of this work lies in using and testing different association measures to select the MWT that best describe the documents, with the aim of enhancing precision among the first retrieved documents.

Acknowledgements

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

Citation

Bechikh Ali, C., Haddad, H. and Slimani, Y. (2023), "Multi-word terms selection for information retrieval", Information Discovery and Delivery, Vol. 51 No. 1, pp. 74-87. https://doi.org/10.1108/IDD-12-2021-0142

Publisher

Emerald Publishing Limited

Copyright © 2022, Emerald Publishing Limited
