Exogenous approach to improve topic segmentation

Marwa Naili (RIADI Laboratory, National School of Computer Science, University of Manouba, Tunisia)
Anja Habacha Chaibi (RIADI Laboratory, National School of Computer Science, University of Manouba, Tunisia)
Henda Hajjami Ben Ghezala (RIADI Laboratory, National School of Computer Science, University of Manouba, Tunisia)

International Journal of Intelligent Computing and Cybernetics

ISSN: 1756-378X

Article publication date: 13 June 2016

Abstract

Purpose

Topic segmentation is an active research field in natural language processing, and many topic segmenters have been proposed. The current challenge for researchers is to improve these segmenters by using external resources. The purpose of this paper is therefore to integrate, study, and evaluate a new external semantic resource in topic segmentation.

Design/methodology/approach

New topic segmenters (TSS-Onto and TSB-Onto) are proposed, based on the two well-known segmenters C99 and TextTiling. The proposed segmenters integrate semantic knowledge into the segmentation process by using a domain ontology as an external resource. The effect of this resource on the quality of topic segmentation is then evaluated, alongside a comparative study with related work.

Findings

The study shows that adding semantic knowledge extracted from a domain ontology improves the quality of topic segmentation. Moreover, TSS-Onto outperforms TSB-Onto in terms of segmentation quality.

Research limitations/implications

The main limitation of this study is that the test corpus used for the evaluation is not a benchmark. However, the authors used a collection of scientific papers from well-known digital libraries (ArXiv and ACM).

Practical implications

The proposed topic segmenters can be useful in different NLP applications, such as information retrieval and text summarization.

Originality/value

The primary original contribution of this paper is the improvement of topic segmentation based on semantic knowledge extracted from an external ontological resource.

Citation

Naili, M., Habacha Chaibi, A. and Hajjami Ben Ghezala, H. (2016), "Exogenous approach to improve topic segmentation", International Journal of Intelligent Computing and Cybernetics, Vol. 9 No. 2, pp. 165-178. https://doi.org/10.1108/IJICC-01-2016-0001

Publisher

Emerald Group Publishing Limited

Copyright © 2016, Emerald Group Publishing Limited
