Exogenous approach to improve topic segmentation
International Journal of Intelligent Computing and Cybernetics
ISSN: 1756-378X
Article publication date: 13 June 2016
Abstract
Purpose
Topic segmentation is an active research field in natural language processing, and many topic segmenters have been proposed. The current challenge for researchers is to improve these segmenters by using external resources. The purpose of this paper is therefore to integrate, study and evaluate a new external semantic resource in topic segmentation.
Design/methodology/approach
Two new topic segmenters, TSS-Onto and TSB-Onto, are proposed based on the two well-known segmenters C99 and TextTiling. The proposed segmenters integrate semantic knowledge into the segmentation process by using a domain ontology as an external resource. An evaluation then studies the effect of this resource on the quality of topic segmentation, along with a comparative study with related works.
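As a rough illustration of the general idea only (not the authors' TSS-Onto or TSB-Onto implementation), the sketch below compares adjacent sentences by cosine similarity, in the spirit of TextTiling's lexical cohesion scores, and optionally enriches each bag of words with concepts looked up in an ontology. Everything here is an assumption made for the example: the `TOY_ONTOLOGY` dictionary, the function names, the fixed threshold, and the sentence-level comparison (TextTiling itself works on token-sequence blocks and depth scores, and C99 on a similarity matrix).

```python
from collections import Counter
from math import sqrt

# Hypothetical toy "ontology": maps surface terms to a shared domain concept.
# A real domain ontology would be far richer (hierarchy, relations, synonyms).
TOY_ONTOLOGY = {
    "svm": "machine_learning",
    "classifier": "machine_learning",
    "kernel": "machine_learning",
    "router": "networking",
    "packet": "networking",
    "bandwidth": "networking",
}

# Pre-tokenized, stopword-free example sentences (assumed input format).
SENTENCES = ["svm kernel", "classifier training", "router packet", "bandwidth limit"]


def enrich(tokens, ontology):
    """Bag of words plus one extra count per ontology concept a token maps to."""
    bag = Counter(tokens)
    for tok in tokens:
        concept = ontology.get(tok)
        if concept is not None:
            bag["concept:" + concept] += 1
    return bag


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def boundaries(sentences, ontology, threshold=0.1):
    """Return indices i such that a topic boundary is placed before sentence i."""
    bags = [enrich(s.lower().split(), ontology) for s in sentences]
    sims = [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]
    return [i + 1 for i, s in enumerate(sims) if s < threshold]


if __name__ == "__main__":
    print("lexical only: ", boundaries(SENTENCES, {}))
    print("with ontology:", boundaries(SENTENCES, TOY_ONTOLOGY))
```

With purely lexical vectors, no two example sentences share a word, so every gap looks like a boundary. Once ontology concepts are added, the first two and last two sentences become similar, and only the gap between them remains.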
Findings
This study shows that adding semantic knowledge extracted from a domain ontology improves the quality of topic segmentation. Moreover, TSS-Onto outperforms TSB-Onto in terms of segmentation quality.
Research limitations/implications
The main limitation of this study is that the test corpus used for the evaluation is not a standard benchmark. Instead, the authors used a collection of scientific papers from well-known digital libraries (arXiv and ACM).
Practical implications
The proposed topic segmenters can be useful in different NLP applications such as information retrieval and text summarization.
Originality/value
The primary original contribution of this paper is the improvement of topic segmentation using semantic knowledge extracted from an external ontological resource.
Citation
Naili, M., Habacha Chaibi, A. and Hajjami Ben Ghezala, H. (2016), "Exogenous approach to improve topic segmentation", International Journal of Intelligent Computing and Cybernetics, Vol. 9 No. 2, pp. 165-178. https://doi.org/10.1108/IJICC-01-2016-0001
Publisher
Emerald Group Publishing Limited
Copyright © 2016, Emerald Group Publishing Limited