Latent Dirichlet allocation-based temporal summarization

Ahmed Amir Tazibt (Department of Computer Science, Mouloud Mammeri University of Tizi Ouzou, Tizi Ouzou, Algeria)
Farida Aoughlis (Department of Computer Science, Mouloud Mammeri University of Tizi Ouzou, Tizi Ouzou, Algeria)

International Journal of Web Information Systems

ISSN: 1744-0084

Article publication date: 21 November 2018

Issue publication date: 7 March 2019

Abstract

Purpose

During crises such as accidents or disasters, an enormous volume of information is generated on the Web. Both the public and decision-makers often need to identify relevant and timely content, as soon as it appears online, to understand what is happening and make the right decisions. However, relevant content can be scattered across document streams, and the available information can contain redundant content published by different sources. Therefore, the need for automatic construction of summaries that aggregate important, non-redundant and up-to-date pieces of information is becoming critical.

Design/methodology/approach

This paper presents a new temporal summarization approach based on Latent Dirichlet Allocation, a topic model widely used in information retrieval. The approach filters documents from streams, extracts the relevant passages and then applies topic modeling to reveal their underlying aspects, from which the most relevant and novel pieces of information are selected for the summary.
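The topic modeling step can be illustrated with a minimal sketch of Latent Dirichlet Allocation fitted by collapsed Gibbs sampling, written in plain Python. This is not the authors' implementation: the function name `lda_gibbs`, the hyperparameter defaults (`alpha`, `beta`, `iters`) and the toy sentences are assumptions chosen for illustration, and the abstract does not state which inference procedure or settings the paper uses.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized docs (lists of words).

    Returns one smoothed topic distribution per document.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    vocab = {w: i for i, w in enumerate(sorted({w for d in docs for w in d}))}
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]               # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                              # n(k)
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][vocab[w]] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, k = vocab[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                # Conditional p(z = t | all other assignments), unnormalized.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions (theta).
    thetas = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        thetas.append(
            [(doc_topic[d][t] + alpha) / denom for t in range(num_topics)]
        )
    return thetas


# Toy usage: each "document" is one candidate sentence taken from a stream.
sentences = [
    "storm floods coastal town overnight".split(),
    "rescue teams evacuate residents".split(),
    "floods damage town roads".split(),
]
for theta in lda_gibbs(sentences, num_topics=2, iters=50):
    print([round(p, 3) for p in theta])
```

Each returned vector places a candidate sentence in the topic space; a downstream step could then compare these vectors to judge relevance and novelty. A fixed seed is used only so that repeated runs of the sketch give the same result.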

Findings

The proposed temporal summarization approach was evaluated within the TREC Temporal Summarization 2014 framework. The results demonstrate its effectiveness in producing short and precise summaries of events.

Originality/value

Unlike most state-of-the-art approaches, the proposed method determines the importance of the pieces of information to be added to the summaries by relying solely on their representation in the topic space provided by Latent Dirichlet Allocation, without using any external source of evidence.
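As a hedged illustration of selecting updates from topic-space representations alone, the sketch below keeps a sentence only if its topic vector is close to a reference event vector and not too close to anything already in the summary. The abstract does not give the paper's scoring rule, so the cosine measure, the reference vector `event_topics`, the two thresholds and the function name `select_updates` are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_updates(candidates, event_topics, relevance_min=0.8, redundancy_max=0.9):
    """Pick relevant, non-redundant sentences using topic vectors only.

    candidates: list of (text, topic_vector) pairs in arrival order.
    event_topics: hypothetical reference topic vector for the event.
    Both thresholds are illustrative assumptions.
    """
    summary = []
    for text, theta in candidates:
        # Relevance: the sentence must be close to the event in topic space.
        if cosine(theta, event_topics) < relevance_min:
            continue
        # Novelty: skip sentences too similar to one already selected.
        if any(cosine(theta, kept) > redundancy_max for _, kept in summary):
            continue
        summary.append((text, theta))
    return [text for text, _ in summary]


# Toy usage with hand-written topic vectors over two topics.
stream = [
    ("Flooding reported in the town centre.", [0.9, 0.1]),
    ("Town centre is flooded.", [0.88, 0.12]),
    ("Local team wins the match.", [0.1, 0.9]),
    ("Flood forces match postponement.", [0.6, 0.4]),
]
print(select_updates(stream, event_topics=[1.0, 0.0]))
```

Because sentences are processed in arrival order, an earlier sentence always wins over a later near-duplicate, which matches the streaming setting described in the abstract; no external evidence such as query expansion or knowledge bases enters the decision.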

Keywords

Acknowledgements

The authors offer their sincerest gratitude to M. Boughanem for his help, patience and precious advice. They would also like to thank the members of the IRIS team at IRIT as well as the OSIRIM staff for their hospitality and support during their stay in their laboratory. In addition, the authors thank the Algerian Ministry of Higher Education and Scientific Research for financial support for seven months under the PNE fellowship program. The experiments presented in this paper were carried out using the OSIRIM platform that is administered by IRIT and supported by CNRS, the Region Midi-Pyrénées, the French Government and ERDF (see http://osirim.irit.fr/site/en).

Citation

Tazibt, A.A. and Aoughlis, F. (2019), "Latent Dirichlet allocation-based temporal summarization", International Journal of Web Information Systems, Vol. 15 No. 1, pp. 83-102. https://doi.org/10.1108/IJWIS-04-2018-0023

Publisher

Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited
