A tracking and summarization system for online Chinese news topics

Hsien-Tsung Chang (Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan)
Shu-Wei Liu (Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan)
Nilamadhab Mishra (Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan)

Aslib Journal of Information Management

ISSN: 2050-3806

Article publication date: 16 November 2015

Abstract

Purpose

The purpose of this paper is to design and implement new tracking and summarization algorithms for Chinese news content. Using the proposed methods and algorithms, the authors extract the important sentences from the stories in a topic and list them in timestamp order, so that the topic is easy to follow and multiple news stories can be viewed on a single screen.

Design/methodology/approach

This paper takes an experimental approach: it implements a new Dynamic Centroid Summarization algorithm together with a Term Frequency (TF)-Density algorithm and empirically computes three target parameters, i.e., recall, precision, and F-measure.
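As a reminder of how the three target parameters are usually defined, the following is a minimal sketch for a single tracked topic. It assumes the balanced F-measure (the harmonic mean of precision and recall) and uses hypothetical names (`precision_recall_f`, `tracked`, `relevant`) that do not come from the paper; the abstract does not state which F-measure variant the authors use.

```python
def precision_recall_f(tracked, relevant):
    """Return (precision, recall, F-measure) for one topic.

    tracked  -- story IDs the tracker assigned to the topic (hypothetical input)
    relevant -- story IDs that truly belong to the topic (hypothetical input)
    """
    tracked, relevant = set(tracked), set(relevant)
    hits = len(tracked & relevant)  # stories correctly assigned to the topic

    # Guard against empty sets so the function never divides by zero.
    precision = hits / len(tracked) if tracked else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    # Balanced F-measure: harmonic mean of precision and recall (an assumption).
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For example, if a tracker returns four stories of which three belong to a topic that has five stories in total, precision is 3/4 and recall is 3/5.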

Findings

The proposed TF-Density algorithm is implemented and compared with two well-known algorithms, Term Frequency-Inverse Word Frequency (TF-IWF) and Term Frequency-Inverse Document Frequency (TF-IDF). Three test data sets are built from Chinese news web sites for the investigation, and two findings are obtained that help the authors recognize the important words in the text more precisely and efficiently. First, the authors evaluate the three topic tracking algorithms, i.e., TF-Density, TF-IDF, and TF-IWF, on recall, precision, and F-measure and find that the proposed TF-Density algorithm outperforms TF-IWF and TF-IDF on all three. Second, the authors use a blind test to assess the topic summarization results and find that the proposed Dynamic Centroid Summarization process selects topic sentences more accurately than the LexRank process.
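For readers unfamiliar with the TF-IDF baseline named above, here is a minimal sketch of one common textbook formulation (relative term frequency multiplied by the logarithm of the inverse document frequency). It is not the authors' implementation, and it does not attempt TF-Density, whose definition is not given in this abstract. The function name `tf_idf` is hypothetical, and the input is assumed to be already segmented into words, a step that Chinese text requires before any term weighting.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score every term of every document with a textbook TF-IDF weight.

    docs -- list of documents, each a list of pre-segmented tokens (assumption)
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores
```

Under this formulation a term that appears in every document receives a weight of zero, which is why TF-IDF favors words that are frequent in one story but rare across the collection.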

Research limitations/implications

The results show that the tracking and summarization algorithms for news topics can provide more precise and convenient results for users tracking the news. The analysis and implications are limited to Chinese news content from Chinese news web sites such as Apple Library, UDN, and well-known portals like Yahoo and Google.

Originality/value

The research provides an empirical analysis of Chinese news content through the proposed TF-Density and Dynamic Centroid Summarization algorithms. It focusses on improving the means of summarizing a set of news stories to appear for browsing on a single screen and carries implications for innovative word measurements in practice.

Acknowledgements

Financial support provided to Chang Gung University by the Ministry of Science and Technology, Republic of China, through Grants MOST 103-2221-E-182-053 and 104-2221-E-182-069 is gratefully acknowledged.

Citation

Chang, H.-T., Liu, S.-W. and Mishra, N. (2015), "A tracking and summarization system for online Chinese news topics", Aslib Journal of Information Management, Vol. 67 No. 6, pp. 687-699. https://doi.org/10.1108/AJIM-10-2014-0147

Publisher

Emerald Group Publishing Limited

Copyright © 2015, Emerald Group Publishing Limited