A novel self‐organising clustering model for time‐event documents
Abstract
Purpose
The purpose of this paper is to examine neural document clustering techniques, e.g. the self‐organising map (SOM) or growing neural gas (GNG), which usually assume that the quantity of textual information is stationary.
Design/methodology/approach
The authors propose a novel dynamic adaptive self‐organising hybrid (DASH) model, which adapts not only its neural topological structure but also its main parameters to time‐event news collections in a non‐stationary environment. Based on the features of a time‐event news collection in such an environment, they review the main current neural clustering models. The main deficiency of these models is the need to pre‐define the thresholds for unit‐growing and unit‐pruning. The DASH model is therefore designed for a non‐stationary environment.
Findings
The paper compares DASH with SOM and GNG using an artificial jumping corner data set and a real‐world Reuters news collection. According to the experimental results, the DASH model is more effective than SOM and GNG for time‐event document clustering.
Practical implications
A real‐world environment is dynamic. This paper provides an approach to news clustering in a non‐stationary environment.
Originality/value
Text clustering in a non‐stationary environment is a novel concept. The paper demonstrates DASH, which can deal with a real‐world data set in a non‐stationary environment.
Keywords
Citation
Hung, C. and Wermter, S. (2008), "A novel self‐organising clustering model for time‐event documents", The Electronic Library, Vol. 26 No. 2, pp. 260-272. https://doi.org/10.1108/02640470810864145
Publisher
Emerald Group Publishing Limited
Copyright © 2008, Emerald Group Publishing Limited