News analysis through text mining: a case study
Abstract
Purpose
This paper seeks to provide a tangible example of the use of text‐mining techniques in a real‐world setting, i.e. using real, as opposed to test, data.
Design/methodology/approach
News stories are modeled using the vector space model, with the similarity between documents quantified using the cosine measure. For data analysis, three clustering algorithms are used, and the results from the best‐performing algorithm retained.
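As an illustration of the representation described above, the following minimal sketch builds term vectors for two short texts and compares them with the cosine measure. It is not the study's implementation: the abstract does not state the term-weighting scheme or the preprocessing used, so raw term counts and whitespace tokenization are assumptions, and the function names `term_vector` and `cosine_similarity` and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a sparse vector of raw term counts.

    Assumption: lowercasing plus whitespace splitting stands in for
    whatever tokenization and weighting the study actually used.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical example sentences, not data from the study.
    story_1 = term_vector("government announces new budget")
    story_2 = term_vector("new budget criticised by opposition")
    print(cosine_similarity(story_1, story_2))
```

With non-negative term counts the measure ranges from 0 (no shared terms) to 1 (identical term distributions), and a clustering algorithm can then group stories whose pairwise similarity is high.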
Findings
Agglomerative clustering performed poorly, while direct k‐way clustering and k‐way clustering through repeated bisections yielded similar results, with the former performing marginally better in terms of the external isolation and internal cohesion of the clusters produced. A number of themes that dominated news coverage during the period under consideration were identified, some of which were topical only during certain parts of the year.
Research limitations/implications
Text mining holds much promise for businesses, particularly if integrated into a well‐orchestrated competitive intelligence function. However, more publicly accessible studies need to be undertaken if businesses are to derive maximum value from it.
Originality/value
There is a growing body of literature devoted to both data and text mining. However, much of this literature focuses on the development of new algorithms, with scant attention paid to the practical application of these techniques in business settings, possibly because of the strategic sensitivity of project findings. This study helps to fill that gap.
Citation
Mogotsi, I.C. (2007), "News analysis through text mining: a case study", VINE, Vol. 37 No. 4, pp. 516-531. https://doi.org/10.1108/03055720710838560
Publisher
Emerald Group Publishing Limited
Copyright © 2007, Emerald Group Publishing Limited