News analysis through text mining: a case study

I.C. Mogotsi (Department of Library and Information Studies, University of Botswana, Gaborone, Botswana)

VINE

ISSN: 0305-5728

Article publication date: 30 October 2007

Abstract

Purpose

This paper seeks to provide a tangible example of the use of text‐mining techniques in a real‐world setting, i.e. using real, as opposed to test, data.

Design/methodology/approach

News stories are modeled using the vector space model, with the similarity between documents quantified using the cosine measure. For data analysis, three clustering algorithms are used, and the results from the best‐performing algorithm retained.
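As an illustration of the vector space model and the cosine measure, the following is a minimal sketch, not the paper's implementation. It assumes raw term-frequency weights and whitespace tokenization, neither of which is stated in the abstract; the function names `term_vector` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a document as a sparse vector of raw term counts.

    Assumption: lowercase whitespace tokenization with raw counts; the
    abstract does not say which term weighting was used.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for empty documents
    return dot / (norm_a * norm_b)


story_1 = term_vector("rain floods city")
story_2 = term_vector("rain floods village")
similarity = cosine_similarity(story_1, story_2)
```

With non-negative term weights the measure ranges from 0 (no shared terms) to 1 (identical term distributions), so documents of different lengths can be compared directly.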

Findings

Agglomerative clustering performed poorly, while direct k‐way clustering and k‐way clustering through repeated bisections yielded similar results, with the former performing marginally better in terms of external isolation and internal cohesion of the clusters produced. A number of themes that dominated news coverage during the period under consideration were identified, some of which were noticeably topical only during certain parts of the year.
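One common way to quantify internal cohesion and external isolation is through average pairwise cosine similarity, sketched below. This is an assumption for illustration only: the abstract does not give the exact definitions used, and the helper names `internal_cohesion` and `external_isolation` are hypothetical.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def internal_cohesion(cluster):
    """Average similarity over all pairs inside a cluster (higher is tighter)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 1.0  # assumed convention for a single-document cluster
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def external_isolation(cluster, others):
    """Average similarity between members and non-members.

    Lower values mean the cluster is better separated from the rest.
    """
    if not cluster or not others:
        return 0.0
    total = sum(cosine(a, b) for a in cluster for b in others)
    return total / (len(cluster) * len(others))


weather = [{"rain": 1.0, "flood": 1.0}, {"rain": 1.0, "flood": 1.0}]
politics = [{"election": 1.0}]
cohesion = internal_cohesion(weather)
isolation = external_isolation(weather, politics)
```

Under these definitions, a good clustering solution pairs high cohesion with low similarity to documents outside each cluster, which is the sense in which two algorithms can be compared on the same collection.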

Research limitations/implications

Text mining holds much promise for businesses, particularly if integrated into a well‐orchestrated competitive intelligence function. However, more publicly accessible studies need to be undertaken if businesses are to derive maximum value from it.

Originality/value

There is a growing body of literature devoted to both data and text mining. However, much of this literature focuses on the development of new algorithms, with scant attention paid to the practical application of these techniques in business settings, possibly because of the strategic sensitivity of project findings. This study helps to fill that gap.

Citation

Mogotsi, I.C. (2007), "News analysis through text mining: a case study", VINE, Vol. 37 No. 4, pp. 516-531. https://doi.org/10.1108/03055720710838560

Publisher

Emerald Group Publishing Limited

Copyright © 2007, Emerald Group Publishing Limited