The role of news title for linking during preservation process in digital archives

Muzammil Khan (Department of Computer and Software Technology, University of Swat, Swat, Pakistan)
Sarwar Shah Khan (School of Information Engineering, Zhengzhou University, Zhengzhou, China)
Arshad Ahmad (Department of IT and Computer Science, Pak-Austria Fachhochschule, Haripur, Pakistan) (Institute of Applied Sciences and Technology, Haripur, Pakistan)
Arif Ur Rahman (Department of Computer Science, Bahria University, Islamabad, Pakistan)

Library Hi Tech

ISSN: 0737-8831

Article publication date: 10 November 2020

Issue publication date: 22 November 2022

Abstract

Purpose

The World Wide Web has become an essential platform for news publication and, in the past few years, one of the primary sources of information dissemination. Electronic media, i.e., television channels, magazines and newspapers, have started publishing news online. This online information is prone to disappear because of its short life span, so it must be archived for the long term and for future generations. This paper presents a content-based similarity measure, based on the headings of news articles, for linking digital news stories published in various newspapers during the preservation process, which helps to ensure future accessibility.

Design/methodology/approach

To evaluate the accuracy and assess the effectiveness of the proposed measure for linking news articles in the Digital News Story Archive (DNSA), we adopted both system-centric and user-centric (human judgment) evaluation over different datasets of news articles.

Findings

The proposed similarity measure is evaluated on datasets of different sizes, and the results are compared using both a user-centric technique, i.e., expert judgment, and system-centric techniques, i.e., the cosine similarity measure, the extended Jaccard measure and the common ratio measure for stories (CRMS). The comparison gives a broader view of the measure's performance and can help generalize it to different categories of news articles. Multiple experiments were conducted, the findings of which showed that the measure produced viable results for national and international news and the best results for linking sports news articles during preservation based on headings.
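As an illustration of the two standard system-centric baselines named above, the following is a minimal sketch of cosine similarity and the extended Jaccard (Tanimoto) coefficient computed over term-frequency vectors of two headlines. The tokenizer, the function names and the sample headlines are assumptions made for this example and are not taken from the paper. CRMS is omitted because this abstract does not define it.

```python
import math
import re
from collections import Counter


def tokenize(title):
    """Lowercase a headline and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", title.lower())


def _dot(va, vb):
    """Dot product of two term-frequency vectors stored as Counters."""
    return sum(va[t] * vb[t] for t in va.keys() & vb.keys())


def cosine_similarity(title_a, title_b):
    """Cosine of the angle between the term-frequency vectors of two headlines."""
    va, vb = Counter(tokenize(title_a)), Counter(tokenize(title_b))
    norm_a = math.sqrt(_dot(va, va))
    norm_b = math.sqrt(_dot(vb, vb))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty headline is treated as similar to nothing
    return _dot(va, vb) / (norm_a * norm_b)


def extended_jaccard(title_a, title_b):
    """Extended Jaccard (Tanimoto): a.b / (|a|^2 + |b|^2 - a.b)."""
    va, vb = Counter(tokenize(title_a)), Counter(tokenize(title_b))
    dot = _dot(va, vb)
    denom = _dot(va, va) + _dot(vb, vb) - dot
    return dot / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical headlines used only to exercise the functions.
    a = "Pakistan beat India in final"
    b = "Pakistan beat India in cricket final"
    print(round(cosine_similarity(a, b), 4))
    print(round(extended_jaccard(a, b), 4))
```

Because headlines contain only a handful of terms, both scores are cheap to compute, which is consistent with the motivation of linking on few terms. In practice, stop-word removal or stemming would probably be applied before scoring, but that preprocessing is not specified here.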

Originality/value

The DNSA preserves a huge number of news articles from multiple news sources, and each article must be linked with this vast collection, which calls for an efficient linking mechanism with few terms to manipulate. The CRMS is modified to deal with the headings of news articles as part of the digital news stories preservation framework and is comprehensively analysed.

Citation

Khan, M., Khan, S.S., Ahmad, A. and Rahman, A.U. (2022), "The role of news title for linking during preservation process in digital archives", Library Hi Tech, Vol. 40 No. 5, pp. 1359-1383. https://doi.org/10.1108/LHT-07-2020-0157

Publisher

Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
