Towards text copyright detection using metadata in web applications
Program: electronic library and information systems
ISSN: 0033-0337
Article publication date: 27 September 2011
Abstract
Purpose
This paper aims to present the semantic content identifier (SCI), a permanent identifier computed through a linear-time onion-peeling algorithm that extracts semantic features from a text and integrates this information within the identifier itself.
Design/methodology/approach
The authors employ SCI to propose a mechanism that simultaneously checks the authenticity of information objects and the degree of similarity between them, and present an empirical investigation of the method. A management scenario is proposed for controlling the authentication process and detecting the degree of violation of documents.
Findings
Such a mechanism could be adopted as a component of libraries' strategy for protecting the copyright of documents published on the web.
Practical implications
The proposed numeric code can be used as a constituent part of the digital object identifier (DOI) system, making its computation more efficient and meaningful.
Originality/value
The identifier proposed in the paper can yield a more efficient index for identifying and retrieving objects in digital libraries, online repositories and commercial applications, allowing them to handle information retrieval requests more effectively.
Citation
Poulos, M., Korfiatis, N. and Bokos, G. (2011), "Towards text copyright detection using metadata in web applications", Program: electronic library and information systems, Vol. 45 No. 4, pp. 439-451. https://doi.org/10.1108/00330331111182111
Publisher
Emerald Group Publishing Limited
Copyright © 2011, Emerald Group Publishing Limited