Producing a bibliographic database through scanning and OCR: the Online Contents Project in the Royal Library of the Netherlands
Program: electronic library and information systems
ISSN: 0033-0337
Article publication date: 1 April 1992
Abstract
The Koninklijke Bibliotheek (KB, the national library of the Netherlands), in a joint project with the Library of the Catholic University of Brabant in Tilburg (KUB) and Pica (Dutch Centre for Library Automation), is creating a database of the tables of contents of current periodicals. The information is scanned from the table of contents of the original periodical issue and converted by an OCR program into an ASCII file. This file is manually corrected and marked with tags. Subsequently an input file is created for the conversion step, which strips off all noise and creates a structured record. This record is the input for the database that will be presented to library users as part of the OPAC. Two sub-projects investigate the possibilities for creating a distributed database for nationwide use and compare scanning and OCR techniques with manual input. If the outcome of the project is satisfactory, the Online Contents database will be made operational for the KB on a regular basis. There are also plans for a national database, with nationwide input.
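The conversion step described in the abstract (tagged ASCII text in, structured record out, noise discarded) can be illustrated with a minimal sketch. The abstract does not give the tag syntax, the record layout, or the software used, so everything below is an assumption made for illustration: the tags `@T`, `@A` and `@P`, the `ContentsEntry` fields, and the `convert` function are hypothetical, and the sample text is invented.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical tag set (not taken from the article):
#   @T = article title, @A = author, @P = page range.
TAG_PATTERN = re.compile(r"^@(?P<tag>[ATP])\s+(?P<value>.+?)\s*$")


@dataclass
class ContentsEntry:
    """One table-of-contents entry; the field names are illustrative only."""
    title: str = ""
    authors: List[str] = field(default_factory=list)
    pages: str = ""


def convert(tagged_text: str) -> List[ContentsEntry]:
    """Turn corrected, tagged OCR text into a list of structured entries.

    Any line that does not carry one of the assumed tags is treated as
    noise (running headers, page furniture, OCR debris) and dropped.
    """
    entries: List[ContentsEntry] = []
    current: Optional[ContentsEntry] = None
    for line in tagged_text.splitlines():
        match = TAG_PATTERN.match(line.strip())
        if match is None:
            continue  # untagged line: noise
        tag, value = match["tag"], match["value"]
        if tag == "T":
            # In this sketch a title tag opens a new entry.
            current = ContentsEntry(title=value)
            entries.append(current)
        elif current is None:
            continue  # author/page tag before any title: ignored
        elif tag == "A":
            current.authors.append(value)
        elif tag == "P":
            current.pages = value
    return entries


# Invented sample input, for illustration only.
sample = """\
EXAMPLE JOURNAL   Vol. 1 No. 1
@T An example article title
@A Author, A.
@A Author, B.
@P 1-10
~~~ ocr debris ~~~
"""

for entry in convert(sample):
    print(entry)
```

A line-oriented tag format is used here only because it keeps the noise filter trivial: anything without a recognised tag is skipped. The project's actual format may have differed.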
Citation
Ongering, M. and Wesseling, M. (1992), "Producing a bibliographic database through scanning and OCR: the Online Contents Project in the Royal Library of the Netherlands", Program: electronic library and information systems, Vol. 26 No. 4, pp. 393-399. https://doi.org/10.1108/eb047127
Publisher
MCB UP Ltd
Copyright © 1992, MCB UP Limited