Beyond TIFF and JPEG2000: PDF/A as an OAIS submission information package container
Article publication date: 21 September 2015
The purpose of this paper is to propose PDF/A as a replacement for TIFF as the preferred file format for digitizing textual documents. The paper also shows that PDF/A can serve as an open archival information system (OAIS) submission information package (SIP) container, reducing digitization and digital preservation costs.
The author first reviews the current digitization guidelines and the OAIS model, and provides an overview of the development of PDF and PDF/A as international standards. A literature review of the uses of PDF/A is then presented. The author analyzes the pitfalls of TIFF as the preferred format for digitization and shows how to use PDF/A to encode a digitization SIP.
The TIFF file format has been the preferred master file format in the Federal Agency Digitization Guidelines Initiative digitization guidelines for the past 20 years. However, the TIFF format has drawbacks. The literature review shows that PDF/A has been the preferred standard for encoding born-digital documents in the court, government and business sectors. PDF/A-2 and PDF/A-3 are relatively new standards, released after 2010, and few practitioners understand them or have used their full potential in digitization. The author shows that PDF/A can be used as an OAIS SIP container.
To deliver OAIS SIPs, current practices require a combination of files, directories and various types of metadata. The author shows that PDF/A (PDF/A-2 and/or PDF/A-3) can be a better file format for textual document digitization, because various types of metadata can be encoded in Extensible Metadata Platform (XMP) and arbitrary files or data can be embedded in PDF/A-3. These features of PDF/A provide a much more cost-efficient way to deliver SIPs.
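As a rough illustration of the metadata side of this idea, the following minimal sketch builds an XMP metadata document carrying a Dublin Core title and the PDF/A identification properties. It is not taken from the paper: the function name `build_xmp`, its parameters and the sample title are assumptions made for this example. The surrounding `xpacket` wrapper and the step of writing the metadata (or any attached files) into an actual PDF/A file are omitted and would need a PDF library.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP, RDF, Dublin Core and the PDF/A identification schema.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pdfaid": "http://www.aiim.org/pdfa/ns/id/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(prefix, local):
    """Return the ElementTree qualified name for prefix:local."""
    return "{%s}%s" % (NS[prefix], local)


def build_xmp(title, part="3", conformance="B"):
    """Hypothetical helper: serialize a small XMP document as a string.

    part/conformance describe the claimed PDF/A flavour (here PDF/A-3b);
    setting them does not by itself make a file conformant.
    """
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)

    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # PDF/A identification properties.
    ET.SubElement(desc, q("pdfaid", "part")).text = part
    ET.SubElement(desc, q("pdfaid", "conformance")).text = conformance

    # dc:title is a language alternative: rdf:Alt with an x-default item.
    title_el = ET.SubElement(desc, q("dc", "title"))
    alt = ET.SubElement(title_el, q("rdf", "Alt"))
    li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
    li.text = title

    return ET.tostring(xmpmeta, encoding="unicode")


if __name__ == "__main__":
    print(build_xmp("Sample scanned letter"))
```

Descriptive, technical or preservation metadata from other schemas could be added to the same `rdf:Description` in a similar way, which is what would let a single PDF/A file carry information that current SIP practice spreads across separate files and directories.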
PDF/A has been recognized as the preferred standard for born-digital documents, but it has not been used as the preferred file format for digitized materials. The author recommends PDF/A with lossless JPX compression as the preferred file format, and PDF/A with lossless JPX compression together with embedded metadata and data as the preferred OAIS SIP container. Adopting these uses would reduce digitization and digital preservation costs and increase productivity. The author also recommends updating national and international digitization practices to use PDF/A.
The author would like to thank Leonard Rosenthol, Project Leader for ISO PDF/A and Adobe PDF Architect, for his comments on the TIFF, PDF and PDF/A file formats.
Han, Y. (2015), "Beyond TIFF and JPEG2000: PDF/A as an OAIS submission information package container", Library Hi Tech, Vol. 33 No. 3, pp. 409-423. https://doi.org/10.1108/LHT-06-2015-0068
Emerald Group Publishing Limited
Copyright © 2015, Emerald Group Publishing Limited