Search results

1 – 10 of over 20000
Article
Publication date: 21 September 2015

Yan Han

Abstract

Purpose

The purpose of this paper is to introduce PDF/A to replace TIFF as the preferred file format for digitization of textual documents. In addition, PDF/A can be used as an open archival information system (OAIS) submission information package (SIP) container to reduce digitization and digital preservation costs.

Design/methodology/approach

The author first reviews the current digitization guidelines and the OAIS model, and provides an overview of the development of PDF and PDF/A as international standards. A literature review of the uses of PDF/A is then presented. The author analyzes the pitfalls of TIFF as the preferred format for digitization and shows how to use PDF/A to encode a digitization SIP.

Findings

The TIFF file format has been the preferred master file format in the Federal Agency Digitization Guidelines Initiative digitization guidelines for the past 20 years. However, the TIFF format has drawbacks. Literature reviews show that PDF/A has been the preferred standard for encoding born-digital documents in the court, government and business sectors. PDF/A-2 and PDF/A-3 are relatively new standards released after 2010, and few practitioners understand them or have used their full potential in digitization. The author shows that PDF/A can be used as an OAIS SIP container.

Practical implications

To deliver OAIS SIPs, current practices require a combination of files, directories and various types of metadata. The author shows that PDF/A (PDF/A-2 and/or PDF/A-3) can be a better file format for textual document digitization: various types of metadata can be encoded in the Extensible Metadata Platform (XMP), and arbitrary files or data can be embedded in PDF/A-3. These features of PDF/A provide much better ways to deliver SIPs in a cost-efficient manner.

Originality/value

PDF/A has been recognized as the preferred standard for born-digital documents, but it has not been used as the preferred file format for digitized materials. The author recommends PDF/A with lossless JPX compression as the preferred file format, and PDF/A with lossless JPX compression along with metadata/data as the preferred OAIS SIP container. These uses reduce costs in digitization and digital preservation and also increase productivity. The author recommends updating national and international digitization practices to use PDF/A.

Details

Library Hi Tech, vol. 33 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 8 May 2017

Carl Wilson, Rebecca McGuinness and Joachim Jung

Abstract

Purpose

This paper describes the development of the veraPDF validator. The objective of veraPDF is to build an industry supported, open source validator for all parts and conformance levels of the PDF/A specification for archival PDF documents. The project is led by the Open Preservation Foundation and the PDF Association and is funded by the EU PREFORMA project.

Design/methodology/approach

veraPDF is designed to meet the needs of the digital preservation community and the PDF industry alike. The technology is subject to the review of and acceptance by the PDF Association’s PDF Validation Technical Working Group, including many participants of the relevant ISO working groups. Cultural heritage institutions are collecting ever-increasing volumes of digital information, which they have a mandate to preserve for the long term. However, in many cases, they need to ensure their content has been produced to the specifications of a standard file format, as well as any acceptance criteria stated in their institutional policy.

Findings

With increasing knowledge and experience of processes and policies, cultural heritage institutions are influencing the production and development of digital preservation software. The product development funded by the PREFORMA project shows how such cooperation can benefit the community as a whole.

Originality/value

This paper describes the value of an open source approach to developing a PDF/A validator for cultural heritage organisations.

Details

Digital Library Perspectives, vol. 33 no. 2
Type: Research Article
ISSN: 2059-5816

Article
Publication date: 15 June 2015

Miquel Termens, Mireia Ribera and Anita Locher

Abstract

Purpose

The purpose of this paper is to analyze the file formats of the digital objects stored in two of the largest open-access repositories in Spain, DDUB and TDX, and to determine the implications of these formats for long-term preservation, focussing in particular on the different versions of PDF.

Design/methodology/approach

To be able to study the two repositories, the authors harvested all the files corresponding to every digital object and some of their associated metadata using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and Open Archives Initiative Object Reuse and Exchange (OAI-ORE) protocols. The file formats were analyzed with DROID software and some additional tools.
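The harvesting step can be illustrated with a minimal sketch of an OAI-PMH `ListRecords` request and response parse. This is not the authors' code: the base URL, function names and sample response are hypothetical, and the OAI-ORE retrieval of the files themselves and the DROID format identification are not covered.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (base_url is a hypothetical endpoint).

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent alongside it.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty resumptionToken element marks the final page of results.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hand-written sample response, for illustration only.
SAMPLE_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
    "<record><header><identifier>oai:example:1</identifier></header></record>"
    "<resumptionToken>abc</resumptionToken>"
    "</ListRecords></OAI-PMH>"
)

identifiers, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
```

A full harvest would fetch each URL in turn and repeat until the parser returns no token.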

Findings

The results show that there is no alignment between the preservation policies declared by institutions, the technical tools available, and the actual stored files.

Originality/value

The results show that the file controls currently applied to institutional repositories do not suffice to fulfil their stated mission of long-term preservation of scientific literature.

Details

Library Hi Tech, vol. 33 no. 2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 20 November 2009

Michael Seadle

Abstract

Purpose

The purpose of this paper is to consider whether PDF formats are appropriate for long‐term digital archiving.

Design/methodology/approach

The approach takes the form of examining how well PDF's capabilities fit eReader devices that future scholars may use in addition to or instead of paper print‐outs.

Findings

Fixity is the advantage that PDF offers for archiving, while its alternatives generally offer greater flexibility for eReader devices. The question for long‐term digital archiving is whether fixity or flexibility best suits the interests of future readers.

Originality/value

PDF is widely accepted as a digital archiving format and PDF documents are found in virtually every repository. There has, however, been little discussion as to whether the fixed format is not in fact a long‐term disadvantage.

Details

Library Hi Tech, vol. 27 no. 4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 6 June 2018

Roland Erwin Suri and Mohamed El-Saad

Abstract

Purpose

Changes in file format specifications challenge long-term preservation of digital documents. Digital archives thus often focus on specific file formats that are well suited for long-term preservation, such as the PDF/A format. Since only a few customers submit PDF/A files, digital archives may consider converting submitted files to the PDF/A format. The paper aims to discuss these issues.

Design/methodology/approach

The authors evaluated three software tools for batch conversion of common file formats to PDF/A-1b: LuraTech PDF Compressor, Adobe Acrobat XI Pro and 3-Heights™ Document Converter by PDF Tools. The test set consisted of 80 files, with 10 files each of the eight file types JPEG, MS PowerPoint, PDF, PNG, MS Word, MS Excel, MSG and “web page.”

Findings

Batch processing was sometimes hindered by stops that required manual intervention. Depending on the software tool, three to four of these stops occurred during batch processing of the 80 test files. Furthermore, the conversion tools sometimes failed to produce output files even for supported file formats: three (Adobe Pro) up to seven (LuraTech and 3-Heights™) PDF/A-1b files were not produced. Since Adobe Pro does not convert e-mails, a total of 213 PDF/A-1b files were produced. The faithfulness of each conversion was investigated by comparing the visual appearance of the input document with that of the produced PDF/A-1b document on a computer screen. Meticulous visual inspection revealed that the conversion to PDF/A-1b impaired the information content in 24 of the 213 converted files (11 percent). These reproducibility errors included loss of links, loss of other document content (unreadable characters, missing text, missing document parts), updated fields (reflecting the time and folder of conversion), vector graphics issues and spelling errors.
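The totals reported above can be checked with a short calculation. The sketch below assumes that LuraTech and 3-Heights each failed on seven files, which is the reading of "three up to seven" that reproduces the stated total; the variable names are illustrative.

```python
# Counts taken from the abstract; names are illustrative.
TOOLS = 3
FILE_TYPES = 8
FILES_PER_TYPE = 10

test_files = FILE_TYPES * FILES_PER_TYPE   # 80 files in the test set
attempted = TOOLS * test_files             # 240 conversions if every tool handled every file

# Adobe Pro does not convert e-mails, so its 10 MSG files are never attempted.
unsupported = FILES_PER_TYPE

# Assumed per-tool failures on supported formats (3, 7 and 7).
failed = {"Adobe Pro": 3, "LuraTech": 7, "3-Heights": 7}

produced = attempted - unsupported - sum(failed.values())
impaired = 24
impaired_share = impaired / produced

print(produced)                 # 213
print(f"{impaired_share:.1%}")  # 11.3%
```

The result matches the 213 produced files and the rounded 11 percent given in the abstract.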

Originality/value

These results indicate that large-scale batch conversions of heterogeneous files to PDF/A-1b cause complex issues that need to be addressed for each individual file. Even with considerable efforts, some information loss seems unavoidable if large numbers of files from heterogeneous sources are migrated to the PDF/A-1b format.

Details

Library Hi Tech, vol. 39 no. 2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 9 September 2014

Quan Lu, Gao Liu and Jing Chen

Abstract

Purpose

The purpose of this paper is to propose a novel approach to integrating a portable document format (PDF) interface into Java-based digital library applications. It bridges the gap between performing content operations on a PDF document and viewing it asynchronously.

Design/methodology/approach

In this paper, the authors first review related research and discuss PDF and its drawbacks. Next, the authors propose the design steps and implementation of three modes of displaying a PDF document: PDF display, image display and extensible markup language (XML) display. The three modes are then compared.

Findings

The authors find that the PDF display is able to present the original PDF document contents completely and is thus clearly superior to the other two displays. In addition, the format specification of PDF-based e-books does not perform well: publication exposes a lack of standardization and a complex structure.

Practical implications

The proposed approach makes viewing PDF documents more convenient and effective. It can be used to retrieve and visualize PDF documents and to support personalized customization of PDF functions in digital library applications.

Originality/value

This paper proposes a novel approach to solving the problem of performing content operations on a PDF and viewing it synchronously, providing users with a new tool to retrieve and reuse PDF documents. It contributes to improving the service specification and policy for viewing PDFs in digital libraries. In addition, the personalized interface and public index make further development and application more feasible.

Details

Library Hi Tech, vol. 32 no. 3
Type: Research Article
ISSN: 0737-8831

Book part
Publication date: 30 September 2019

Audrey N. Scarlata, Kelly L. Williams and Brandon Vagner

Abstract

The increasing availability of eXtensible Business Reporting Language (XBRL) financial statements motivates additional investigation of whether XBRL’s search-facilitating technology (SFT) and enhanced viewing capabilities facilitate information search and improve financial analysis decision quality and efficiency. This experiment investigates how using XBRL technology to view financial statements influences novice investors’ decision quality by affecting decision processes such as search strategy and effort, as well as decision efficiency (accuracy/effort) in a financial statement analysis task. In the experiment, randomly assigned student participants (n = 102) invested in companies using either static PDF-formatted or XBRL-enabled financial statements. No differences in decision quality (i.e., accuracy) due to technology use were observed. However, participants in the XBRL condition examined less information, used more directed search processes, and evidenced greater efficiency than did participants assigned to the PDF condition. Hence, the results suggest that XBRL SFT affects the use of differing decision processes relative to PDF technology.

Details

Advances in Accounting Behavioral Research
Type: Book
ISBN: 978-1-83867-346-8

Article
Publication date: 2 May 2017

Jacqueline L. Birt, Kala Muthusamy and Poonam Bir

Abstract

Purpose

eXtensible Business Reporting Language (XBRL) is an internet-based interactive form of reporting language that is expected to enhance the usefulness of financial reporting (Yuan and Wang, 2009). In the UK and the USA, XBRL is mandatory, and in Australia, it is voluntarily adopted. It has been reported that in the not too distant future, XBRL will be the standard format for the preparation and exchange of business reports (Gettler, 2015). Using an experimental approach, this study assesses the usefulness of financial reports with XBRL tagged information compared to PDF format information for non-professional investors. The authors investigate participants’ perceptions of usefulness in relation to the qualitative characteristics of relevance, understandability and comparability.

Design/methodology/approach

This paper uses an experimental approach featuring a profit-forecasting task to determine if participants perceive XBRL-tagged information to be more useful compared to PDF-formatted information.

Findings

Results reveal that financial information presented with XBRL tagging is significantly more relevant, understandable and comparable to non-professional investors.

Originality/value

The authors address a gap in the literature by examining XBRL usefulness in Australia, where XBRL adoption will be mandated within the not too distant future. Currently, the voluntary adoption of XBRL by preparers and users is low, possibly because of a lack of awareness about XBRL and its potential benefits. This study has significant implications for accounting regulators in creating more awareness of the benefits of using XBRL and in creating an impetus for XBRL adoption.

Details

Accounting Research Journal, vol. 30 no. 01
Type: Research Article
ISSN: 1030-9616

Article
Publication date: 31 October 2018

Julius T. Nganji

Abstract

Purpose

This paper aims to suggest how the information journey of students with disabilities could be facilitated, by first revealing the existence of inaccessible formats such as Portable Document Format (PDF) and then suggesting the inclusion of alternative formats of accessible learning materials, thus improving retrieval.

Design/methodology/approach

A sample of 400 articles published over 10 years (2009-2018) in four journals is selected and analysed for accessibility against the Web Content Accessibility Guidelines (WCAG) 2.0 by using automated accessibility checkers, a screen reader and manual human expertise. The results are presented and recommendations are made on improving accessibility.

Findings

The findings suggest that the PDF versions of the selected journal articles are not accessible for screen reader users but could be improved by adopting accessible and inclusive practices. Including alternative formats of the learning materials could help support the student information journey.

Research limitations/implications

The results of the study might not be very representative of all the articles in the journals given the small sample size. Additionally, the criteria used in the study do not consider all existing disabilities. Thus, although the PDFs may be inaccessible for some people with disabilities, they may be accessible to others.

Practical implications

Given that PDFs seem to be the preferred format of journal articles online, there is potential for a difficult information journey for some students due to the limitations posed by inaccessibility of the PDFs. Thus, it is recommended to include alternative formats which could be more accessible, giving the student the choice of accessing the learning materials in their preferred format.

Social implications

If students are unable to access the learning materials that are required for their course, this could lead to poor grades, which might negatively affect the students’ morale. In some cases, students might drop out.

Originality/value

This study analyses the accessibility of learning materials provided by a third party (journal publishers) and how they affect the student, something that is not usually given much importance when research in accessibility is carried out.

Details

Information and Learning Science, vol. 119 no. 12
Type: Research Article
ISSN: 2398-5348

Article
Publication date: 1 April 1996

Judith Wustman

Abstract

On the heels of the rapid growth of the World Wide Web have come advances in multimedia document formats and the hardware and software to support them. As a result of this combination of factors, the electronic journal is, at last, economically and aesthetically viable.

Details

Program, vol. 30 no. 4
Type: Research Article
ISSN: 0033-0337
