Search results
1 – 10 of 777
Abstract
Purpose
This paper aims to suggest how the information journey of students with disabilities could be facilitated, by first revealing the existence of inaccessible formats such as Portable Document Format (PDF) and then suggesting the inclusion of alternative formats of accessible learning materials, thus improving retrieval.
Design/methodology/approach
A sample of 400 articles published over 10 years (2009-2018) in four journals is selected and analysed for accessibility against the Web Content Accessibility Guidelines (WCAG) 2.0 using automated accessibility checkers, a screen reader and manual human expertise. The results are presented and recommendations are made on improving accessibility.
Findings
The findings suggest that the PDF versions of the selected journal articles are not accessible for screen reader users but could be improved by adopting accessible and inclusive practices. Including alternative formats of the learning materials could help support the student information journey.
Research limitations/implications
The results of the study might not be very representative of all the articles in the journals given the small sample size. Additionally, the criteria used in the study do not consider all existing disabilities. Thus, although the PDFs may be inaccessible for some people with disabilities, they may be accessible to others.
Practical implications
Given that PDFs seem to be the preferred format of journal articles online, there is potential for a difficult information journey for some students due to the limitations posed by inaccessibility of the PDFs. Thus, it is recommended to include alternative formats which could be more accessible, giving the student the choice of accessing the learning materials in their preferred format.
Social implications
If students are unable to access the learning materials that are required for their course, this could lead to poor grades, which might negatively affect the students’ morale. In some cases, students might drop out.
Originality/value
This study analyses the accessibility of learning materials provided by a third party (journal publishers) and how they affect the student, something that is not usually given much importance when research in accessibility is carried out.
Abstract
Purpose
The purpose of this paper is to propose a novel approach to integrating a portable document format (PDF) interface into a Java-based digital library application. It bridges the gap between conducting content operations and viewing a PDF document asynchronously.
Design/methodology/approach
In this paper, the authors first review related research and discuss PDF and its drawbacks. Next, the authors propose the design steps and implementation of three modes of displaying PDF documents: PDF display, image display and extensible markup language (XML) display. The three modes are then compared.
Findings
The authors find that the PDF display is able to completely present the original PDF document contents and is thus clearly superior to the other two displays. In addition, the format specification of PDF-based e-books does not perform well: the publications reveal a lack of standardization and a complex structure.
Practical implications
The proposed approach makes viewing the PDF documents more convenient and effective, and can be used to retrieve and visualize the PDF documents and to support the personalized function customization of PDF in the digital library applications.
Originality/value
This paper proposes a novel approach to solving the problem between content operation and the view of PDF synchronously, providing users with a new tool to retrieve and reuse PDF documents. It contributes to improving the service specification and policy for viewing PDFs in digital libraries. In addition, the personalized interface and public index make further development and application more feasible.
Abstract
Purpose
The purpose of this paper is to consider whether PDF formats are appropriate for long‐term digital archiving.
Design/methodology/approach
The approach takes the form of examining how well PDF's capabilities fit eReader devices that future scholars may use in addition to or instead of paper print‐outs.
Findings
Fixity is the advantage that PDF offers for archiving, while its alternatives generally offer greater flexibility for eReader devices. The question for long‐term digital archiving is whether fixity or flexibility best suits the interests of future readers.
Originality/value
PDF is widely accepted as a digital archiving format and PDF documents are found in virtually every repository. There has, however, been little discussion as to whether the fixed format is not in fact a long‐term disadvantage.
Abstract
Purpose
This article sets out to explain the purpose of PDF/A, how it addresses archival and records management concerns, how PDF/A was designed to have “desirable properties of a long‐term preservation format”, and the future of PDF/A.
Design/methodology/approach
The contents of this article are based on the author's knowledge and experience of the subject.
Findings
It is emphasized that PDF/A must be implemented in conjunction with policies and procedures, including quality assurance procedures to ensure acceptable replication of source material.
Originality/value
This article will be of interest to anyone working with PDF files. Work has already begun on PDF/A Part 2 which will be based on PDF 1.6. Application notes and a listing of frequently asked questions will be made publicly available to assist developers of PDF/A applications to better understand the requirements of the file format and provide implementation guidance.
Acrobat, Envoy and Common Ground all launched commercially within a few months of each other in 1994, as did a format called Farallon Replica that will no longer be marketed from…
Abstract
Purpose
The use of “open data” can help the public find value in various areas of interest. Many governments have created and published a huge amount of open data; however, people have a hard time using open data because of data quality issues. The UK, the USA and Korea have created and published open data; however, the rate of open data implementation and the level of open data impact are very low because of data quality issues such as incompatible data formats and incomplete data. This study aims to compare the status of data quality on open government sites in the UK, the USA and Korea and to present guidelines for publishing data formats and enhancing data completeness.
Design/methodology/approach
This study uses statistical analysis of different data formats and examination of data completeness to explore key issues of data quality in open government data.
Findings
Findings show that the USA and the UK have published more than 50 per cent of their open data at level one, whereas Korea has published 52.8 per cent of its data at level three. Level one data are not machine-readable; therefore, users have a hard time using them. Level one data are found in portable document format (PDF) and hypertext markup language (HTML) and are locked up in documents; therefore, machines cannot extract the data. Findings also show that incomplete data exist in all three governments’ open data.
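A tally of data formats by level, of the kind these findings report, could be sketched roughly as follows. This is only an illustration, not the study's method: the abstract names only PDF and HTML as level one formats, so the level two and level three rows of the mapping, the `FORMAT_LEVEL` and `level_shares` names, and the sample list are all assumptions made for the example.

```python
from collections import Counter

# Level one formats (PDF, HTML) are taken from the findings above.
# The level two and level three rows are ASSUMPTIONS for illustration only.
FORMAT_LEVEL = {
    "pdf": 1,
    "html": 1,
    "xls": 2,   # assumption: structured but proprietary
    "csv": 3,   # assumption: structured and non-proprietary
    "json": 3,  # assumption
    "xml": 3,   # assumption
}


def level_shares(formats):
    """Return {level: percentage of data sets}, rounded to one decimal place.

    Formats missing from FORMAT_LEVEL are counted under level 0 (unknown).
    """
    levels = [FORMAT_LEVEL.get(fmt.lower().lstrip("."), 0) for fmt in formats]
    if not levels:
        return {}
    counts = Counter(levels)
    return {
        level: round(100 * count / len(levels), 1)
        for level, count in sorted(counts.items())
    }


# Hypothetical file formats of eight published data sets.
sample = ["PDF", "html", "csv", "xls", "pdf", "json", "csv", "pdf"]
print(level_shares(sample))  # {1: 50.0, 2: 12.5, 3: 37.5}
```

In this made-up sample, half of the data sets fall into level one, which is the situation the findings describe as hard for machines to use.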
Originality/value
Governments should investigate data incompleteness of all open data and correct incomplete data of the most used data. Governments can find the most used data easily by monitoring data sets that have been downloaded most frequently over a certain period.
Wen-Feng Hsiao, Te-Min Chang and Erwin Thomas
Abstract
Purpose
The purpose of this paper is to propose an automatic metadata extraction and retrieval system to extract bibliographical information from digital academic documents in portable document formats (PDFs).
Design/methodology/approach
The authors use PDFBox to extract text and font size information, a rule-based method to identify titles, and a hidden Markov model (HMM) to extract the titles and authors. Finally, the extracted titles and authors (possibly incorrect or incomplete) are sent as query strings to digital libraries (e.g. ACM, IEEE, CiteSeerX, SDOS and Google Scholar) to retrieve the rest of the metadata.
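The abstract does not state the authors' rules, so the following is only a minimal sketch of one plausible font-size heuristic for title identification: take the first run of consecutive lines set in the largest font on the first page. The `TextLine` type, the `guess_title` function, the `tolerance` parameter and the sample lines are hypothetical; in practice the text and font sizes would come from a PDF text extractor such as PDFBox.

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    text: str         # text of one line on the first page
    font_size: float  # font size in points, as reported by the extractor


def guess_title(lines: list[TextLine], tolerance: float = 0.5) -> str:
    """Return the first run of consecutive lines set in the largest font.

    This is an assumed heuristic, not the rule set used in the paper.
    """
    candidates = [ln for ln in lines if ln.text.strip()]
    if not candidates:
        return ""
    largest = max(ln.font_size for ln in candidates)
    title_parts: list[str] = []
    for ln in candidates:
        if largest - ln.font_size <= tolerance:
            title_parts.append(ln.text.strip())
        elif title_parts:
            break  # the run of largest-font lines has ended
    return " ".join(title_parts)


# Hypothetical first-page lines with their font sizes.
first_page = [
    TextLine("Journal of Hypothetical Studies", 9.0),
    TextLine("Automatic Metadata Extraction", 18.0),
    TextLine("from PDF Research Papers", 18.0),
    TextLine("A. Author and B. Author", 11.0),
]
print(guess_title(first_page))
# Automatic Metadata Extraction from PDF Research Papers
```

A heuristic like this can fail when a journal banner or a drop cap uses the largest font, which is one reason a statistical model such as an HMM is a sensible second stage.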
Findings
Four experiments are conducted to examine the feasibility of the proposed system. The first experiment compares two HMM models: a multi-state model and a one-state model (the proposed model). The results show that the one-state model performs comparably to the multi-state model but is better suited to dealing with real-world unknown states. The second experiment shows that the proposed model (without the aid of online queries) performs as well as other researchers’ models on the Cora paper header dataset. The third experiment examines the performance of the system on a small dataset of 43 real PDF research papers. The results show that the proposed system (with online queries) performs well on bibliographical data extraction and even outperforms the free citation management tool Zotero 3.0. Finally, the fourth experiment uses a larger dataset of 103 papers to compare the system with Zotero 4.0. The results show that the system significantly outperforms Zotero 4.0. The feasibility of the proposed model is thus justified.
Research limitations/implications
In terms of academic implications, the system is unique in two respects. First, the system uses only the Cora header set for HMM training, without other tagged datasets or gazetteer resources, which means the system is light and scalable. Second, the system is workable and can be applied to extracting metadata from real-world PDF files. The extracted bibliographical data can then be imported into citation software such as EndNote or RefWorks to increase researchers’ productivity.
Practical implications
In terms of practical implications, the system can outperform the existing tool, Zotero 4.0. This gives practitioners a good opportunity to develop similar products for real applications, although doing so might require some knowledge of HMM implementation.
Originality/value
The HMM implementation is not novel. What is innovative is that it combines two HMM models. The main model is adapted from Freitag and McCallum (1999), and the authors add word features of the Nymble HMM (Bikel et al., 1997) to it. The system is workable even without manually tagging the datasets before training the model (the authors use only the Cora dataset for training and test on real-world PDF papers), which differs significantly from what other works have done so far. The experimental results provide sufficient evidence of the feasibility of the proposed method in this respect.
Qian Pu, Xiaomin Zhu, Donghua Chen and Runtong Zhang
Abstract
Purpose
This paper aims to provide an optimization method of workflow for publishing houses and electronic book (e-book) studies in the field of digital publishing.
Design/methodology/approach
Based on studies of publishing houses in Beijing, the present conversion workflow is illustrated using a functional modeling methodology. Then, the workflow is analyzed using the 5W1H (why, who, what, where, when, how) methodology and optimized using the ECRSI (eliminate, combine, rearrange, simplify and increase) principles. To validate the optimization effect, the workflows before and after optimization are generated and implemented in the ExtendSim® simulation software.
Findings
The simulation results show that, under similar circumstances, both the quantity and the quality of the products are improved after optimization, which indicates that the optimization method is effective.
Practical implications
Electronic PUBlication (EPUB) has significant requirements to satisfy the needs of the mobile reading market and to earn increased profits, whereas some e-books are still preserved in portable document format (PDF). This study results in enhanced EPUB quality and production efficiency for the PDF-to-EPUB format conversion workflow in publishing houses. Publishing houses around the world can refer to this study to make a similar optimization when handling PDF-to-EPUB conversion.
Originality/value
This research introduces traditional industrial engineering analytical techniques to the workflow optimization of e-book conversion. Compared with most other methods used to optimize workflows, this method is simpler, more efficient and more suitable for e-book format conversion.
Abstract
This paper mainly discusses the author's prototype implementation of a Java‐based electronic publishing system (JEPS) that facilitates the creation and delivery of electronic documents with Java technology. JEPS packages the document and viewer in a Java applet. The documents can be viewed on any computer platform with identical content and style. This paper describes the framework of JEPS and compares JEPS with other Web publishing technologies such as PDF and XML. It concludes by considering the potential opportunities and prospects that JEPS provides in the area of electronic publishing over the Internet.