Search results
Joseph Nockels, Paul Gooding and Melissa Terras
Abstract
Purpose
This paper focuses on image-to-text manuscript processing through Handwritten Text Recognition (HTR), a Machine Learning (ML) approach enabled by Artificial Intelligence (AI). With HTR now achieving high levels of accuracy, we consider its potential impact on our near-future information environment and knowledge of the past.
Design/methodology/approach
In undertaking a more constructivist analysis, we identified gaps in the current literature through a Grounded Theory Method (GTM). This guided an iterative process of concept mapping through writing sprints in workshop settings. We identified, explored and confirmed themes through group discussion and a further interrogation of relevant literature, until reaching saturation.
Findings
Catalogued as part of our GTM, 120 published texts underpin this paper. We found that HTR facilitates accurate transcription and dataset cleaning, while widening access to a variety of historical material. HTR contributes to a virtuous cycle of dataset production and can inform the development of online cataloguing. However, current limitations include dependency on digitisation pipelines, potential archival history omission and entrenchment of bias. We also cite near-future HTR considerations. These include encouraging open access; integrating advanced AI processes and metadata extraction; addressing legal and moral issues surrounding copyright and data ethics; crediting individuals’ transcription contributions; and accounting for HTR’s environmental costs.
Originality/value
Our research produces a set of best practice recommendations on HTR use for researchers, data providers and memory institutions. This forms an initial, though not comprehensive, blueprint for directing future HTR research. In pursuing this, we deconstruct the narrative that HTR’s speed and efficiency will simply transform scholarship in archives.
Foteini Valeonti, Melissa Terras and Andrew Hudson-Smith
Abstract
Purpose
In recent years, OpenGLAM and the broader open license movement have been gaining momentum in the cultural heritage sector. The purpose of this paper is to examine OpenGLAM from the perspective of end users, identifying barriers for commercial and non-commercial reuse of openly licensed art images.
Design/methodology/approach
Following a review of the literature, the authors scope out how end users can discover institutions participating in OpenGLAM, and use case studies to examine the process they must follow to find, obtain and reuse openly licensed images from three art museums.
Findings
Academic literature has so far focussed on examining the risks and benefits of participation from an institutional perspective, with little done to assess OpenGLAM from the end users’ standpoint. The authors reveal that end users have to overcome a series of barriers to find, obtain and reuse open images. The three main barriers relate to image quality, image tracking and the difficulty of distinguishing open images from those that are bound by copyright.
Research limitations/implications
This study focusses solely on art museums and galleries. Libraries, archives and other types of OpenGLAM museums (e.g. archaeological) fall beyond the scope of this paper.
Practical implications
The authors identify practical barriers to commercial and non-commercial reuse of open images, outlining areas of improvement for participant institutions.
Originality/value
The authors contribute to the understudied field of research examining OpenGLAM from the end users’ perspective, outlining recommendations for end users, as well as for museums and galleries.
Guenter Muehlberger, Louise Seaward, Melissa Terras, Sofia Ares Oliveira, Vicente Bosch, Maximilian Bryan, Sebastian Colutto, Hervé Déjean, Markus Diem, Stefan Fiel, Basilis Gatos, Albert Greinoecker, Tobias Grüning, Guenter Hackl, Vili Haukkovaara, Gerhard Heyer, Lauri Hirvonen, Tobias Hodel, Matti Jokinen, Philip Kahle, Mario Kallio, Frederic Kaplan, Florian Kleber, Roger Labahn, Eva Maria Lang, Sören Laube, Gundram Leifert, Georgios Louloudis, Rory McNicholl, Jean-Luc Meunier, Johannes Michael, Elena Mühlbauer, Nathanael Philipp, Ioannis Pratikakis, Joan Puigcerver Pérez, Hannelore Putz, George Retsinas, Verónica Romero, Robert Sablatnig, Joan Andreu Sánchez, Philip Schofield, Giorgos Sfikas, Christian Sieber, Nikolaos Stamatopoulos, Tobias Strauß, Tamara Terbul, Alejandro Héctor Toselli, Berthold Ulreich, Mauricio Villegas, Enrique Vidal, Johanna Walcher, Max Weidemann, Herbert Wurster and Konstantinos Zagoris
Abstract
Purpose
This paper provides an overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020 funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases, highlights the effect HTR may have on scholarship, and evidences this turning point in the advanced use of digitised heritage content. The paper aims to discuss these issues.
Design/methodology/approach
This paper adopts a case study approach, using the development and delivery of the one openly available HTR platform for manuscript material.
Findings
Transkribus has demonstrated that HTR is now a useable technology that can be employed in conjunction with mass digitisation to generate accurate transcripts of archival material. Use cases are demonstrated, and a cooperative model is suggested as a way to ensure sustainability and scaling of the platform. However, funding and resourcing issues are identified.
Research limitations/implications
The paper presents results from projects: further user studies could be undertaken involving interviews, surveys, etc.
Practical implications
Only HTR provided via Transkribus is covered; however, this is the only publicly available platform for HTR on individual collections of historical documents at the time of writing, and it represents the current state of the art in this field.
Social implications
The increased access to information contained within historical texts has the potential to be transformational for both institutions and individuals.
Originality/value
This is the first published overview of how HTR is used by a wide archival studies community, reporting and showcasing current application of handwriting technology in the cultural heritage sector.
Paul Gooding, Melissa Terras and Linda Berube
Abstract
Purpose
To date, there has been little research into users of the Legal Deposit Libraries (Non-Print Works) Regulations 2013. This paper addresses that gap by presenting key findings from the AHRC-funded Digital Library Futures project. Its purpose is to present a “user-centric” perspective on the potential future impact of the digital collections that are being created under electronic legal deposit regulations.
Design/methodology/approach
The study utilises a mixed methods case study of two academic legal deposit libraries in the United Kingdom: The Bodleian Libraries, University of Oxford; and Cambridge University Library. It combines surveys of users, web log analysis and expert interviews with librarians and cognate professionals.
Findings
User perspectives on NPLD were not fully considered in the planning and implementation of the 2013 regulations. The authors present findings from their user survey to show how contemporary tensions between user behaviour and access protocols risk limiting the instrumental value of NPLD collections, which have high perceived legacy value.
Originality/value
This is the first study to address the user context for UK Non-Print Legal Deposit. Its value lies in presenting a research-led user assessment of NPLD and in proposing “user-centric” analysis as an addition to the existing “four pillars” of legal deposit research.
Abstract
Purpose
Since its launch in 2007, research has been carried out on the popular social networking website Tumblr. The purpose of this paper is to identify published Tumblr-based research, classify it to understand approaches and methods, and provide methodological recommendations for others.
Design/methodology/approach
Research regarding Tumblr was identified. Following a review of the literature, a classification scheme was adapted and applied, to understand research focus. Papers were quantitatively classified using open coded content analysis of method, subject, approach, and topic.
Findings
The majority of published work relating to Tumblr concentrates on conceptual issues, followed by aspects of the messages sent. This has evolved over time. Perceived benefits are the platform’s long-form text posts, ability to track tags, and the multimodal nature of the platform. Severe research limitations are caused by the lack of demographic, geo-spatial, and temporal metadata attached to individual posts, the limited Application Programming Interface (API), restricted access to data, and the large number of ephemeral posts on the site.
Research limitations/implications
This study focusses on Tumblr: the applicability of the approach to other media is not considered. The authors focus on published research and conference papers: there will be book content which was not found using the method. Tumblr as a platform has falling user numbers which may be of concern to researchers.
Practical implications
The authors identify practical barriers to research on the Tumblr platform including lack of metadata and access to big data, explaining why Tumblr is not as popular as Twitter in academic studies.
Social implications
This paper highlights the breadth of topics covered by social media researchers, which allows us to understand popular online platforms.
Originality/value
There has not yet been an overarching study to look at the methods and purpose of those who study Tumblr. The authors identify Tumblr-related research papers from the first, appearing in July 2011, until July 2015. The classification derived here provides a framework that can be used to analyse social media research, and in which to position Tumblr-related work, with recommendations on benefits and limitations of the platform for researchers.
Abstract
Purpose
The purpose of this paper is to conduct a retrospective bibliometric analysis of documents about digital humanities, an emerging but interdisciplinary movement. It examines the distribution of research outputs and languages, identifies the active journals and institutions, dissects the network of categories and cited references, and interprets the hot research topics.
Design/methodology/approach
The source data are derived from the Web of Science (WoS) core collection. To reveal the holistic landscape of this field, the popular visualization tools VOSviewer and CiteSpace are employed to process the bibliographic data, including author, category, reference, and keyword. Furthermore, the parameter design of the visualization tools follows the general procedures and methods for bibliometric analysis.
Findings
Digital humanities research is growing rapidly. English is still the leading academic language in this field. The most influential authors all come from or have scientific relationships with Europe and North America, with the UK and USA as the two leading countries. Digital humanities is the result of a dynamic dialogue between humanistic exploration and digital means. This research field is closely associated with history, literary and cultural heritage, and information and library science.
Research limitations/implications
This analysis relies on the metadata extracted from the WoS database; however, some valuable literature in the field of digital humanities may not be retrieved from the database owing to the inherent challenge of topic search. This study is also restricted by the scope of publications: the WoS database may underrepresent publications in this domain.
Originality/value
The output of this paper could be a valuable reference for researchers and practitioners interested in the knowledge domain of digital humanities. Moreover, the conclusions of this retrospective analysis can serve as a comparable foundation for future study.
Abstract
Purpose
This paper aims to provide an overview of the development of a computer system designed to aid historians in the reading of the stylus tablets from the Roman fort of Vindolanda. It outlines the different stages in developing the system and gives the preliminary results.
Design/methodology/approach
The paper provides a literature review regarding Vindolanda, stylus tablets, and the process of reading an ancient document. Knowledge elicitation techniques are used to model explicitly expert processes used to read an ancient document. A corpus of character forms and lexicostatistics is gathered. An advanced cognitive imaging system utilising artificial intelligence techniques is implemented to produce plausible interpretations of the document.
Findings
This paper describes the developmental stages undertaken to construct a system that can read in images of an ancient document and produce plausible interpretations of the document, to aid historians in the lengthy process of reading an ancient text. In carrying out the development, an explicit representation of how experts approach and reason about damaged and deteriorated texts was formulated, and a large corpus of letter forms and linguistic data was captured. Preliminary results from the resulting computer system are presented which demonstrate the usefulness of the technique, although more work is needed to develop this into a stand‐alone computer system.
Research limitations/implications
The study is focused on the Roman stylus tablets from Vindolanda, near Hadrian's Wall, although the technique could be extrapolated to cover other types of ancient documents from any period.
Practical implications
It is demonstrated that using techniques from artificial intelligence and cognitive psychology can result in further explicit understanding of humanities expert processes, which allows computational systems to be constructed. The resulting computational system is a tool for the humanities expert, which carries out a task in a similar manner, allowing for faster reasoning time and quicker hypothesis development.
Originality/value
The paper presents the first known system to intake an image of an ancient text and output a plausible interpretation of the text in a reasonable time frame, assisting the papyrologist in resolving ambiguities in the damaged and abraded text.
Abstract
Purpose
The purpose of this paper is to explore the current state of the Text Encoding Initiative (TEI) community and suggest directions in which that community should strive, based on recommendations from experts in the field.
Design/methodology/approach
The paper looks at the history, present state and future of TEI.
Findings
This column is simply exploratory, and examines issues regarding the TEI and the TEI consortium.
Practical implications
TEI is a very robust and expressive markup language used in the analysis of literature in the humanities fields. The community is encouraged to take proactive steps to ensure TEI remains a viable markup language for the next 20 years, at least.
Originality/value
This column examines the enormous contribution that TEI has made to the humanities fields and explores ways in which the usage of TEI, even by non‐experts, can be expanded in order to enrich scholarship.
Abstract
Purpose
The purpose of this paper is to situate the activity of digitisation to increase access to cultural and heritage content alongside the objectives of the Open Access Movement (OAM). It demonstrates that increasingly open licensing of digital cultural heritage content is creating opportunities for researchers in the arts and humanities for both access to and analysis of cultural heritage materials.
Design/methodology/approach
The paper is primarily a literature and scoping review of the current digitisation licensing climate, using and embedding examples from ongoing research projects and recent writings on Open Access (OA) and digitisation to highlight both opportunities and barriers to the creation and use of digital heritage content from galleries, libraries, archives and museums (GLAM).
Findings
The digital information environment in which digitised content is created and delivered has changed phenomenally, allowing the sharing and reuse of digital data and encouraging new advances in research across the sector, although issues of licensing persist. There remain further opportunities for understanding how to: study use and users of openly available cultural and heritage content; disseminate and encourage the uptake of open cultural data; persuade other institutions to contribute their data into the commons in an open and accessible manner; build aggregation and search facilities to link across information sources to allow resource discovery; and how best to use high-performance computing facilities to analyse and process the large amounts of data the author is now seeing being made available throughout the sector.
Research limitations/implications
It is hoped that by pulling together this discussion, the benefits of making material openly available have been made clear, encouraging others in the GLAM sector to consider making their collections openly available for reuse and repurposing.
Practical implications
This paper will encourage others in the GLAM sector to consider licensing their collections in an open and reusable fashion. By spelling out the range of opportunities for researchers in using open cultural and heritage materials it makes a contribution to the discussion in this area.
Social implications
Increasing the quantity of high-quality OA resources in the cultural heritage sector will lead to a richer research environment which will increase the understanding of history, culture and society.
Originality/value
This paper has pulled together, for the first time, an overview of the current state of affairs of digitisation in the cultural and heritage sector seen through the context of the OAM. It has highlighted opportunities for researchers in the arts, humanities and social and historical sciences in the embedding of open cultural data into both their research and teaching, whilst scoping the wave of cultural heritage content which is being created from institutional repositories which are now available for research and use. As such, it is a position paper that encourages the open data agenda within the cultural and heritage sector, showing the potential that exists for the study of culture and society when data are made open.