Search results
1 – 2 of 2
Sigal Arie Erez, Tobias Blanke, Mike Bryant, Kepa Rodriguez, Reto Speck and Veerle Vanden Daelen
Abstract
Purpose
This paper aims to describe the European Holocaust Research Infrastructure (EHRI) project's ongoing efforts to virtually integrate trans-national archival sources via the reconstruction of collection provenance as it relates to copy collections (material copied from one archive to another) and the co-referencing of subject and authority terms across material held by distinct institutions.
Design/methodology/approach
This paper is a case study of approximately 6,000 words in length. The authors describe the scope of the problem of archival fragmentation from both cultural and technical perspectives, with particular focus on Holocaust-related material, and describe, with graph-based visualisations, two ways in which EHRI seeks to better integrate information about fragmented material.
Findings
As a case study, the principal contributions of this paper include reports on our experience with extracting provenance-based connections between archival descriptions from encoded finding aids and the challenges of co-referencing access points in the absence of domain-specific controlled vocabularies.
Originality/value
Record linking in general is an important technique in computational approaches to humanities research and one that has rightly received significant attention from scholars. In the context of historical archives, however, the material itself is in most cases not digitised, meaning that computational attempts at linking must rely on finding aids, which constitute far less rich data sources. The EHRI project's work in this area is therefore quite pioneering and has implications for archival integration on a larger scale, where the disruptive potential of Linked Open Data is most obvious.
Tobias Blanke, Michael Bryant and Reto Speck
Abstract
Purpose
In 2010 the European Holocaust Research Infrastructure (EHRI) was funded to support research into the Holocaust. The project follows on from significant efforts in the past to develop and record the collections of the Holocaust in several national initiatives. The purpose of this paper is to introduce the efforts by EHRI to create a flexible research environment using graph databases. The authors concentrate on the added features and design decisions to enable efficient processing of collection information as a graph.
Design/methodology/approach
The paper concentrates on the specific customisations EHRI had to develop, as the graph database approach is new, and the authors could not rely on existing solutions. The authors describe the serialisations of collections in the graph to provide for efficient processing. Because the EHRI infrastructure is highly distributed, the authors also had to invest considerable effort in reliable distributed access control mechanisms. Finally, the authors analyse the user-facing work on a portal and a virtual research environment (VRE) for discovering, sharing and analysing Holocaust material.
Findings
Using the novel graph database approach, the authors first present how collection information can be modelled as graphs and why this is effective. Second, they show how collection information is made persistent and describe the complex access management system they have developed. Third, they outline how user interaction with the data is integrated through a VRE.
Originality/value
Scholars require specialised access to information. The authors present the results of their work to develop integrated research with collections for Holocaust researchers, together with proposals for a socio-technical ecosystem based on graph database technologies. The use of graph databases in this domain is new, and the authors needed to develop several innovative customisations to make them work.