
Record linking in the EHRI portal

Sigal Arie Erez (Yad Vashem, Jerusalem, Israel)
Tobias Blanke (University of Amsterdam, Amsterdam, The Netherlands)
Mike Bryant (Department of Digital Humanities, King's College London, London, UK)
Kepa Rodriguez (Yad Vashem, Jerusalem, Israel)
Reto Speck (King's College London – Strand Campus, London, UK)
Veerle Vanden Daelen (Kazerne Dossin Memorial Museum en Documentatiecentrum over Holocaust en Mensenrechten, Mechelen, Belgium)

Records Management Journal

ISSN: 0956-5698

Article publication date: 5 May 2020

Issue publication date: 4 December 2020




This paper aims to describe the European Holocaust Research Infrastructure (EHRI) project's ongoing efforts to virtually integrate trans-national archival sources via the reconstruction of collection provenance as it relates to copy collections (material copied from one archive to another) and the co-referencing of subject and authority terms across material held by distinct institutions.


This paper is a case study of approximately 6,000 words in length. The authors describe the scope of the problem of archival fragmentation from both cultural and technical perspectives, with particular focus on Holocaust-related material, and describe, with graph-based visualisations, two ways in which EHRI seeks to better integrate information about fragmented material.


As a case study, the principal contributions of this paper include reports on our experience with extracting provenance-based connections between archival descriptions from encoded finding aids and the challenges of co-referencing access points in the absence of domain-specific controlled vocabularies.


Record linking in general is an important technique in computational approaches to humanities research and one that has rightly received significant attention from scholars. In the context of historical archives, however, the material itself is in most cases not digitised, meaning that computational attempts at linking must rely on finding aids, which constitute much less rich data sources. The EHRI project’s work in this area is therefore quite pioneering and has implications for archival integration on a larger scale, where the disruptive potential of Linked Open Data is most obvious.



EHRI is a consortium with many partners across numerous countries, and many individuals were involved in the work described herein in addition to the co-authors. In particular, the authors would like to thank Anna Ullrich and Giles Bennett at the Institut für Zeitgeschichte (IfZ) for their efforts in resolving and validating the original holding institutions of USHMM copy collections, Linda Reijnhoudt at DANS for her work on formalising copy-original links, the EHRI teams at Yad Vashem, CEGESOMA and Kazerne Dossin, and the EHRI Data Identification and Integration Work Package in general.


Arie Erez, S., Blanke, T., Bryant, M., Rodriguez, K., Speck, R. and Vanden Daelen, V. (2020), "Record linking in the EHRI portal", Records Management Journal, Vol. 30 No. 3, pp. 363-378.



Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited
