Search results

1 – 5 of 5
Content available
Book part
Publication date: 24 June 2024

Noel Scott, Brent Moyle, Ana Cláudia Campos, Liubov Skavronskaya and Biqiang Liu

Details

Cognitive Psychology and Tourism
Type: Book
ISBN: 978-1-80262-579-0

Content available
Book part
Publication date: 16 January 2024

Yinying Wang

Details

Leaders’ Decision Making and Neuroscience
Type: Book
ISBN: 978-1-83797-387-3

Content available
Article
Publication date: 11 February 2014

Dr Bernadette Whelan

Details

Journal of Historical Research in Marketing, vol. 6 no. 1
Type: Research Article
ISSN: 1755-750X

Content available
Book part
Publication date: 22 August 2019

Brett Lashua

Details

Popular Music, Popular Myth and Cultural Heritage in Cleveland: The Moondog, The Buzzard, and the Battle for the Rock and Roll Hall of Fame
Type: Book
ISBN: 978-1-78769-156-8

Open Access
Article
Publication date: 31 July 2023

Sara Lafia, David A. Bleckley and J. Trent Alexander

Abstract

Purpose

Many libraries and archives maintain collections of research documents, such as administrative records, with paper-based formats that limit the documents' access to in-person use. Digitization transforms paper-based collections into more accessible and analyzable formats. As collections are digitized, there is an opportunity to incorporate deep learning techniques, such as Document Image Analysis (DIA), into workflows to increase the usability of information extracted from archival documents. This paper describes the authors' approach using digital scanning, optical character recognition (OCR) and deep learning to create a digital archive of administrative records related to the mortgage guarantee program of the Servicemen's Readjustment Act of 1944, also known as the G.I. Bill.

Design/methodology/approach

The authors used a collection of 25,744 semi-structured paper-based records from the administration of G.I. Bill mortgages from 1946 to 1954 to develop a digitization and processing workflow. These records include the name and city of the mortgagor, the amount of the mortgage, the location of the Reconstruction Finance Corporation agent, one or more identification numbers and the name and location of the bank handling the loan. The authors extracted structured information from these scanned historical records in order to create a tabular data file and link the records to other authoritative individual-level data sources.

Findings

The authors compared the flexible character accuracy of five OCR methods. The authors then compared the character error rate (CER) of three text extraction approaches: regular expressions, DIA and named entity recognition (NER). The authors obtained the highest quality structured text output using DIA with the Layout Parser toolkit, followed by post-processing with regular expressions. Through this project, the authors demonstrate how DIA can improve the digitization of administrative records to automatically produce a structured data resource for researchers and the public.
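As an illustration of the CER metric mentioned above, the following is a minimal sketch that assumes the common definition of CER as the Levenshtein edit distance between the extracted text and a ground-truth transcription, divided by the length of the ground truth. The function names and the sample strings are hypothetical and are not taken from the authors' workflow, which may compute the metric differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: an OCR engine misreads two characters of a name.
ground_truth = "JOHN SMITH"
ocr_output = "J0HN SM1TH"
cer = character_error_rate(ground_truth, ocr_output)
```

A lower CER indicates output closer to the ground truth, so under this definition the extraction approach with the smallest CER would be preferred.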

Originality/value

The authors' workflow is readily transferable to other archival digitization projects. Through the use of digital scanning, OCR and DIA processes, the authors created the first digital microdata file of administrative records related to the G.I. Bill mortgage guarantee program available to researchers and the general public. These records offer research insights into the lives of veterans who benefited from loans, the impacts on the communities built by the loans and the institutions that implemented them.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418
