Search results

1 – 10 of 46

Open Access

Article

Publication date: 23 May 2023

Optical character recognition quality affects subjective user perception of historical newspaper clippings

Kimmo Kettunen, Heikki Keskustalo, Sanna Kumpulainen, Tuula Pääkkönen and Juha Rautiainen

This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different…

HTML

PDF (2.8 MB)

Downloads

656

Abstract

Purpose

This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different quality OCR on users' subjective perception through an interactive information retrieval task with a collection of one digitized historical Finnish newspaper.

Design/methodology/approach

This study is based on the simulated work task model used in interactive information retrieval. Thirty-two users searched an article collection of the Finnish newspaper Uusi Suometar (1869–1918), which consists of ca. 1.45 million auto-segmented articles. The article search database had two versions of each article, each with a different OCR quality. Each user performed six pre-formulated and six self-formulated short queries and subjectively evaluated the top 10 results using a graded relevance scale of 0–3. Users were not informed about the OCR quality differences between the otherwise identical articles.

Findings

The main result of the study is that improved OCR quality affects subjective user perception of historical newspaper articles positively: higher relevance scores are given to better-quality texts.

Originality/value

To the best of the authors’ knowledge, this simulated interactive work task experiment is the first one showing empirically that users' subjective relevance assessments are affected by a change in the quality of an optically read text.

Details

Journal of Documentation, vol. 79 no. 7

Type: Research Article

DOI:

ISSN: 0022-0418

Keywords

Open Access

Article

Publication date: 31 July 2023

Digitizing and parsing semi-structured historical administrative documents from the G.I. Bill mortgage guarantee program

Sara Lafia, David A. Bleckley and J. Trent Alexander

Many libraries and archives maintain collections of research documents, such as administrative records, with paper-based formats that limit the documents' access to in-person use…

HTML

PDF (1.3 MB)

Downloads

612

Abstract

Purpose

Many libraries and archives maintain collections of research documents, such as administrative records, with paper-based formats that limit the documents' access to in-person use. Digitization transforms paper-based collections into more accessible and analyzable formats. As collections are digitized, there is an opportunity to incorporate deep learning techniques, such as Document Image Analysis (DIA), into workflows to increase the usability of information extracted from archival documents. This paper describes the authors' approach using digital scanning, optical character recognition (OCR) and deep learning to create a digital archive of administrative records related to the mortgage guarantee program of the Servicemen's Readjustment Act of 1944, also known as the G.I. Bill.

Design/methodology/approach

The authors used a collection of 25,744 semi-structured paper-based records from the administration of G.I. Bill Mortgages from 1946 to 1954 to develop a digitization and processing workflow. These records include the name and city of the mortgagor, the amount of the mortgage, the location of the Reconstruction Finance Corporation agent, one or more identification numbers and the name and location of the bank handling the loan. The authors extracted structured information from these scanned historical records in order to create a tabular data file and link them to other authoritative individual-level data sources.

Findings

The authors compared the flexible character accuracy of five OCR methods. The authors then compared the character error rate (CER) of three text extraction approaches (regular expressions, DIA and named entity recognition (NER)). The authors were able to obtain the highest quality structured text output using DIA with the Layout Parser toolkit by post-processing with regular expressions. Through this project, the authors demonstrate how DIA can improve the digitization of administrative records to automatically produce a structured data resource for researchers and the public.
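As a rough illustration of the character error rate mentioned above, the sketch below uses the common definition: the Levenshtein edit distance between a reference transcription and the OCR output, divided by the reference length. The abstract does not state how the authors computed CER, so the function names and the sample strings here are hypothetical, not the authors' code or data.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth and OCR output, for illustration only.
    print(character_error_rate("mortgage", "mortgaqe"))
```

A lower value means the extracted text is closer to the ground truth; a value of 0 means an exact match.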

Originality/value

The authors' workflow is readily transferable to other archival digitization projects. Through the use of digital scanning, OCR and DIA processes, the authors created the first digital microdata file of administrative records related to the G.I. Bill mortgage guarantee program available to researchers and the general public. These records offer research insights into the lives of veterans who benefited from loans, the impacts on the communities built by the loans and the institutions that implemented them.

Details

Journal of Documentation, vol. 79 no. 7

Type: Research Article

DOI:

ISSN: 0022-0418

Keywords

View access options

Article

Publication date: 10 April 2023

metaGraphos: a Web-based system for transcribing, proofreading and publishing scanned documents

Evagelos Varthis and Marios Poulos

This study aims to present metaGraphos, a crowdsourcing system that aids in the transcription and semantic enhancement of scanned documents by using a pool of volunteers or people…

HTML

PDF (1.2 MB)

Downloads

209

Abstract

Purpose

This study aims to present metaGraphos, a crowdsourcing system that aids in the transcription and semantic enhancement of scanned documents by using a pool of volunteers or people willing to participate in exchange for a financial reward.

Design/methodology/approach

metaGraphos can be used where optical character recognition fails to produce satisfactory results, where semantic tagging or assigning thematic headings to texts is considered necessary, or even when ground-truth data has to be collected in raw form.

Findings

The system automatically provides a Web-based interface comprising a static HTML page and JavaScript code that displays the scanned images of the document, coupled with the corresponding incomplete texts side by side, allowing users to correct or complete the texts in parallel.

Social implications

By assisting the parallel transcription and the semantic enhancement of difficult scanned documents, the system further reveals the hidden cultural wealth and aids in knowledge dissemination, a fact that contributes significantly to the academic-scientific dialog and feedback.

Originality/value

Individual researchers, libraries and organizations in general may benefit from the system because it is a cost-effective, practical and simple-to-set-up client–server architecture that provides a reliable way to transcribe texts or revise transcriptions on a large scale.

Details

Collection and Curation, vol. 42 no. 4

Type: Research Article

DOI:

ISSN: 2514-9326

Keywords

Open Access

Article

Publication date: 18 April 2024

The implications of handwritten text recognition for accessing the past at scale

Joseph Nockels, Paul Gooding and Melissa Terras

This paper focuses on image-to-text manuscript processing through Handwritten Text Recognition (HTR), a Machine Learning (ML) approach enabled by Artificial Intelligence (AI)…

HTML

PDF (230 KB)

Downloads

341

Abstract

Purpose

This paper focuses on image-to-text manuscript processing through Handwritten Text Recognition (HTR), a Machine Learning (ML) approach enabled by Artificial Intelligence (AI). With HTR now achieving high levels of accuracy, we consider its potential impact on our near-future information environment and knowledge of the past.

Design/methodology/approach

In undertaking a more constructivist analysis, we identified gaps in the current literature through a Grounded Theory Method (GTM). This guided an iterative process of concept mapping through writing sprints in workshop settings. We identified, explored and confirmed themes through group discussion and a further interrogation of relevant literature, until reaching saturation.

Findings

Catalogued as part of our GTM, 120 published texts underpin this paper. We found that HTR facilitates accurate transcription and dataset cleaning, while facilitating access to a variety of historical material. HTR contributes to a virtuous cycle of dataset production and can inform the development of online cataloguing. However, current limitations include dependency on digitisation pipelines, potential archival history omission and entrenchment of bias. We also cite near-future HTR considerations. These include encouraging open access, integrating advanced AI processes and metadata extraction; legal and moral issues surrounding copyright and data ethics; crediting individuals’ transcription contributions and HTR’s environmental costs.

Originality/value

Our research produces a set of best practice recommendations for researchers, data providers and memory institutions, surrounding HTR use. This forms an initial, though not comprehensive, blueprint for directing future HTR research. In pursuing this, the narrative that HTR’s speed and efficiency will simply transform scholarship in archives is deconstructed.

Details

Journal of Documentation, vol. 80 no. 7

Type: Research Article

DOI:

ISSN: 0022-0418

Keywords

View access options

Article

Publication date: 10 April 2023

Information consolidation and repackaging for augmented reality library service: a special reference to the Layar app

Santosh Abaji Kharat, Shubhada Nagarkar and Bhausaheb Panage

The purpose of this research is to introduce the Layar augmented reality (AR) application among library users and to understand the user’s satisfaction towards the information…

HTML

PDF (983 KB)

Downloads

262

Abstract

Purpose

The purpose of this research is to introduce the Layar augmented reality (AR) application among library users and to understand the user’s satisfaction towards the information services provided by the Layar application with the help of the structural equation model (SEM).

Design/methodology/approach

According to Thomas (2016), action research is mainly undertaken to develop new skills or new approaches and to solve issues and problems with direct application to an applied setting. The present study helps to develop new skills and approaches to repackaging information using AR applications. The researchers asked what could be done to increase awareness of the Layar AR application among students, because the Layar application is one of the new tools with which an academic library can repackage information for mass accessibility. The present action research approach therefore encompasses two activities: action and research. The researchers used participatory action research methods, collecting data from 17 MBA institute libraries affiliated with Savitribai Phule Pune University. They used the Layar application systematically in each library after obtaining permission from the relevant higher authority. They designed a Layar satisfaction model using SEM with AMOS and SPSS.

Findings

The researchers found that the relationship between experience, performance and service quality is positively significant. Users are satisfied with their experience of the Layar application, but not with its service quality and performance.

Research limitations/implications

This study tested Layar AR application in MBA libraries affiliated with Savitribai Phule Pune University in the Pune and Pimpri Chinchwad areas.

Practical implications

The Layar app helps the academic library to convert selected print collections into an AR experience for library users. This is an additional method of providing information services to users through mobile devices. A total of 157 students downloaded the Layar application to their handsets and provided feedback through a questionnaire. The researchers found that the relationships between users and Layar experience, performance and service quality are positively significant. Users are satisfied with their experience of the Layar application, but not with its service quality and performance.

Originality/value

This study examined the performance, service quality and user experience of the Layar application. Structural equation modelling was used to examine the relationship between user satisfaction and information services using the Layar application.

Details

Information Discovery and Delivery, vol. 52 no. 1

Type: Research Article

DOI:

ISSN: 2398-6247

Keywords

View access options

Article

Publication date: 11 September 2023

Multidimensional knowledge discovery of cultural relics resources in the Tang tomb mural category

Ying Gao, Qiang Zhang, Xiaoran Wang, Yanmei Huang, Fanshuang Meng and Wan Tao

Currently, the Tang tomb mural cultural relic resources are presented in a multi-source and heterogeneous manner, with a lack of effective organization and sharing between…

HTML

PDF (2.3 MB)

Downloads

197

Abstract

Purpose

Currently, the Tang tomb mural cultural relic resources are presented in a multi-source and heterogeneous manner, with a lack of effective organization and sharing between resources. Therefore, this study aims to propose a multidimensional knowledge discovery solution for Tang tomb mural cultural relic resources.

Design/methodology/approach

Taking the Tang tomb murals collected by the Shaanxi History Museum as an example, and after clarifying the relevant concepts of Tang tomb mural resources and considering both dynamic and static dimensions, a top-down approach was adopted to first construct an ontology model of Tang tomb mural cultural relics resources. The actual case data was then imported into the Neo4J graph database according to the defined pattern hierarchy to complete the static organization of knowledge, and presented in a multimodal form in knowledge reasoning and retrieval. In addition, geographic information system (GIS) technology was used to dynamically display the spatiotemporal distribution of Tang tomb mural resources, and the distribution trend was analysed from a digital humanities perspective.

Findings

The multi-dimensional knowledge discovery of Tang tomb mural cultural relics resources can help establish the correlation and spatiotemporal relationship between resources, providing support for semantic retrieval and navigation, knowledge discovery and visualization and so on.

Originality/value

This study takes the murals in the collection of the Shaanxi History Museum as an example. It reveals potential knowledge associations in a static and intelligent way, achieves knowledge discovery and management of Tang tomb murals and dynamically presents their spatial distribution through GIS technology, meeting the knowledge presentation needs of different users and opening up new ideas for the study of Tang tomb murals.

Details

The Electronic Library, vol. 42 no. 1

Type: Research Article

DOI:

ISSN: 0264-0473

Keywords

View access options

Article

Publication date: 16 February 2022

Artificial Intelligence Adoption in the Post COVID-19 New-Normal and Role of Smart Technologies in Transforming Business: a Review

Pragati Agarwal, Sanjeev Swami and Sunita Kumari Malhotra

The purpose of this paper is to give an overview of artificial intelligence (AI) and other AI-enabled technologies and to describe how COVID-19 affects various industries such as…

HTML

PDF (803 KB)

Downloads

3605

Abstract

Purpose

The purpose of this paper is to give an overview of artificial intelligence (AI) and other AI-enabled technologies and to describe how COVID-19 affects various industries such as health care, manufacturing, retail, food services, education, media and entertainment, banking and insurance, travel and tourism. Furthermore, the authors discuss the tactics in which information technology is used to implement business strategies to transform businesses and to incentivise the implementation of these technologies in current or future emergency situations.

Design/methodology/approach

The review covers the rapidly growing literature on the use of smart technology during the current COVID-19 pandemic.

Findings

The 127 empirical articles the authors have identified suggest that 39 forms of smart technologies have been used, ranging from artificial intelligence to computer vision technology. Eight different industries have been identified that are using these technologies, primarily food services and manufacturing. Further, the authors list 40 generalised types of activities that are involved including providing health services, data analysis and communication. To prevent the spread of illness, robots with artificial intelligence are being used to examine patients and give drugs to them. The online execution of teaching practices and simulators have replaced the classroom mode of teaching due to the epidemic. The AI-based Blue-dot algorithm aids in the detection of early warning indications. The AI model detects a patient in respiratory distress based on face detection, face recognition, facial action unit detection, expression recognition, posture, extremity movement analysis, visitation frequency detection, sound pressure detection and light level detection. The above and various other applications are listed throughout the paper.

Research limitations/implications

Research is largely delimited to the area of COVID-19-related studies. Also, bias of selective assessment may be present. In the Indian context, advanced technology is yet to be harnessed to its full extent. Also, the educational system is yet to be upgraded to deliver the potential benefits of these technologies on a wider basis.

Practical implications

First, leveraging insights and smart technology across various industry sectors to battle the global threat is one of the key takeaways in this field. Second, an integrated framework is recommended for policy making in this area. Lastly, the authors recommend that an internet-based repository should be developed, keeping all the ideas, databases, best practices, dashboards and real-time statistical data.

Originality/value

As COVID-19 is a relatively recent phenomenon, such a comprehensive review does not exist in the extant literature to the best of the authors’ knowledge. The review covers the rapidly emerging literature on smart technology use during the current COVID-19 pandemic.

Details

Journal of Science and Technology Policy Management, vol. 15 no. 3

Type: Research Article

DOI:

ISSN: 2053-4620

Keywords

View access options

Article

Publication date: 24 November 2023

The digital transformation processes for supporting digital humanities researchers in text analysis

Ernesto William De Luca, Francesca Fallucchi, Bouchra Ghattas and Riem Spielhaus

This article aims to explore how the mapping strategies between user requirements expressed by the humanities researchers lead to a better customization of user-driven digital…

HTML

PDF (1.7 MB)

Downloads

121

Abstract

Purpose

This article aims to explore how the mapping strategies between user requirements expressed by the humanities researchers lead to a better customization of user-driven digital humanities tools and to the creation of innovative functionalities, which can directly affect the way of doing research in a digital context.

Design/methodology/approach

It describes the user-driven development of a tool that helps researchers in the quantitative and qualitative analysis of large textbook collections.

Findings

This article presents an exemplary user journey map, which shows the different steps of the digital transformation process and how the humanities researchers are involved for (1) producing innovative research solutions, comprehensive and personalized reports, and (2) customizing access to content data used for the analysis of digital documents. The article is based on a case study on a German textbooks collection and content analysis functionalities.

Originality/value

The focus of this article is the reiterative research process, in which the humanist (from the human-centred point of view) starts from an initial research question, uses quantitative and qualitative data and develops both the research question and the answers to it, with the aim of finding patterns in the content and structure of educational media. Thus, from the viewpoint of digital transformation, the humanist is part of the interaction between digitization and digitalization processes, where he/she uses digital data, metadata, reports and findings created and supported by the digital tools for research analysis.

Details

Journal of Documentation, vol. 80 no. 2

Type: Research Article

DOI:

ISSN: 0022-0418

Keywords

View access options

Article

Publication date: 24 April 2023

Building AgriRef: an AI embedded virtual reference app for farmers

Priya Garg and Shivarama Rao K.

This paper aims to discuss the process of building a 24×7 reference platform for facilitating the farmers with the easy access of information at any time from any location. It…

HTML

PDF (1.4 MB)

Downloads

199

Abstract

Purpose

This paper aims to discuss the process of building a 24×7 reference platform that gives farmers easy access to information at any time from any location. It takes a text string as input and processes it to respond to the user with the desired result.

Design/methodology/approach

An interactive Web-based chatbot named AgriRef was developed using the free version of Dialogflow. The intents were defined based on the conversation flow diagram. Furthermore, the application was integrated with a website on a local server and with the Telegram application.

Findings

With this chatbot application, farmers will be able to get answers to their queries. It provides a human-like conversational interface to farmers. It will also be useful for librarians of agricultural libraries to save time in answering common queries.

Originality/value

This paper describes the various steps involved in developing the chatbot application using Dialogflow.

Details

Library Hi Tech News, vol. 41 no. 2

Type: Research Article

DOI:

ISSN: 0741-9058

Keywords

Open Access

Article

Publication date: 23 October 2023

European approach to remote customer onboarding solutions

Daniel Cookman

This study aims to identify European positioning on the use of remote customer onboarding solutions in combating financial crime.

HTML

PDF (144 KB)

Downloads

607

Abstract

Purpose

This study aims to identify European positioning on the use of remote customer onboarding solutions in combating financial crime.

Design/methodology/approach

This study is desktop research that examines European Banking Authority (EBA) policy statements relating to the use of innovative solutions in combating financial crime.

Findings

Technological advancements in biometric data and software tools provide a unique opportunity to address potential deficiencies in paper-based customer onboarding processes. Electronic remote customer onboarding solutions equip credit and financial institutions and investment firms with an alternative FTE cost-saving solution in their pursuit of revenue generation. Whilst the EBA and the Financial Action Task Force have approved the use of innovative solutions and AML technologies in combating financial crime, hesitancy remains about the ability of credit and financial institutions to use technological solutions as a “magic solution” for preventing the materialisation of money laundering/terrorist financing-related risks. Analysis of policy suggests a gravitation towards increased use of these technologies in the interim.

Originality/value

Capitalisation of European banking authority.

Details

Journal of Money Laundering Control, vol. 26 no. 7

Type: Research Article

DOI:

ISSN: 1368-5201

Keywords
