Search results

Open Access
Article
Publication date: 23 May 2023

Kimmo Kettunen, Heikki Keskustalo, Sanna Kumpulainen, Tuula Pääkkönen and Juha Rautiainen

Abstract

Purpose

This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different quality OCR on users' subjective perception through an interactive information retrieval task with a collection of one digitized historical Finnish newspaper.

Design/methodology/approach

This study is based on the simulated work task model used in interactive information retrieval. Thirty-two users searched an article collection of the Finnish newspaper Uusi Suometar (1869–1918), which consists of ca. 1.45 million autosegmented articles. The article search database had two versions of each article with different quality OCR. Each user performed six pre-formulated and six self-formulated short queries and subjectively evaluated the top 10 results using a graded relevance scale of 0–3. Users were not informed about the OCR quality differences between the otherwise identical articles.

Findings

The main result of the study is that improved OCR quality affects subjective user perception of historical newspaper articles positively: higher relevance scores are given to better-quality texts.

Originality/value

To the best of the authors’ knowledge, this simulated interactive work task experiment is the first one showing empirically that users' subjective relevance assessments are affected by a change in the quality of an optically read text.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418

Open Access
Article
Publication date: 31 July 2023

Sara Lafia, David A. Bleckley and J. Trent Alexander

Abstract

Purpose

Many libraries and archives maintain collections of research documents, such as administrative records, with paper-based formats that limit the documents' access to in-person use. Digitization transforms paper-based collections into more accessible and analyzable formats. As collections are digitized, there is an opportunity to incorporate deep learning techniques, such as Document Image Analysis (DIA), into workflows to increase the usability of information extracted from archival documents. This paper describes the authors' approach using digital scanning, optical character recognition (OCR) and deep learning to create a digital archive of administrative records related to the mortgage guarantee program of the Servicemen's Readjustment Act of 1944, also known as the G.I. Bill.

Design/methodology/approach

The authors used a collection of 25,744 semi-structured paper-based records from the administration of G.I. Bill Mortgages from 1946 to 1954 to develop a digitization and processing workflow. These records include the name and city of the mortgagor, the amount of the mortgage, the location of the Reconstruction Finance Corporation agent, one or more identification numbers and the name and location of the bank handling the loan. The authors extracted structured information from these scanned historical records in order to create a tabular data file and link them to other authoritative individual-level data sources.

Findings

The authors compared the flexible character accuracy of five OCR methods. The authors then compared the character error rate (CER) of three text extraction approaches (regular expressions, DIA and named entity recognition (NER)). The authors were able to obtain the highest quality structured text output using DIA with the Layout Parser toolkit by post-processing with regular expressions. Through this project, the authors demonstrate how DIA can improve the digitization of administrative records to automatically produce a structured data resource for researchers and the public.
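As an illustration of the character error rate (CER) mentioned above, the following is a minimal sketch assuming the common definition of CER as the Levenshtein edit distance between the OCR output and a ground-truth transcription, divided by the length of the ground truth. The abstract does not state how the authors computed CER, and the function names and sample strings here are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: one substituted character in a four-character string.
print(character_error_rate("abcd", "abxd"))
```

A lower CER indicates output closer to the ground truth. Because insertions are counted, the value can exceed 1.0 when the OCR output is much longer than the reference.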

Originality/value

The authors' workflow is readily transferable to other archival digitization projects. Through the use of digital scanning, OCR and DIA processes, the authors created the first digital microdata file of administrative records related to the G.I. Bill mortgage guarantee program available to researchers and the general public. These records offer research insights into the lives of veterans who benefited from loans, the impacts on the communities built by the loans and the institutions that implemented them.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418

Open Access
Article
Publication date: 20 February 2024

Alenka Kavčič Čolić and Andreja Hari

Abstract

Purpose

The current predominant delivery format resulting from digitization is PDF, which is not appropriate for the blind, partially sighted and people who read on mobile devices. To meet the needs of both communities, as well as broader ones, alternative file formats are required. With the findings of the eBooks-On-Demand-Network Opening Publications for European Netizens project research, this study aims to improve access to digitized content for these communities.

Design/methodology/approach

In 2022, the authors conducted research on the digitization experiences of 13 EODOPEN partners at their organizations. The authors distributed the same sample of scans in English with different characteristics, and in accordance with Web content accessibility guidelines, the authors created 24 criteria to analyze their digitization workflows, output formats and optical character recognition (OCR) quality.

Findings

In this contribution, the authors present the results of a trial implementation among EODOPEN partners regarding their digitization workflows, used delivery file formats and the resulting quality of OCR results, depending on the type of digitization output file format. It was shown that partners using the OCR tool ABBYY FineReader Professional and producing scanning outputs in tagged PDF and PDF/UA formats achieved better results according to set criteria.

Research limitations/implications

The trial implementations were limited to 13 project partners’ organizations only.

Originality/value

This research paper can be a valuable contribution to the field of massive digitization practices, particularly in terms of improving the accessibility of the output delivery file formats.

Details

Digital Library Perspectives, vol. 40 no. 2
Type: Research Article
ISSN: 2059-5816

Open Access
Article
Publication date: 18 April 2024

Joseph Nockels, Paul Gooding and Melissa Terras

Abstract

Purpose

This paper focuses on image-to-text manuscript processing through Handwritten Text Recognition (HTR), a Machine Learning (ML) approach enabled by Artificial Intelligence (AI). With HTR now achieving high levels of accuracy, we consider its potential impact on our near-future information environment and knowledge of the past.

Design/methodology/approach

In undertaking a more constructivist analysis, we identified gaps in the current literature through a Grounded Theory Method (GTM). This guided an iterative process of concept mapping through writing sprints in workshop settings. We identified, explored and confirmed themes through group discussion and a further interrogation of relevant literature, until reaching saturation.

Findings

Catalogued as part of our GTM, 120 published texts underpin this paper. We found that HTR facilitates accurate transcription and dataset cleaning, while facilitating access to a variety of historical material. HTR contributes to a virtuous cycle of dataset production and can inform the development of online cataloguing. However, current limitations include dependency on digitisation pipelines, potential archival history omission and entrenchment of bias. We also cite near-future HTR considerations. These include encouraging open access, integrating advanced AI processes and metadata extraction; legal and moral issues surrounding copyright and data ethics; crediting individuals’ transcription contributions and HTR’s environmental costs.

Originality/value

Our research produces a set of best practice recommendations for researchers, data providers and memory institutions, surrounding HTR use. This forms an initial, though not comprehensive, blueprint for directing future HTR research. In pursuing this, the narrative that HTR’s speed and efficiency will simply transform scholarship in archives is deconstructed.

Open Access
Article
Publication date: 14 May 2024

Ying Hu and Feng’e Zheng

Abstract

Purpose

The ancient town of Lijiang is a representative place of ethnic minorities in China’s southwest border area jointly built by many ethnic groups. Its rich and diversified history, culture and architecture as well as its artistic and spiritual values need to be better retained and explored.

Design/methodology/approach

The protection and inheritance of Lijiang’s cultural heritage will be improved through the construction of digital memory resources. To guide Lijiang’s digital memory construction, this study explores strategies of digital memory construction by analyzing four case studies of well-known memory projects from China and America.

Findings

From the case studies analysis, factors of digital memory construction were identified and compared. Factors led to the discussion of strategies for constructing the digital memory of Lijiang within its design, construction and service phases.

Originality/value

The ancient town of Lijiang is a famous historical and cultural city in China, and it is also a representative place of ethnic minorities in the border area jointly built by many ethnic groups. The rich culture should be preserved and digitalized to offer better use for the whole nation.

Details

Digital Transformation and Society, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2755-0761

Open Access
Article
Publication date: 21 October 2022

Amber L. Cushing and Giulia Osti

Abstract

Purpose

This study aims to explore the implementation of artificial intelligence (AI) in archival practice by presenting the thoughts and opinions of working archival practitioners. It contributes to the extant literature with a fresh perspective, expanding the discussion on AI adoption by investigating how it influences the perceptions of digital archival expertise.

Design/methodology/approach

In this study, a two-phase data collection consisting of four online focus groups was held to gather the opinions of international archives and digital preservation professionals (n = 16), who participated on a volunteer basis. The qualitative analysis of the transcripts was performed using template analysis, a style of thematic analysis.

Findings

Four main themes were identified: fitting AI into day to day practice; the responsible use of (AI) technology; managing expectations (about AI adoption) and bias associated with the use of AI. The analysis suggests that AI adoption combined with hindsight about digitisation as a disruptive technology might provide archival practitioners with a framework for re-defining, advocating and outlining digital archival expertise.

Research limitations/implications

The volunteer basis of this study meant that the sample was not representative or generalisable.

Originality/value

Although the results of this research are not generalisable, they shed light on the challenges posed by the implementation of AI in archives and faced by the digital curation professionals dealing with this change. The evolution of the characterisation of digital archival expertise is a topic reserved for future research.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418

Open Access
Article
Publication date: 7 August 2023

Tiziano Volpentesta, Esli Spahiu and Pietro De Giovanni

Abstract

Purpose

Digital transformation (DT) is a major challenge for incumbent organisations, as research on this phenomenon has revealed a high failure rate. Given this consideration, this paper reviews the literature on DT in incumbent organisations to identify the main themes and research directions to be undertaken.

Design/methodology/approach

The authors adopt a systematic literature review (SLR) and computational literature review (CLR) employing a machine learning algorithm for topic modelling (LDA) to surface the themes discussed in 103 peer-reviewed studies published between 2010 and 2022 in a multidisciplinary article sample.

Findings

The authors identify and discuss the five main themes emerging from the studies, offering the state-of-the-art of DT in established firms' literature. The authors find that the most discussed topics revolve around the DT of healthcare, the process of renewal and change, the project management, the changes in value performances and capabilities and the consequences on the products of DT. Accordingly, the authors identify the topics overlooked by literature that future studies could tackle, which concern sustainability and contextualisation of the DT phenomenon.

Practical implications

The authors further propose managerial insights which equip managers with a revolutionary mindset that is not constraining but, rather, integration-seeking. DT is not only about technology (Tabrizi et al., 2019). Successful DT initiatives require managerial capabilities that foster a sustainable departure from the current organising logic (Markus, 2004). This study pinpoints and prioritises the role that paradox-informed thinking can have to sustain an effective digital mindset (Eden et al., 2018) that allows for the building of momentum in DT initiatives and facilitates the renewal process. Indeed, managers lagging behind DT could shift from an “either-or” solutions mindset where one pole is preferred over the other (e.g. digital or physical) to embracing a “both-and-with” thinking balancing between poles (e.g. digital and physical) to successfully fuse the digital and the legacy (Lewis and Smith, 2022b; Smith, Lewis and Edmondson, 2022), enact the renewal, and build and maintain momentum for DTs. The outcomes of adopting a paradox mindset in managerial practice are enabling learning and creativity, fostering flexibility and resilience and, finally, unleashing human potential (Lewis and Smith, 2014).

Social implications

The authors propose insights that will equip managers with a mindset that allows DT to fail less often than currently reported rates; such failure may imply potential organisational collapse, financial bankruptcy and social crisis.

Originality/value

The authors offer a multidisciplinary review of the DT complementing existing reviews due to the focus on the organisational context of established organisations. Moreover, the authors advance paradoxical thinking as a novel lens through which to study DT in incumbent organisations by proposing an array of potential research questions and new avenues for research. Finally, the authors offer insights for managers to help them thrive in DT by adopting a paradoxical mindset.

Details

European Journal of Innovation Management, vol. 26 no. 7
Type: Research Article
ISSN: 1460-1060

Open Access
Article
Publication date: 16 December 2022

Sean Bradley Power and Niamh M. Brennan

Abstract

Purpose

Annual general meetings have been variously described as dull rituals for accountability versus entertaining theatre at the expense of accountability. The research analyses director and shareholder participation and dialogic interactions at annual and extraordinary general meetings of Cecil Rhodes' British South Africa Company (BSAC). The BSAC was incorporated under a royal charter in 1889 in return for power to exploit a huge territory, Rhodesia/now Zimbabwe. The BSAC's administration ceased in 1924/25. Thus, the BSAC had a dual mandate as a private for-profit listed company and to occupy and develop the territories on behalf of the British government.

Design/methodology/approach

The article analyses 29 BSAC general meeting minutes, comprising 25 full sets of verbatim minutes between 1895 and 1925. The study adopts manual content analysis. First, the research adopts conversational analysis to analyse director and shareholder turn-taking and moves by approving and dissenting shareholders. Second, the study identifies and analyses incidents of shareholder sentiment from the shareholder turns/moves. Finally, the article assesses how shareholder sentiment changed throughout the period and whether the BSAC's share price reflected the shareholder sentiment.

Findings

The BSAC's general meetings were associated with the greater colonial project of building the British Empire. The authors find almost 1,500 incidents of shareholder sentiment. Directors and shareholders take roughly an equal number of turns (excluding shareholder sentiment). Ritual and ceremony dominate director and shareholder turns and moves, while accountability to shareholders was minimal. The BSAC share price spiked in the early years of the project, waning after that. Shareholder sentiment, both positive and negative, reflects the share price behaviour.

Originality/value

A unique database of verbatim general meeting minutes records shareholders' reactions to what they heard in the form of sounding off through cheering, “hear, hears,” laughter and applause (i.e. shareholder sentiment).

Details

Accounting, Auditing & Accountability Journal, vol. 36 no. 9
Type: Research Article
ISSN: 0951-3574

Open Access
Article
Publication date: 28 July 2023

Harshleen Kaur Duggal, Puja Khatri, Asha Thomas and Marco Pironti

Abstract

Purpose

Massive open online courses (MOOCs), a Taylorist attempt to automate instruction, help make course delivery more efficient, economical and better. As a form of Digital Taylorism Implementation (DTI), MOOCs enable individuals to obtain an occupation-oriented education, equipping them with knowledge and skills needed to stay employable. However, learning through online platforms can induce tremendous amounts of technology-related stress in learners, arising from factors such as the complexity of platforms and fears of redundancy. Thus, the aim of this paper is to study how student perceptions of DTI and technostress (TS) influence their perceived employability (PE). The role of TS as a mediator between DTI and PE has also been studied.

Design/methodology/approach

A stratified sampling technique was used to obtain data from 305 students from 6 universities. The effect of DTI and TS on PE, and the role of TS as a mediator, were examined using the partial least squares (PLS) structural equation modelling approach with SmartPLS 4.0 software. Predictive relevance of the model was studied using PLSpredict.

Findings

Results indicate that TS completely mediates the relationship between DTI and PE. The model has medium predictive relevance.

Practical implications

Learning outcomes from Digitally Taylored programs can be improved with certain reforms that bring the human touch to online learning.

Originality/value

This study extends Taylorism literature by linking DTI to PE of students via technostress as a mediator.

Details

Journal of Management History, vol. 30 no. 2
Type: Research Article
ISSN: 1751-1348

Open Access
Article
Publication date: 30 October 2023

Koraljka Golub, Xu Tan, Ying-Hsang Liu and Jukka Tyrkkö

Abstract

Purpose

This exploratory study aims to help contribute to the understanding of online information search behaviour of PhD students from different humanities fields, with a focus on subject searching.

Design/methodology/approach

The methodology is based on a semi-structured interview within which the participants are asked to conduct both a controlled search task and a free search task. The sample comprises eight PhD students in several humanities disciplines at Linnaeus University, a medium-sized Swedish university from 2020.

Findings

Most humanities PhD students in the study have received training in information searching, but it has been too basic. Most rely on web search engines like Google and Google Scholar to search for publications, and on the university's discovery system for known-item searching. As these systems do not rely on controlled vocabularies, the participants often struggle with too many retrieved documents that are not relevant. Most only rarely or never use disciplinary bibliographic databases. The controlled search task has shown some benefits of using controlled vocabularies in the disciplinary databases, but incomplete synonym or concept coverage as well as user-unfriendly search interfaces present hindrances.

Originality/value

The paper illuminates an often-forgotten but pervasive challenge of subject searching, especially for humanities researchers. It demonstrates difficulties and shows how most PhD students have missed finding an important resource in their research. It calls for the need to reconsider training in information searching and the need to make use of controlled vocabularies implemented in various search systems with usable search and browse user interfaces.
