Search results
1 – 10 of 46
Kimmo Kettunen, Heikki Keskustalo, Sanna Kumpulainen, Tuula Pääkkönen and Juha Rautiainen
Abstract
Purpose
This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different quality OCR on users' subjective perception through an interactive information retrieval task with a collection of one digitized historical Finnish newspaper.
Design/methodology/approach
This study is based on the simulated work task model used in interactive information retrieval. Thirty-two users searched an article collection of the Finnish newspaper Uusi Suometar (1869–1918), which consists of ca. 1.45 million auto-segmented articles. The article search database had two versions of each article with different quality OCR. Each user performed six pre-formulated and six self-formulated short queries and subjectively evaluated the top 10 results using a graded relevance scale of 0–3. Users were not informed about the OCR quality differences between the otherwise identical articles.
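As a rough illustration of how graded judgments of this kind could be summarized, the sketch below averages 0–3 relevance scores per OCR version. The function name, the version labels and the sample judgments are assumptions made for illustration; the abstract does not describe the authors' actual analysis, and none of the numbers below come from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_relevance_by_version(judgments):
    """Average graded relevance (0-3) per OCR version.

    `judgments` is an iterable of (ocr_version, score) pairs.
    """
    scores = defaultdict(list)
    for version, score in judgments:
        if score not in (0, 1, 2, 3):
            raise ValueError(f"score outside the 0-3 scale: {score}")
        scores[version].append(score)
    return {version: mean(values) for version, values in scores.items()}


# Made-up judgments for illustration only; not data from the study.
example = [("old_ocr", 1), ("old_ocr", 2), ("new_ocr", 2), ("new_ocr", 3)]
print(mean_relevance_by_version(example))
```

A comparison of the two averages would still need a significance test before any conclusion could be drawn.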
Findings
The main result of the study is that improved OCR quality affects subjective user perception of historical newspaper articles positively: higher relevance scores are given to better-quality texts.
Originality/value
To the best of the authors’ knowledge, this simulated interactive work task experiment is the first one showing empirically that users' subjective relevance assessments are affected by a change in the quality of an optically read text.
Sara Lafia, David A. Bleckley and J. Trent Alexander
Abstract
Purpose
Many libraries and archives maintain collections of research documents, such as administrative records, with paper-based formats that limit the documents' access to in-person use. Digitization transforms paper-based collections into more accessible and analyzable formats. As collections are digitized, there is an opportunity to incorporate deep learning techniques, such as Document Image Analysis (DIA), into workflows to increase the usability of information extracted from archival documents. This paper describes the authors' approach using digital scanning, optical character recognition (OCR) and deep learning to create a digital archive of administrative records related to the mortgage guarantee program of the Servicemen's Readjustment Act of 1944, also known as the G.I. Bill.
Design/methodology/approach
The authors used a collection of 25,744 semi-structured paper-based records from the administration of G.I. Bill Mortgages from 1946 to 1954 to develop a digitization and processing workflow. These records include the name and city of the mortgagor, the amount of the mortgage, the location of the Reconstruction Finance Corporation agent, one or more identification numbers and the name and location of the bank handling the loan. The authors extracted structured information from these scanned historical records in order to create a tabular data file and link them to other authoritative individual-level data sources.
Findings
The authors compared the flexible character accuracy of five OCR methods. The authors then compared the character error rate (CER) of three text extraction approaches (regular expressions, DIA and named entity recognition (NER)). The authors were able to obtain the highest quality structured text output using DIA with the Layout Parser toolkit by post-processing with regular expressions. Through this project, the authors demonstrate how DIA can improve the digitization of administrative records to automatically produce a structured data resource for researchers and the public.
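For readers unfamiliar with the metric, character error rate is commonly defined as the character-level edit distance between a reference transcription and the extracted text, divided by the length of the reference. The sketch below implements that common definition; it is an assumption that the authors computed CER this way, and it does not implement the separate "flexible character accuracy" measure.

```python
def levenshtein(reference, hypothesis):
    """Minimum number of character insertions, deletions and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

Because the distance is normalized by the reference length, a hypothesis with many spurious insertions can yield a CER above 1.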
Originality/value
The authors' workflow is readily transferable to other archival digitization projects. Through the use of digital scanning, OCR and DIA processes, the authors created the first digital microdata file of administrative records related to the G.I. Bill mortgage guarantee program available to researchers and the general public. These records offer research insights into the lives of veterans who benefited from loans, the impacts on the communities built by the loans and the institutions that implemented them.
Evagelos Varthis and Marios Poulos
Abstract
Purpose
This study aims to present metaGraphos, a crowdsourcing system that aids in the transcription and semantic enhancement of scanned documents by using a pool of volunteers or people willing to participate in exchange for a financial reward.
Design/methodology/approach
metaGraphos can be used in circumstances where optical character recognition fails to produce satisfactory results, where semantic tagging or the assignment of thematic headings to texts is considered necessary, or where ground-truth data has to be collected in raw form.
Findings
The system automatically provides a Web-based interface comprising a static HTML page and JavaScript code that displays the scanned images of the document, coupled with the corresponding incomplete texts side by side, allowing users to correct or complete the texts in parallel.
Social implications
By assisting the parallel transcription and the semantic enhancement of difficult scanned documents, the system further reveals the hidden cultural wealth and aids in knowledge dissemination, a fact that contributes significantly to the academic-scientific dialog and feedback.
Originality/value
Individual researchers, libraries and organizations in general may benefit from the system because it is a cost-effective, practical and simple-to-set-up client–server architecture that provides a reliable way to transcribe texts or revise transcriptions on a large scale.
Joseph Nockels, Paul Gooding and Melissa Terras
Abstract
Purpose
This paper focuses on image-to-text manuscript processing through Handwritten Text Recognition (HTR), a Machine Learning (ML) approach enabled by Artificial Intelligence (AI). With HTR now achieving high levels of accuracy, we consider its potential impact on our near-future information environment and knowledge of the past.
Design/methodology/approach
In undertaking a more constructivist analysis, we identified gaps in the current literature through a Grounded Theory Method (GTM). This guided an iterative process of concept mapping through writing sprints in workshop settings. We identified, explored and confirmed themes through group discussion and a further interrogation of relevant literature, until reaching saturation.
Findings
Catalogued as part of our GTM, 120 published texts underpin this paper. We found that HTR facilitates accurate transcription and dataset cleaning, while facilitating access to a variety of historical material. HTR contributes to a virtuous cycle of dataset production and can inform the development of online cataloguing. However, current limitations include dependency on digitisation pipelines, potential archival history omission and entrenchment of bias. We also cite near-future HTR considerations. These include encouraging open access, integrating advanced AI processes and metadata extraction; legal and moral issues surrounding copyright and data ethics; crediting individuals’ transcription contributions and HTR’s environmental costs.
Originality/value
Our research produces a set of best practice recommendations for researchers, data providers and memory institutions, surrounding HTR use. This forms an initial, though not comprehensive, blueprint for directing future HTR research. In pursuing this, the narrative that HTR’s speed and efficiency will simply transform scholarship in archives is deconstructed.
Santosh Abaji Kharat, Shubhada Nagarkar and Bhausaheb Panage
Abstract
Purpose
The purpose of this research is to introduce the Layar augmented reality (AR) application among library users and to understand user satisfaction with the information services provided by the Layar application with the help of the structural equation model (SEM).
Design/methodology/approach
According to Thomas (2016), action research is mainly undertaken to develop new skills or new approaches and to solve issues and problems with direct application to an applied setting. The present study helps to develop new skills and approaches to repackaging information using AR applications. The researchers asked what could be done to increase awareness of the Layar AR application among students, because the Layar application is one of the new tools with which an academic library can repackage information for mass accessibility. The present action research approach therefore encompasses two activities: action and research. The researchers used participatory action research methods, collecting data from 17 MBA institute libraries affiliated with Savitribai Phule Pune University. They systematically used the Layar application in each library after obtaining permission from the relevant higher authority. They designed a Layar satisfaction model using SEM with AMOS and SPSS.
Findings
The researchers found that the relationship between experience, performance and service quality is positively significant. Users are satisfied with their experience of the Layar application but not with its service quality and performance.
Research limitations/implications
This study tested the Layar AR application in MBA libraries affiliated with Savitribai Phule Pune University in the Pune and Pimpri Chinchwad areas.
Practical implications
The Layar app helps the academic library to give selected print collections an AR feel for library users. This is an additional method of providing information services to users through mobile devices. A total of 157 students downloaded the Layar application to their handsets and provided feedback through a questionnaire. The researchers found that the relationships between users and Layar experience, performance and service quality are positively significant. Users are satisfied with their experience of the Layar application but not with its service quality and performance.
Originality/value
This study examined the performance, service quality and user experience of the Layar application. Structural equation modelling was used to examine the relationship between user satisfaction and information services using the Layar application.
Ying Gao, Qiang Zhang, Xiaoran Wang, Yanmei Huang, Fanshuang Meng and Wan Tao
Abstract
Purpose
Currently, the Tang tomb mural cultural relic resources are presented in a multi-source and heterogeneous manner, with a lack of effective organization and sharing between resources. Therefore, this study aims to propose a multidimensional knowledge discovery solution for Tang tomb mural cultural relic resources.
Design/methodology/approach
Taking the Tang tomb murals collected by the Shaanxi History Museum as an example, the study first clarifies the relevant concepts of Tang tomb mural resources and, considering both dynamic and static dimensions, adopts a top-down approach to construct an ontology model of Tang tomb mural cultural relic resources. The actual case data were then imported into the Neo4j graph database according to the defined pattern hierarchy to complete the static organization of knowledge, which is presented in multimodal form for knowledge reasoning and retrieval. In addition, geographic information system (GIS) technology is used to dynamically display the spatiotemporal distribution of Tang tomb mural resources, and the distribution trend is analysed from a digital humanities perspective.
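One way to picture the import step is to turn ontology-conformant triples into Cypher MERGE statements, which create nodes and relationships only if they do not already exist. The sketch below only builds the query text and its parameters; the labels (`Mural`, `Tomb`), the relationship type (`FOUND_IN`) and the sample names are hypothetical and are not taken from the authors' ontology, and the abstract does not say how the data were actually loaded.

```python
import re

# Labels and relationship types are restricted to plain identifiers here.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def merge_statement(subject, relation, obj):
    """Build a parameterized Cypher MERGE for one (subject, relation, object) triple.

    `subject` and `obj` are (label, name) pairs.
    """
    (s_label, s_name), (o_label, o_name) = subject, obj
    for identifier in (s_label, relation, o_label):
        # Labels and relationship types cannot be passed as query
        # parameters, so they are validated before being interpolated.
        if not IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"unsafe identifier: {identifier!r}")
    query = (
        f"MERGE (s:{s_label} {{name: $s_name}}) "
        f"MERGE (o:{o_label} {{name: $o_name}}) "
        f"MERGE (s)-[:{relation}]->(o)"
    )
    return query, {"s_name": s_name, "o_name": o_name}


# Hypothetical triple for illustration only.
query, params = merge_statement(("Mural", "Example mural"), "FOUND_IN", ("Tomb", "Example tomb"))
print(query)
print(params)
```

The returned query and parameter dictionary could then be passed to whichever Neo4j client is in use.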
Findings
The multi-dimensional knowledge discovery of Tang tomb mural cultural relics resources can help establish the correlation and spatiotemporal relationship between resources, providing support for semantic retrieval and navigation, knowledge discovery and visualization and so on.
Originality/value
This study takes the murals in the collection of the Shaanxi History Museum as an example. It reveals potential knowledge associations in a static and intelligent way, achieves knowledge discovery and management of Tang tomb murals, and dynamically presents the spatial distribution of Tang tomb murals through GIS technology, meeting the knowledge presentation needs of different users and opening up new ideas for the study of Tang tomb murals.
Pragati Agarwal, Sanjeev Swami and Sunita Kumari Malhotra
Abstract
Purpose
The purpose of this paper is to give an overview of artificial intelligence (AI) and other AI-enabled technologies and to describe how COVID-19 affects various industries such as health care, manufacturing, retail, food services, education, media and entertainment, banking and insurance, travel and tourism. Furthermore, the authors discuss the tactics in which information technology is used to implement business strategies to transform businesses and to incentivise the implementation of these technologies in current or future emergency situations.
Design/methodology/approach
The review covers the rapidly growing literature on the use of smart technology during the current COVID-19 pandemic.
Findings
The 127 empirical articles the authors have identified suggest that 39 forms of smart technologies have been used, ranging from artificial intelligence to computer vision technology. Eight different industries have been identified that are using these technologies, primarily food services and manufacturing. Further, the authors list 40 generalised types of activities that are involved including providing health services, data analysis and communication. To prevent the spread of illness, robots with artificial intelligence are being used to examine patients and give drugs to them. The online execution of teaching practices and simulators have replaced the classroom mode of teaching due to the epidemic. The AI-based Blue-dot algorithm aids in the detection of early warning indications. The AI model detects a patient in respiratory distress based on face detection, face recognition, facial action unit detection, expression recognition, posture, extremity movement analysis, visitation frequency detection, sound pressure detection and light level detection. The above and various other applications are listed throughout the paper.
Research limitations/implications
Research is largely delimited to the area of COVID-19-related studies. Also, bias of selective assessment may be present. In the Indian context, advanced technology is yet to be harnessed to its full extent. Also, the educational system is yet to be upgraded to realise these technologies' potential benefits on a wider basis.
Practical implications
First, leveraging insights across various industry sectors and smart technology to battle the global threat is one of the key takeaways in this field. Second, an integrated framework is recommended for policy making in this area. Lastly, the authors recommend that an internet-based repository should be developed, keeping all the ideas, databases, best practices, dashboards and real-time statistical data.
Originality/value
As COVID-19 is a relatively recent phenomenon, such a comprehensive review does not exist in the extant literature to the best of the authors’ knowledge. The review covers the rapidly emerging literature on smart technology use during the current COVID-19 pandemic.
Ernesto William De Luca, Francesca Fallucchi, Bouchra Ghattas and Riem Spielhaus
Abstract
Purpose
This article aims to explore how the mapping strategies between user requirements expressed by the humanities researchers lead to a better customization of user-driven digital humanities tools and to the creation of innovative functionalities, which can directly affect the way of doing research in a digital context.
Design/methodology/approach
It describes the user-driven development of a tool that helps researchers in the quantitative and qualitative analysis of large textbook collections.
Findings
This article presents an exemplary user journey map, which shows the different steps of the digital transformation process and how the humanities researchers are involved for (1) producing innovative research solutions, comprehensive and personalized reports, and (2) customizing access to content data used for the analysis of digital documents. The article is based on a case study on a German textbooks collection and content analysis functionalities.
Originality/value
The focus of this article is the reiterative research process, in which the humanist (from the human-centred point of view) starts from an initial research question, uses quantitative and qualitative data and develops both the research question and the answers to it with the aim of finding patterns in the content and structure of educational media. Thus, from the viewpoint of digital transformation, the humanist is part of the interaction between digitization and digitalization processes, where he/she uses digital data, metadata, reports and findings created and supported by the digital tools for research analysis.
Priya Garg and Shivarama Rao K.
Abstract
Purpose
This paper aims to discuss the process of building a 24×7 reference platform that gives farmers easy access to information at any time from any location. It takes a text string as input and processes it to respond to the user with the desired result.
Design/methodology/approach
An interactive Web-based chatbot named AgriRef was developed using the free version of Dialogflow. The intents were defined based on the conversation flow diagram. Furthermore, the application was integrated with a website on a local server and with the Telegram application.
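The idea of an intent, a named group of user phrasings mapped to a response, can be illustrated with a toy matcher. The sketch below uses plain keyword overlap as a stand-in; it is not Dialogflow's matching method and does not call the Dialogflow API, and the intent names, keywords and replies are hypothetical rather than taken from AgriRef.

```python
import re

# Hypothetical intents; names, keywords and replies are illustrative only.
INTENTS = {
    "crop_information": (
        {"crop", "wheat", "rice", "sowing"},
        "Here is some information about that crop.",
    ),
    "library_services": (
        {"library", "book", "journal", "membership"},
        "Here is some information about library services.",
    ),
}
FALLBACK = "Sorry, I did not understand. Please rephrase your question."


def reply(text):
    """Return the reply of the intent sharing the most keywords with `text`."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best_name, best_overlap = None, 0
    for name, (keywords, _) in INTENTS.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    if best_name is None:
        # No keyword matched, so fall back to a default response.
        return FALLBACK
    return INTENTS[best_name][1]


print(reply("When is the sowing season for wheat?"))
```

A production chatbot platform would replace the keyword overlap with trained language understanding, but the mapping from input text to intent to response follows the same shape.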
Findings
With this chatbot application, farmers will be able to get answers to their queries. It provides a human-like conversational interface to the farmers. It will also be useful for librarians of agricultural libraries, saving them time in answering common queries.
Originality/value
This paper describes the various steps involved in developing the chatbot application using Dialogflow.
This study aims to identify European positioning on the use of remote customer onboarding solutions in combating financial crime.
Abstract
Purpose
This study aims to identify European positioning on the use of remote customer onboarding solutions in combating financial crime.
Design/methodology/approach
This study is a piece of desktop research that examines European Banking Authority (EBA) policy statements relating to the use of innovative solutions in combating financial crime.
Findings
Technological advancements in biometric data and software tools provide a unique opportunity to address potential paper customer onboarding process deficiencies. Electronic remote customer onboarding solutions equip credit, financial institutions and investment firms with an alternative FTE cost-saving solution, in their pursuit of revenue generation. Whilst the EBA and Financial Action Task Force have provided approval for the utilisation of innovative solutions and AML technologies in combatting financial crime, hesitancy remains on the ability of credit and financial institutions to use technological solutions as a “magic solution” in preventing the materialisation of money laundering/terrorist financing related risks. Analysis of policy suggests a gravitation towards the increased use of the aforementioned technologies in the interim.
Originality/value
Capitalisation of European banking authority.