Search results

1 – 10 of over 1000
Article
Publication date: 1 March 1971

Andrew Robertson

Abstract

The idea of optical character recognition (OCR), in other words the “reading” of documents by other than human means, arose as a practical proposition during the Second World War. Wartime experience of using computers in the United States had revealed the contrasts in speeds between the transcription of documents to be processed (at that time the punching of cards or tape by operatives working from original documents) and the central processing within the computer itself. Visual output was also slower than central processing but was much speeded up by the introduction of line printers and later of xerography. This “paired” case study, part of a project sponsored by the Science Research Council to examine patterns of success and failure in industrial innovation, is confined to two attempts to innovate in the field of OCR. There were others, one or two of which were contemporary, most of which have followed, have a much more recent history and may be thought to have overtaken, in terms of market penetration, the innovation here designated a commercial success. The point of this study when it was undertaken was to extract data about the two innovations that would be suitable for general analysis by a computer programme designed to search out significant groups of explanatory factors so that the characteristics associated with innovative success might be recognised as typical within an industry, or perhaps generally. This study belongs to one of two groups, the instrument industry, the other group investigated being chemical manufacturing.

Details

Management Decision, vol. 9 no. 3
Type: Research Article
ISSN: 0025-1747

Article
Publication date: 31 July 2020

Zainab Akhtar, Jong Weon Lee, Muhammad Attique Khan, Muhammad Sharif, Sajid Ali Khan and Naveed Riaz

Abstract

Purpose

In artificial intelligence, optical character recognition (OCR) is an active research area driven by well-known applications such as the automated transformation of printed documents into machine-readable text. The major purpose of OCR in academia and banks is to achieve a significant performance to save storage space.

Design/methodology/approach

A novel technique is proposed for automated OCR based on multi-property feature fusion and selection. The features are fused using a serial formulation and the output is passed to a partial least squares (PLS) based selection method. The selection is done based on an entropy fitness function. The final features are classified by an ensemble classifier.
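
As a rough illustration of the fusion and selection idea, the sketch below concatenates per-sample feature vectors (one common reading of "serial" fusion) and ranks the fused columns by Shannon entropy. It is not the authors' method: the PLS step and the ensemble classifier are omitted, and the function names, the equal-width binning and the `bins` and `keep` parameters are all assumptions made for this example.

```python
import math
from collections import Counter


def serial_fusion(*feature_sets):
    """Concatenate the feature vectors of each sample across extractors."""
    n = len(feature_sets[0])
    if any(len(fs) != n for fs in feature_sets):
        raise ValueError("all feature sets must describe the same samples")
    return [sum((list(fs[i]) for fs in feature_sets), []) for i in range(n)]


def column_entropy(values, bins=4):
    """Shannon entropy in bits of one feature column after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant column carries no information
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_by_entropy(fused, keep):
    """Return the indices of the `keep` highest-entropy columns, in column order."""
    n_cols = len(fused[0])
    scores = [column_entropy([row[j] for row in fused]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Hypothetical features for three character images from two extractors.
shape_features = [[0.1, 0.2], [0.4, 0.2], [0.9, 0.2]]
texture_features = [[1.0], [2.0], [3.0]]
fused = serial_fusion(shape_features, texture_features)
selected = select_by_entropy(fused, keep=2)
```

Here the constant second shape feature is dropped, because a column that never varies has zero entropy under this score.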

Findings

The presented method was extensively tested on two datasets, the authors' proposed dataset and the Chars74k benchmark, and achieved accuracies of 91.2% and 99.9%. Comparing the results with existing techniques, it is found that the proposed method gives improved performance.

Originality/value

The technique presented in this work can support license plate recognition and the conversion of printed documents into machine-readable text.

Details

Journal of Enterprise Information Management, vol. 36 no. 3
Type: Research Article
ISSN: 1741-0398

Open Access
Article
Publication date: 28 November 2017

Mansoor Alghamdi and William Teahan

Abstract

Purpose

The aim of this paper is to experimentally evaluate the effectiveness of the state-of-the-art printed Arabic text recognition systems to determine open areas for future improvements. In addition, this paper proposes a standard protocol with a set of metrics for measuring the effectiveness of Arabic optical character recognition (OCR) systems to assist researchers in comparing different Arabic OCR approaches.

Design/methodology/approach

This paper describes an experiment to automatically evaluate four well-known Arabic OCR systems using a set of performance metrics. The evaluation experiment is conducted on a publicly available printed Arabic dataset comprising 240 text images with a variety of resolution levels, font types, font styles and font sizes.
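
The abstract does not name the metrics used. As an assumption, the sketch below shows two measures commonly used for OCR output, character error rate and word error rate, both derived from Levenshtein edit distance. The function names are hypothetical and the sample strings are invented for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference, recognised):
    """Edit distance over Unicode code points, divided by the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, recognised) / len(reference)


def word_error_rate(reference, recognised):
    """Edit distance over whitespace-separated words, divided by the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref_words, recognised.split()) / len(ref_words)


# Invented example: the recogniser drops one letter from a four-letter Arabic word.
cer = character_error_rate("كتاب", "كتب")
wer = word_error_rate("the quick fox", "the quack fox")
```

Because Python compares code points, Arabic diacritics that are encoded as separate combining characters count as characters of their own here. Whether to strip or keep them before scoring is an evaluation design choice that this sketch leaves open.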

Findings

The experimental results show that the field of character recognition for printed Arabic still requires further research to reach an efficient text recognition method for Arabic script.

Originality/value

To the best of the authors’ knowledge, this is the first work that provides a comprehensive automated evaluation of Arabic OCR systems with respect to the characteristics of Arabic script and, in addition, proposes an evaluation methodology that can be used as a benchmark by researchers and therefore will contribute significantly to the enhancement of the field of Arabic script recognition.

Details

PSU Research Review, vol. 1 no. 3
Type: Research Article
ISSN: 2399-1747

Article
Publication date: 1 February 1987

Douglas K. Ferguson

Abstract

The Fred Meyer Charitable Trust, Division of Library and Information Resources for the Northwest, has funded five research projects that will demonstrate the potential of various techniques and new technologies to facilitate communications and resource sharing in the Northwest. The experience and information derived from these projects will be of value to all libraries and information centers, not just those conducting the research. The techniques and technologies being evaluated include: simultaneous remote searching, which uses inexpensive terminals and modems; a mini‐computer‐based union list and resource sharing network (INFONET); networks using facsimile machines; networks that transmit documents that have been optically scanned into bit‐map image files; and use of optical character recognition equipment to capture ASCII machine‐readable information that can be broadcast by television stations to user‐sites. Contributors of reports are: Verl Anderson, Linda Brander, Millard F. Johnson, Jr., Bruce Morton, and Steve Smith. Summary observations are provided by Joseph R. Matthews.

Details

Library Hi Tech, vol. 5 no. 2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 3 June 2014

Jim Hahn

Abstract

Purpose

The purpose of this paper is to report results of a formative usability study that investigated first-year student use of an optical character recognition (OCR) mobile application (app) designed to help students find resources for course assignments. The app uses textual content from the assignment sheet to suggest relevant library resources of which students may not be aware.

Design/methodology/approach

Formative evaluation data are collected to inform the production-level version of the mobile application and to understand student use models and requirements for OCR software in mobile applications.

Findings

Mobile OCR apps are helpful for undergraduate students searching known titles of books, general subject areas or searching for help guide content developed by the library. The results section details how student feedback shaped the next iteration of the app for integration as a Minrva module.

Research limitations/implications

This usability paper is not a large-scale quantitative study, but seeks to provide deep qualitative research data for the specific mobile interface studied, the Text-shot prototype.

Practical implications

The OCR application is designed to help students learn about availability of library resources based on scanning (e.g. taking a picture, or “Text-shot”) of an assignment sheet, a course syllabus or other course-related handouts.

Originality/value

This study contributes a new area of application development for libraries, with research methods that are useful for other mobile development studies.

Details

Reference Services Review, vol. 42 no. 2
Type: Research Article
ISSN: 0090-7324

Article
Publication date: 1 March 1997

Howard Falk

Abstract

Most pages of text or graphical materials in a library are black‐and‐white and most of the work that a scanner will do involves black‐and‐white images. Yet, it makes sense to buy a scanner capable of processing colour images. The difference in cost between a colour scanner and a black‐and‐white unit is relatively small, and the colour scanner allows colour pages to be converted into computer images whenever needed. In response to the demand by computer users for the ability to handle colour, new models of scanning equipment are almost all equipped to do so.

Details

The Electronic Library, vol. 15 no. 3
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 1 March 1985

Martin Harrison

Abstract

An overview of the OPTIRAM/LIBPAC computerised system for the intelligent optical scanning of catalogue cards, or any other form of printed or good hand‐written material, into a full MARC format is given. The article provides information on the sophisticated scanning technology employed, using standard Group 3 facsimile transmission devices to read catalogue card entries and produce an internally coded data string, used to drive format recognition programs developed by LIBPAC, each tailored to suit a particular application. Sections deal with the varied aspects of this individual approach and the benefits that can arise from taking advantage of user‐specific software to enhance and standardise the resulting machine‐readable catalogue. The article includes examples which show the full capabilities of the optical scanner and examples of catalogue cards that have been converted into the MARC format.

Details

Program, vol. 19 no. 3
Type: Research Article
ISSN: 0033-0337

Article
Publication date: 1 January 1989

Clyde W. Grotophorst

Abstract

Optical character recognition (OCR) technology can be employed to produce an ASCII‐text database for mounting on computer systems. Current technologies and principles of scanning and OCR are discussed. A prototypical “local” project—the creation of a full‐text database of dissertations done at George Mason University—has been undertaken by the Fenwick Library at that institution. Problems encountered with current scanning and OCR technologies are illustrated and discussed, as well as techniques and “filter” programs developed to streamline the scanning and OCR conversion process.

Details

Library Hi Tech, vol. 7 no. 1
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 17 July 2020

Hrvoje Stančić and Željko Trbušić

Abstract

Purpose

The authors investigate optical character recognition (OCR) technology and discuss its implementation in the context of digitisation of archival materials.

Design/methodology/approach

The typewritten transcripts of the Croatian Writers' Society from the mid-1960s are used as the test data. The optimal digitisation setup is investigated in order to obtain the best OCR results. This was done by using a sample of 123 pages digitised at different resolution settings and binarisation levels.
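
The abstract does not say how the binarisation levels were applied. A minimal sketch, assuming a single global threshold on 8-bit grey values, shows how the same scanned page changes as the level moves. The function names and the tiny 2x2 "page" are invented for illustration.

```python
def binarise(gray_page, threshold):
    """Map 8-bit grey values to 0 (ink) or 1 (paper) using a global threshold."""
    return [[0 if px < threshold else 1 for px in row] for row in gray_page]


def ink_fraction(binary_page):
    """Share of pixels classified as ink in a binarised page."""
    pixels = [px for row in binary_page for px in row]
    return pixels.count(0) / len(pixels)


def sweep_levels(gray_page, thresholds):
    """Binarise the same page at several levels and report the ink share for each."""
    return {t: ink_fraction(binarise(gray_page, t)) for t in thresholds}


# Hypothetical 2x2 grey-scale patch; real pages would be loaded from a scan.
page = [[12, 200], [130, 90]]
levels = sweep_levels(page, [64, 128, 192])
```

A low threshold keeps only the darkest strokes and can break up faint typewriter characters, while a high one turns paper noise into ink, so each level hands the OCR engine a different image.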

Findings

A series of tests showed that different settings produce significantly different results. The best OCR accuracy achieved on the test sample of typewritten documents was 95.02%. The results show that resolution is significantly more important than the binarisation pre-processing procedure for achieving better OCR results.

Originality/value

Based on the research results, the authors give recommendations for achieving optimal digitisation process setup with the aim of increasing the quality of OCR results. Finally, the authors put the research results in the context of digitisation of cultural heritage in general and discuss further investigation possibilities.

Details

Aslib Journal of Information Management, vol. 72 no. 4
Type: Research Article
ISSN: 2050-3806

Open Access
Article
Publication date: 23 May 2023

Kimmo Kettunen, Heikki Keskustalo, Sanna Kumpulainen, Tuula Pääkkönen and Juha Rautiainen

Abstract

Purpose

This study aims to identify user perception of different qualities of optical character recognition (OCR) in texts. The purpose of this paper is to study the effect of different quality OCR on users' subjective perception through an interactive information retrieval task with a collection of one digitized historical Finnish newspaper.

Design/methodology/approach

This study is based on the simulated work task model used in interactive information retrieval. Thirty-two users searched an article collection of the Finnish newspaper Uusi Suometar 1869–1918, which consists of ca. 1.45 million autosegmented articles. The article search database had two versions of each article with different quality OCR. Each user performed six pre-formulated and six self-formulated short queries and subjectively evaluated the top 10 results using a graded relevance scale of 0–3. Users were not informed about the OCR quality differences of the otherwise identical articles.

Findings

The main result of the study is that improved OCR quality affects subjective user perception of historical newspaper articles positively: higher relevance scores are given to better-quality texts.

Originality/value

To the best of the authors’ knowledge, this simulated interactive work task experiment is the first one showing empirically that users' subjective relevance assessments are affected by a change in the quality of an optically read text.

Details

Journal of Documentation, vol. 79 no. 7
Type: Research Article
ISSN: 0022-0418
