Search results

1 – 10 of 299
Article
Publication date: 1 December 2003

Laura Tull and Dona Straley

Unicode is a standard for a universal character set for all of the scripts of the world’s languages. It is one of the fundamental technological building blocks for international…

Abstract

Unicode is a standard for a universal character set for all of the scripts of the world’s languages. It is one of the fundamental technological building blocks for international exchange of textual information via computers. It is particularly important to libraries that house collections in many languages written in various scripts. Ohio State University Libraries houses collections written in many non‐Latin scripts including Arabic, Hebrew, Chinese, Japanese, Korean and Cyrillic. Providing patrons with access to these materials has become much easier with the incorporation of Unicode into library systems and software. This article describes what needs to be in place on personal computers and in the library system in order to take advantage of Unicode, and provides some guidelines for troubleshooting problems when they occur.
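As a minimal illustration of the kind of troubleshooting the article describes, the Python sketch below lists the code point and Unicode name of every character in a mixed-script string, which helps separate bad data from display problems such as missing fonts; the sample string is invented.

```python
# Report each character's code point and Unicode name for a mixed-script
# string (Latin, Arabic, Chinese). The sample title is invented.
import unicodedata

title = "Caf\u00e9 \u0627\u0644\u0643\u062a\u0627\u0628 \u56fe\u4e66"  # "Café" + Arabic + Chinese

for ch in title:
    if ch.isspace():
        continue
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch, 'UNKNOWN')}")
```

If every character resolves to a sensible name but the catalogue display still shows boxes or question marks, the problem is usually a missing font or client configuration rather than the record itself.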

Details

Library Hi Tech, vol. 21 no. 4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 22 August 2008

Robert Fox

This column aims to examine the role of text encoding, specifically Unicode, in modern digital library applications. It also seeks to examine the technical aspects of Unicode and…

Abstract

Purpose

This column aims to examine the role of text encoding, specifically Unicode, in modern digital library applications. It also seeks to examine the technical aspects of Unicode and how they impact those applications.

Design/methodology/approach

This column is exploratory; it examines issues regarding Unicode, text encoding in general, and the technical aspects of the Unicode standard that librarians should be aware of.

Findings

The paper finds that Unicode is the bedrock of all metadata in modern digital library applications. An awareness of how to identify and perform conversions on that data is critical to the support of these applications. Many aspects of functionality in forthcoming information resource tools will rely upon a familiarity with the technical aspects of Unicode.

Originality/value

This column explains the salient technical features of Unicode for those who may not be familiar with its inner workings. It also takes into account the impact text encoding has on the functionality of modern library applications.
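The Findings stress identifying legacy encodings and converting metadata to Unicode. A minimal sketch of that step, assuming for illustration that the legacy records are in ISO-8859-2 (real migrations often involve MARC-8, which needs an external library), might look like this:

```python
# Mis-identifying the source encoding silently corrupts the data instead of
# raising an error, so identification matters as much as conversion.
# ISO-8859-2 is an illustrative assumption, not a claim about any system.
legacy_bytes = "Dvořák, Antonín".encode("iso-8859-2")  # simulated legacy record

wrong = legacy_bytes.decode("iso-8859-1")   # plausible but incorrect guess
right = legacy_bytes.decode("iso-8859-2")   # correct source encoding

print("mis-identified:", wrong)             # prints "Dvoøák, Antonín"
print("identified:    ", right)             # prints "Dvořák, Antonín"
print("as UTF-8 bytes:", right.encode("utf-8"))
```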

Details

OCLC Systems & Services: International digital library perspectives, vol. 24 no. 3
Type: Research Article
ISSN: 1065-075X

Article
Publication date: 1 October 2004

Rajesh Chandrakar

India is a country rich in diversity in languages, cultures, customs and religions. Records of this complete culture, secret manuscripts and related documents of the respective…

Abstract

India is a country rich in diversity in languages, cultures, customs and religions. Records of this complete culture, secret manuscripts and related documents of the respective religions, and 3,000 years of Indian history are available in their respective languages in different museums and libraries across the country. When the automation of libraries started in India, the issue of localization of library and museum databases immediately emerged. The issue became even more apparent with the advent of digital libraries and interoperability. At the start of automation, in the absence of proper standards, professionals tried to romanize documents, as computers at the time could handle only the roman script used to represent the English language. Later, the development of a new technology, ISCII, an extended form of ASCII using the values from 126 to 255, helped library professionals develop either bilingual bibliographic databases or bilingual text files in DOS‐ or Unix‐based applications. Gradually, fonts for Windows‐based applications were developed for creating Web sites and document files. Now, with the requirement to support different languages of the world, including Indian languages, the Unicode Consortium (“Unicode, Inc.”) provides a solution to the localization problem of the world's languages. In this paper, Unicode as a multilingual standard is explained and the related technology available for localizing Indian language materials is discussed.
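As a small illustration of why Unicode removes the need for the language-specific 8-bit encodings and fonts described above, the sketch below shows a single string mixing Devanagari and Latin text and round-tripping losslessly through UTF-8; the sample record is invented.

```python
# One Unicode string can mix Devanagari and Latin text, which no single
# 8-bit code page can represent; UTF-8 serialises it losslessly.
# The sample record is invented.
record = "पुस्तक / pustak (book)"

encoded = record.encode("utf-8")           # one encoding for every script
assert encoded.decode("utf-8") == record   # lossless round trip

print(len(record), "characters,", len(encoded), "UTF-8 bytes")
```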

Details

The Electronic Library, vol. 22 no. 5
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 1 August 2002

Rajesh Chandrakar

In this Internet era, when everything is moving onto the Web, character encoding becomes an issue for developers and facilitators. This matter also concerns India, as a country with…

Abstract

In this Internet era, when everything is moving onto the Web, character encoding becomes an issue for developers and facilitators. This matter also concerns India, as a country with rich diversity in languages, cultures, customs and religions, which are recorded in print media as manuscripts, monographs, pamphlets, tamra‐patras (copper plates), palm leaves, etc. Library and information networks in India hold the responsibility to digitise all those valuable resources stored in print media and make them accessible to users through the Web. However, due to technology limitations, so far it has not been practical to do so. This paper tries to explain the limitations and problems being faced in this regard, and highlights the issues involved in multi‐script database creation and the required state‐of‐the‐art technology. Finally, Unicode is considered as a solution because it is a Universal Character Set for character encoding.
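As a minimal sketch of the multi-script database creation discussed above, the snippet below stores invented titles in three scripts in a SQLite table and reads them back; both Python and SQLite treat the text as Unicode throughout.

```python
# Store and retrieve records in several scripts in one table; SQLite text
# columns hold Unicode, so no per-language code page is needed.
# The titles are invented examples.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (lang TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("hi", "प्राचीन भारत का इतिहास"),
     ("ta", "தமிழ் இலக்கியம்"),
     ("en", "A History of Ancient India")],
)

for lang, title in conn.execute("SELECT lang, title FROM titles ORDER BY lang"):
    print(lang, title)
```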

Details

Online Information Review, vol. 26 no. 4
Type: Research Article
ISSN: 1468-4527

Article
Publication date: 1 December 1997

C. Clissman, R. Murray, E. Davidson, J. Hands, O. Sijtsma, A. Noordzij, R. Moulton, S. Shanawa, J. Darzentas and I. Pettman

Provides a brief introduction to the UNIverse Project and its major objectives. Gives an overview of the international standards, software and systems which will enable…

Abstract

Provides a brief introduction to the UNIverse Project and its major objectives. Gives an overview of the international standards, software and systems which will enable bibliographic searching of multiple distributed library catalogues, including the Z39.50 standard, WWW gateways, the EUROPAGATE project, the Java programming language and the Unicode World‐wide Character Encoding Standard.

Details

New Library World, vol. 98 no. 7
Type: Research Article
ISSN: 0307-4803

Article
Publication date: 1 December 1997

Janet C. Erickson

The use of Unicode/10646 will increase dramatically in the coming few years for many reasons. Use of the Internet has increased awareness of the problems inherent in the use of…

Abstract

The use of Unicode/10646 will increase dramatically in the coming few years for many reasons. Use of the Internet has increased awareness of the problems inherent in the use of multiple character set standards. It has also increased the potential audience for HTI‐created texts. The standard was created by, and has been embraced within, the computer industry. Many products have come on the market with support for some or all of the standard; Microsoft's use of UTF‐8 and Unicode, especially in Windows NT, is a boost to the mainstreaming of such products.
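To make the encoding distinction concrete, here is a small sketch comparing how the same text is serialised in UTF-8 and in UTF-16 (the form Windows NT uses internally); the sample string is arbitrary.

```python
# UTF-8 and UTF-16 are different byte serialisations of the same Unicode
# code points. The sample text is arbitrary.
text = "Türkçe 日本語"

for encoding in ("utf-8", "utf-16-le"):
    data = text.encode(encoding)
    print(f"{encoding}: {len(data)} bytes -> {data.hex(' ')}")

# Both byte streams decode back to the identical string.
assert text.encode("utf-8").decode("utf-8") == text.encode("utf-16-le").decode("utf-16-le")
```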

Details

Library Hi Tech, vol. 15 no. 3/4
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 13 February 2009

Qing Zou and Guoying Liu

The purpose of this paper is to investigate various issues related to Chinese language localisation in Evergreen, an open source integrated library system (ILS).

Abstract

Purpose

The purpose of this paper is to investigate various issues related to Chinese language localisation in Evergreen, an open source integrated library system (ILS).

Design/methodology/approach

A Simplified Chinese version of Evergreen was implemented and tested, and various issues specifically associated with the Simplified Chinese language, such as encoding, indexing, searching, and sorting, were investigated.

Findings

The paper finds that Unicode eases a lot of ILS development problems. However, producing another language version of an ILS requires more than simply translating from one language to another. Indexing, searching, sorting and other locale‐related issues should be tackled not only language by language, but locale by locale.

Practical implications

Most of the issues that have arisen during this project will be found with other ILS‐like systems.

Originality/value

This paper provides insights into issues of, and various solutions to, indexing, searching, and sorting in the Chinese language in an ILS. These issues and the solutions may be applicable to other digital library systems such as institutional repositories.
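As a hedged illustration of the locale-by-locale sorting issue noted in the Findings, the sketch below sorts the same Chinese titles by raw code point and then with a collation key; it assumes a zh_CN UTF-8 locale is installed on the host, which is an environment assumption and involves nothing specific to Evergreen.

```python
# Raw code-point order is rarely the order Chinese readers expect, so an
# ILS needs locale-aware collation. Assumes the zh_CN.UTF-8 locale is
# available; this sketch does not touch Evergreen itself.
import locale

titles = ["图书馆", "编目", "中文", "检索"]

print("code-point order:", sorted(titles))

try:
    locale.setlocale(locale.LC_COLLATE, "zh_CN.UTF-8")
    print("collated order:  ", sorted(titles, key=locale.strxfrm))
except locale.Error:
    print("zh_CN.UTF-8 locale not installed on this system")
```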

Details

Program, vol. 43 no. 1
Type: Research Article
ISSN: 0033-0337

Article
Publication date: 1 March 1992

Susanna Peruginelli, Giovanni Bergamin and Pino Ammendola

Character coding systems such as ASCII and EBCDIC are unable to deal with the worldwide range of characters and so possible solutions, such as ISO 10646 and Unicode, involving…

Abstract

Character coding systems such as ASCII and EBCDIC are unable to deal with the worldwide range of characters, and so possible solutions, such as ISO 10646 and Unicode, involving 16‐bit codes, have been suggested. The paper examines how libraries deal with multi‐lingual character sets and describes work being undertaken on the definition of a basic European character set as part of the National Bibliographies on CD‐ROM project, funded under the Commission of the European Communities' Action Plan for Libraries.
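A minimal sketch of the limitation described above: single-byte character sets such as ASCII or the EBCDIC code page cp037 (one of the EBCDIC variants bundled with Python) fail as soon as a record needs a character outside their repertoire, while a Unicode encoding does not; the sample name is invented.

```python
# ASCII and an EBCDIC code page reject characters outside their small
# repertoires; UTF-8 encodes everything. The name is an invented example.
name = "Šafařík, Łukasz"

for encoding in ("ascii", "cp037", "utf-8"):
    try:
        data = name.encode(encoding)
        print(f"{encoding}: {len(data)} bytes")
    except UnicodeEncodeError as err:
        print(f"{encoding}: cannot encode {err.object[err.start]!r}")
```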

Details

Program, vol. 26 no. 3
Type: Research Article
ISSN: 0033-0337

Article
Publication date: 4 September 2009

Devika P. Madalli and Dimple Patel

The purpose of this paper is to discuss the various issues involved in Indian language computing, particularly Telugu, such as creating, displaying, searching and retrieving digital…

Abstract

Purpose

The purpose of this paper is to discuss the various issues involved in Indian language computing, particularly Telugu, such as creating, displaying, searching and retrieving digital content. The paper also aims to emphasize the issues involved in retrieval in Indian languages. The complexities presented by the grammar, syntax and morphology of Indian languages are discussed.

Design/methodology/approach

The paper presents a descriptive study of the issues and challenges in Indian language computing in general and the Telugu language in particular.

Findings

The problem of multilingual information retrieval in Indian languages is multi‐pronged. A major observation of this study is that, though digital content is available in Indian languages, it is mostly in non‐standard encoding formats and fonts. There is an urgent need to develop search algorithms for Indian languages, such as soundex and metaphone, that tolerate the spelling variations and mistakes a user might make in queries and suggest correct spelling(s).

Practical implications

With existing technologies, libraries can now build online catalogues in the language of the documents or build digital repositories with content in various Indian languages. Though a few library automation packages such as NewGenLib and digital library software such as DSpace offer Unicode support for Indian languages, they do not allow for different types of search, such as truncation search and word variants. The present study is a step towards developing algorithms for indexing and searching in Indian languages.

Originality/value

The paper addresses various issues in Indian language computing with emphasis on search and retrieval.
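The Findings mention soundex- and metaphone-style algorithms for tolerating spelling variants. For reference, here is the classic English Soundex; an Indian-language variant would need script-specific groupings (for instance, collapsing long and short vowels or aspirated and unaspirated consonants), which this sketch does not attempt.

```python
# Classic (English) Soundex: consonants that sound alike map to the same
# digit, so spelling variants produce the same four-character code.
def soundex(word: str) -> str:
    groups = {
        **dict.fromkeys("bfpv", "1"), **dict.fromkeys("cgjkqsxz", "2"),
        **dict.fromkeys("dt", "3"), "l": "4",
        **dict.fromkeys("mn", "5"), "r": "6",
    }
    word = word.lower()
    code = word[0].upper()
    prev = groups.get(word[0], "")
    for ch in word[1:]:
        digit = groups.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":          # h and w do not reset the previous group
            prev = digit
    return (code + "000")[:4]

print(soundex("Chandrakar"), soundex("Chandraker"))  # both print C536
```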

Details

Library Hi Tech, vol. 27 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 1 September 2001

Mark Needleman, John Bodfish, Tony O’Brien, James E. Rush and Pat Stevens

Describes the NISO Circulation Interchange Protocol (NCIP) and some of the design decisions that were made in developing it. When designing a protocol of the scale and scope of…

Abstract

Describes the NISO Circulation Interchange Protocol (NCIP) and some of the design decisions that were made in developing it. When designing a protocol of the scale and scope of NCIP, certain decisions about which technologies to employ need to be made. Often there are multiple competing technologies that can accomplish the same functionality, and there are both positive and negative reasons for choosing any particular one. Focuses specifically on the areas in which the protocol would be supported, giving particular emphasis to the decision to choose XML as the encoding technology for the protocol messages. One of the main design goals for NCIP was to strike an appropriate balance between ease of implementation and providing appropriate functionality. This functionality includes that needed to support both the application areas that the committee could anticipate would use the protocol in the short term and new applications that might be developed in the future.
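Since the abstract centres on the choice of XML as the message encoding, here is a deliberately generic sketch of building and re-parsing an XML request with Python's standard library; the element names are invented for illustration and are not taken from the actual NCIP schema.

```python
# Build and re-parse a small XML message. Element names are invented for
# illustration only; they are not the real NCIP schema.
import xml.etree.ElementTree as ET

request = ET.Element("CirculationRequest")
ET.SubElement(request, "PatronId").text = "P-000123"
ET.SubElement(request, "ItemId").text = "I-004567"
ET.SubElement(request, "Action").text = "CheckOut"

wire = ET.tostring(request, encoding="unicode")
print(wire)

parsed = ET.fromstring(wire)
print(parsed.findtext("Action"))  # -> CheckOut
```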

Details

Library Hi Tech, vol. 19 no. 3
Type: Research Article
ISSN: 0737-8831
