Search results

1 – 10 of over 4000
Article
Publication date: 20 December 2017

Arash Joorabchi and Abdulhussain E. Mahdi

Abstract

Purpose

Linking libraries and Wikipedia can significantly improve the quality of services provided by these two major silos of knowledge. Such linkage would enrich the quality of Wikipedia articles and at the same time increase the visibility of library resources. To this end, the purpose of this paper is to describe the design and development of a software system for automatic mapping of FAST subject headings, used to index library materials, to their corresponding articles in Wikipedia.

Design/methodology/approach

The proposed system works by first detecting all the candidate Wikipedia concepts (articles) occurring in the titles of the books and other library materials which are indexed with a given FAST subject heading. This is then followed by training and deploying a machine learning (ML) algorithm designed to automatically identify those concepts that correspond to the FAST heading. Specifically, the ML algorithm used is a binary classifier which classifies the candidate concepts into either “corresponding” or “non-corresponding” categories. The classifier is trained to learn the characteristics of those candidates which have the highest probability of belonging to the “corresponding” category based on a set of 14 positional, statistical, and semantic features.

Findings

The authors have assessed the performance of the developed system using standard information retrieval measures of precision, recall, and F-score on a data set containing 170 FAST subject headings manually mapped to their corresponding Wikipedia articles. The evaluation results show that the developed system is capable of achieving F-scores as high as 0.65 and 0.99 in the corresponding and non-corresponding categories, respectively.
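
The precision, recall, and F-score measures named above can be computed separately for each category from gold and predicted labels. The following is a minimal sketch, not the authors' evaluation code; the function name `prf_scores` and the toy labels are assumptions made for illustration.

```python
def prf_scores(gold, predicted, positive):
    """Return (precision, recall, F1) for one category.

    gold and predicted are equal-length sequences of labels; positive is
    the label treated as the category of interest.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy labels, not data from the paper.
gold = ["corresponding", "non-corresponding", "non-corresponding",
        "corresponding", "non-corresponding"]
pred = ["corresponding", "non-corresponding", "corresponding",
        "non-corresponding", "non-corresponding"]

for category in ("corresponding", "non-corresponding"):
    p, r, f = prf_scores(gold, pred, category)
    print(f"{category}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Reporting the measures per category, as the abstract does, keeps results on a large category from masking performance on a smaller one.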

Research limitations/implications

The size of the data set used to evaluate the performance of the system is rather small. However, the authors believe that the developed data set is large enough to demonstrate the feasibility and scalability of the proposed approach.

Practical implications

The sheer size of English Wikipedia makes the manual mapping of Wikipedia articles to library subject headings a very labor-intensive and time-consuming task. Therefore, the aim is to reduce the cost of such mapping and integration.

Social implications

The proposed mapping paves the way for connecting libraries and Wikipedia as two major silos of knowledge, and enables the bi-directional movement of users between the two.

Originality/value

To the best of the authors’ knowledge, the current work is the first attempt at automatic mapping of Wikipedia to a library-controlled vocabulary.

Details

Library Hi Tech, vol. 36 no. 1
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 25 March 2020

Wei Yu and Junpeng Chen

Abstract

Purpose

The purpose of this paper is to explore the potential of enriching the library subject headings with folksonomy for enhancing the visibility and usability of the library subject headings.

Design/methodology/approach

The WorldCat-million data set and SocialBM0311 were preprocessed, and over 210,000 library catalog records and 124,482 non-repeating tags were used to construct the matrix for observing the semantic relation between library subject headings and folksonomy. The proposed system is compared with the state-of-the-art methods and the parameters are fixed to obtain effective performance.

Findings

The results demonstrate that by integrating different semantic relations from library subject headings and folksonomy, the system’s performance can be improved compared to the benchmark methods. The evaluation results also show that the folksonomy can enrich library subject headings through the semantic relationship.

Originality/value

The proposed method, simultaneous weighted matrix factorization, can integrate the semantic relations from the library subject headings and folksonomy into one semantic space. Observing the semantic relation between library subject headings and social tags from folksonomy can help enrich the library subject headings and improve their visibility.

Details

The Electronic Library, vol. 38 no. 2
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 18 November 2013

Arash Joorabchi and Abdulhussain E. Mahdi

Abstract

Purpose

This paper aims to report on the design and development of a new approach for automatic classification and subject indexing of research documents in scientific digital libraries and repositories (DLR) according to library controlled vocabularies such as DDC and FAST.

Design/methodology/approach

The proposed concept matching-based approach (CMA) detects key Wikipedia concepts occurring in a document and searches the OPACs of conventional libraries via querying the WorldCat database to retrieve a set of MARC records which share one or more of the detected key concepts. Then the semantic similarity of each retrieved MARC record to the document is measured and, using an inference algorithm, the DDC classes and FAST subjects of those MARC records which have the highest similarity to the document are assigned to it.
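
The final assignment step described above can be pictured as a similarity-weighted vote over the retrieved records. The sketch below is an illustration only: it substitutes Jaccard overlap of concept sets for the paper's semantic similarity measure and a simple weighted vote for its inference algorithm, neither of which the abstract specifies. The record structure, the function names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap of two concept sets (stand-in similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_subjects(doc_concepts, records, top_n=1):
    """Rank subjects by the summed similarity of the records carrying them.

    records is a list of dicts with hypothetical keys "concepts" and
    "subjects"; only records sharing at least one concept contribute.
    """
    votes = defaultdict(float)
    for record in records:
        sim = jaccard(doc_concepts, record["concepts"])
        if sim > 0:
            for subject in record["subjects"]:
                votes[subject] += sim
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [subject for subject, _ in ranked[:top_n]]


# Hypothetical records; real input would come from retrieved MARC records.
records = [
    {"concepts": {"Machine learning", "Statistics"},
     "subjects": ["Machine learning"]},
    {"concepts": {"Machine learning", "Library catalog"},
     "subjects": ["Machine learning", "Library catalogs"]},
    {"concepts": {"Botany"}, "subjects": ["Plants"]},
]
doc = {"Machine learning", "Library catalog", "Digital library"}
print(assign_subjects(doc, records, top_n=2))
```

Because subjects are borrowed from existing catalog records rather than learned from labeled documents, a scheme of this shape needs no training set, which is consistent with the abstract's remark about avoiding imbalanced training data.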

Findings

The performance of the proposed method in terms of the accuracy of the DDC classes and FAST subjects automatically assigned to a set of research documents is evaluated using standard information retrieval measures of precision, recall, and F1. The authors demonstrate the superiority of the proposed approach in terms of accuracy performance in comparison to a similar system currently deployed in a large scale scientific search engine.

Originality/value

The proposed approach enables the development of a new type of subject classification system for DLR, and addresses some of the problems similar systems suffer from, such as the problem of imbalanced training data encountered by machine learning-based systems, and the problem of word-sense ambiguity encountered by string matching-based systems.

Article
Publication date: 18 September 2017

Scott Hanrath and Erik Radio

Abstract

Purpose

The purpose of this paper is to investigate the search behavior of institutional repository (IR) users in regard to subjects as a means of estimating the potential impact of applying a controlled subject vocabulary to an IR.

Design/methodology/approach

Google Analytics data were used to record cases where users arrived at an IR item page from an external web search and subsequently downloaded content. Search queries were compared against the Faceted Application of Subject Terminology (FAST) schema to determine the topical nature of the queries. Queries were also compared against the item’s metadata values for title and subject using approximate string matching to determine the alignment of the queries with current metadata values.
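
Approximate string matching of a query against vocabulary terms or metadata values can be sketched with Python's standard `difflib` module. This is a minimal illustration, not the study's procedure: the abstract does not name the matching algorithm or a cutoff, so the `SequenceMatcher` ratio, the `threshold` value of 0.8, the helper names, and the sample terms are all assumptions.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace before comparison."""
    return " ".join(text.lower().split())


def best_match(query, candidates, threshold=0.8):
    """Return (candidate, ratio) for the closest candidate, or None.

    threshold is an assumed cutoff; the abstract does not give one.
    """
    q = normalize(query)
    scored = [(c, SequenceMatcher(None, q, normalize(c)).ratio())
              for c in candidates]
    if not scored:
        return None
    candidate, ratio = max(scored, key=lambda item: item[1])
    return (candidate, ratio) if ratio >= threshold else None


# Hypothetical query strings and vocabulary terms.
terms = ["Climatic changes", "Water quality", "Library catalogs"]
print(best_match("climate changes", terms))
print(best_match("medieval poetry", terms))
```

Running the same function once against vocabulary terms and once against an item's title and subject values would give the two match rates that the study compares.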

Findings

A substantial portion of successful user search queries to an IR appear to be topical in nature. User search queries matched values from FAST at a higher rate than existing subject metadata. Increased attention to subject description in IR records may provide an opportunity to improve the search visibility of the content.

Research limitations/implications

The study is limited to a particular IR. Data from Google Analytics does not provide comprehensive search query data.

Originality/value

The study presents a novel method for analyzing user search behavior to assist IR managers in determining whether to invest in applying controlled subject vocabularies to IR content.

Details

Library Hi Tech, vol. 35 no. 3
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 14 October 2013

Paul Ojennus and Joseph Timothy Tennis

Abstract

Purpose

The purpose of this paper is to propose a theoretical framework, based on contemporary philosophical aesthetics, from which principled assessments of the aesthetic value of information organization frameworks may be conducted.

Design/methodology/approach

This paper identifies appropriate discourses within the field of philosophical aesthetics and constructs from them a framework for assessing aesthetic properties of information organization frameworks. This framework is then applied in two case studies examining the Library of Congress Subject Headings (LCSH) and Sexual Nomenclature: A Thesaurus.

Findings

In both information organization frameworks studied, the aesthetic analysis was useful in identifying judgments of the frameworks as aesthetic judgments, in promoting discovery of further areas of aesthetic judgments, and in prompting reflection on the nature of these aesthetic judgments.

Research limitations/implications

This study provides proof-of-concept for the aesthetic evaluation of information organization frameworks. Areas of future research are identified as the role of cultural relativism in such aesthetic evaluation and identification of appropriate aesthetic properties of information organization frameworks.

Practical implications

By identifying a subset of judgments of information organization frameworks as aesthetic judgments, aesthetic evaluation of such frameworks can be made explicit and principled. Aesthetic judgments can be separated from questions of economic feasibility, functional requirements, and user-orientation. Design and maintenance of information organization frameworks can be based on these principles.

Originality/value

This study introduces a new evaluative axis for information organization frameworks based on philosophical aesthetics. By improving the evaluation of such novel frameworks, design and maintenance can be guided by these principles.

Details

Journal of Documentation, vol. 69 no. 6
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 3 June 2014

Lucas Mak, Devin Higgins, Aaron Collie and Shawn Nicholson

Abstract

Purpose

The purpose of this paper is to illustrate that Electronic Theses and Dissertation (ETD) metadata can be used as data for institutional assessment and to map an extended research landscape when connected to other data sets through linked data models.

Design/methodology/approach

This paper presents a conceptual consideration of the ideas behind linked data architecture, leveraging ETDs and their attendant metadata to build a case for institutional assessment. Analysis of graph data supports the considerations.

Findings

The study reveals first and foremost that ETD metadata is in itself data. Concerns arise about creating URIs for data elements and about the general applicability of linked data model formation. The analysis highlights a rich environment of institutional relationships not readily found in traditional flat metadata records.

Originality/value

This paper provides a new perspective on examining the research landscape through ETDs produced by graduate students in the higher education sector.

Details

Library Management, vol. 35 no. 4/5
Type: Research Article
ISSN: 0143-5124

Article
Publication date: 14 July 2020

Andrea Cuna and Gabriele Angeli

Abstract

Purpose

This paper puts forward a MARC-based semiautomated approach to extracting semantically rich subject facets from general and/or specialized controlled vocabularies for display in topic-oriented faceted catalog interfaces in a way that would better support users' exploratory search tasks.

Design/methodology/approach

Hierarchical faceted subject metadata is extracted from general and/or specialized controlled vocabularies by using standard client/server communication protocols. Rigorous facet analysis, classification and linguistic principles are applied on top of that to ensure faceting accuracy and consistency.

Findings

A shallow application of facet analysis and classification, together with poorly organized displays, is one of the major barriers to effective faceted navigation in library, archive and museum catalogs.

Research limitations/implications

This paper does not deal with Web-scale discovery services.

Practical implications

This paper offers suggestions that can be used by the technical services departments of libraries, archives and museums in designing and developing more powerful exploratory search interfaces.

Originality/value

This paper addresses the problem of deriving clearly delineated topical facets from existing metadata for display in a user-friendly, high-level topical overview that is meant to encourage a multidimensional exploration of local collections as well as “learning by browsing.”

Details

Library Hi Tech, vol. 39 no. 2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 4 September 2009

Alenka Šauperl

Abstract

Purpose

This paper aims to discuss some long‐standing issues of the development of a subject heading language as pre‐ or postcoordinated.

Design/methodology/approach

In a review of literature on pre‐ and postcoordination and user behaviour, 20 criteria originally discussed by Svenonius are considered.

Findings

The advantages and disadvantages of pre‐ and postcoordinated systems are on a very similar level. Most subject heading languages developed recently are precoordinated. They all require investments in highly skilled intellectual work, and are therefore expensive and difficult to maintain. Postcoordinated systems seem to have more advantages for information providers, but fewer for users. However, most of these disadvantages could be overcome by known information retrieval models and techniques.

Research limitations/implications

The criteria originally discussed by Svenonius are difficult to evaluate in an exact manner. Some of them are also irrelevant because of changes in information retrieval systems.

Practical implications

It was found that the decision on whether to use a pre‐ or postcoordinated system cannot be taken independent of consideration of the subject authority file and the functions of an information retrieval system, which should support users on one hand and information providers and indexers on the other.

Originality/value

This literature review brings together some findings that have not been considered together previously.

Details

Journal of Documentation, vol. 65 no. 5
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 2 October 2009

Ioannis Papadakis, Michalis Stefanidakis and Aikaterini Tzali

Abstract

Purpose

The purpose of this paper is to address a library service based on semantic web technologies capable of exposing knowledge that is otherwise hidden in a library's subject headings repository.

Design/methodology/approach

The proposed service implements a web‐based information seeking process that combines browsing and searching of information assets within a library, based on their corresponding subject headings. The underlying subject headings hierarchy is the Greek translation of a subset of the official Library of Congress subject headings. The information seeking process exposes the expressiveness of an underlying ontology capable of modeling subject headings together with their relations.

Findings

In order to assess the effectiveness of the proposed approach in a real‐world scenario, the library service is integrated into a working OPAC located at the Ionian University in Greece. Thus, the library service contributes to the fast retrieval of information. Moreover, during the information seeking process, users underpin their cognitive learning.

Originality/value

The paper introduces a novel service for the library domain capable of being integrated in many library portals. Serving as a semantic web application, the proposed work promotes interactive navigation in ontology structures that could be potentially exploited by ontologies developed in other domains.

Details

The Electronic Library, vol. 27 no. 5
Type: Research Article
ISSN: 0264-0473

Article
Publication date: 1 March 1985

CHIH WANG

Abstract

INTRODUCTION

Computers and new information technologies have beyond question brought tremendous advancement in information storage and retrieval. In recent years, the traditional card catalog has given way first to the COM (computer output on microform) catalog, then to the online catalog. Now, many libraries are shifting to the new capability in order to provide better and faster services to their patrons.

Details

Library Review, vol. 34 no. 3
Type: Research Article
ISSN: 0024-2535
