Search results

1 – 10 of 38
Article
Publication date: 18 January 2011

Alan Vaughan Hughes and Pauline Rafferty

Abstract

Purpose

This paper seeks to report a project to investigate the degree of inter‐indexer consistency in the assignment of controlled vocabulary topical subject index terms to identical graphical images by different indexers at the National Library of Wales (NLW).

Design/methodology/approach

An experimental quantitative methodology was devised to investigate inter‐indexer consistency. Additionally, the project investigated the relationship, if any, between indexing exhaustivity and consistency, and the relationship, if any, between indexing consistency/exhaustivity and broad category of graphic format.

Findings

Inter‐indexer consistency in the assignment of topical subject index terms to graphic materials at the NLW was found to be generally low and highly variable. Inter‐indexer consistency fell within the range 10.8 per cent to 48.0 per cent. Indexing exhaustivity varied substantially from indexer to indexer, with a mean assignment of 3.8 terms by each indexer to each image, falling within the range 2.5 to 4.7 terms. The broad category of graphic format, whether photographic or non‐photographic, was found to have little influence on either inter‐indexer consistency or indexing exhaustivity. Indexing exhaustivity and inter‐indexer consistency exhibited a tendency toward a direct, positive relationship. The findings are necessarily limited as this is a small‐scale study within a single institution.

Originality/value

Previous consistency studies have almost exclusively investigated the indexing of print materials, with very little research published for non‐print media. With the literature also rich in discussion of the added complexities of subjectively representing the intellectual content of visual media, this study attempts to enrich existing knowledge on indexing consistency for graphic materials and to address a noticeable gap in information theory.

Details

Journal of Documentation, vol. 67 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 6 May 2014

Hollie White, Craig Willis and Jane Greenberg

Abstract

Purpose

The purpose of this paper is to examine the effect of the Helping Interdisciplinary Vocabulary Engineering (HIVE) system on the inter-indexer consistency of information professionals when assigning keywords to a scientific abstract. This study examined first, the inter-indexer consistency of potential HIVE users; second, the impact HIVE had on consistency; and third, challenges associated with using HIVE.

Design/methodology/approach

A within-subjects quasi-experimental research design was used for this study. Data were collected using a task-scenario based questionnaire. Analysis was performed on consistency results using Hooper's and Rolling's inter-indexer consistency measures. A series of t-tests was used to judge the significance between consistency measure results.
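The two consistency measures named above can be illustrated with a minimal sketch. It assumes the commonly cited set-based definitions: Hooper's measure as the number of shared terms divided by the number of distinct terms used by either indexer, and Rolling's measure as twice the number of shared terms divided by the total number of terms both indexers assigned. The function names, the handling of empty term sets, and the sample terms are hypothetical and are not taken from the study.

```python
def hooper_consistency(terms_a, terms_b):
    """Hooper's measure: shared terms / all distinct terms used by either indexer."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        # Assumption: two empty term sets are scored as 0.0 rather than undefined.
        return 0.0
    return len(a & b) / len(union)


def rolling_consistency(terms_a, terms_b):
    """Rolling's measure: 2 * shared terms / (terms by indexer A + terms by indexer B)."""
    a, b = set(terms_a), set(terms_b)
    total = len(a) + len(b)
    if total == 0:
        # Assumption: same convention as above for empty input.
        return 0.0
    return 2 * len(a & b) / total


# Hypothetical keyword sets assigned by two indexers to the same abstract.
indexer_1 = ["castles", "wales", "rivers", "bridges"]
indexer_2 = ["castles", "wales", "landscapes"]

print(hooper_consistency(indexer_1, indexer_2))   # 2 shared / 5 distinct = 0.4
print(rolling_consistency(indexer_1, indexer_2))  # 2 * 2 / (4 + 3) = 4/7
```

Under these definitions Rolling's value is never lower than Hooper's for the same pair of term sets, so results reported with the two measures are not directly comparable.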

Findings

Results suggest that HIVE improves inter-indexer consistency. Working with HIVE increased consistency rates by 22 percent (Rolling's) and 25 percent (Hooper's) when selecting relevant terms from all vocabularies. A statistically significant difference exists between the assignment of free-text keywords and machine-aided keywords. Issues with homographs, disambiguation, vocabulary choice, and document structure were all identified as potential challenges.

Research limitations/implications

Research limitations for this study can be found in the small number of vocabularies used for the study. Future research will include implementing HIVE into the Dryad Repository and studying its application in a repository system.

Originality/value

This paper showcases several features used in the HIVE system. By using traditional consistency measures to evaluate a semantic web technology, this paper emphasizes the link between traditional indexing and next generation machine-aided indexing (MAI) tools.

Details

Journal of Documentation, vol. 70 no. 3
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 1994

David Ellis, Jonathan Furner‐Hines and Peter Willett

Abstract

An important stage in the process of retrieval of objects from a hypertext database is the creation of a set of inter‐nodal links that are intended to represent the relationships existing between objects; this operation is often undertaken manually, just as index terms are often manually assigned to documents in a conventional retrieval system. Studies of conventional systems have suggested that a degree of consistency in the terms assigned to documents by indexers is positively associated with retrieval effectiveness. It is thus of interest to investigate the consistency of assignment of links in separate hypertext versions of the same full‐text document, since a measure of agreement may be related to the subsequent utility of the resulting hypertext databases. The calculation of values indicating the degree of similarity between objects is a technique that has been widely used in the fields of textual and chemical information retrieval; in this paper, we describe the application of arithmetic coefficients and topological indices to the measurement of the degree of similarity between the sets of inter‐nodal links in hypertext databases. We publish the results of a study in which several different sets of links are inserted, by different people, between the paragraphs of each of a number of full‐text documents. Our results show little similarity between the sets of links identified by different people; this finding is comparable with those of studies of inter‐indexer consistency, where it has been found that there is generally only a low level of agreement between the sets of index terms assigned to a document by different indexers.

Details

Journal of Documentation, vol. 50 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 April 1993

Abstract

In the third paragraph, the author states that ‘Conventional text retrieval systems suffer from a number of problems. First, indexing terms and / or classificators have normally to be assigned manually, which is a very time‐consuming process and can lead to severe problems with regard to inter‐indexer consistency.’ To what types of systems does this refer? From a content perspective it would appear to be addressing the problems of a keyword system, also referred to as a document coding system. Yet, they are referred to as ‘conventional text retrieval systems.’ Manual indexing is not a component of today's text retrieval system, elementary or advanced.

Details

Online and CD-Rom Review, vol. 17 no. 4
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 1 January 1996

Peter Ingwersen

Abstract

The objective of the paper is to amalgamate theories of text retrieval from various research traditions into a cognitive theory for information retrieval interaction. Set in a cognitive framework, the paper outlines the concept of polyrepresentation applied to both the user's cognitive space and the information space of IR systems. The concept seeks to represent the current user's information need, problem state, and domain work task or interest in a structure of causality. Further, it implies that we should apply different methods of representation and a variety of IR techniques of different cognitive and functional origin simultaneously to each semantic full‐text entity in the information space. The cognitive differences imply that by applying cognitive overlaps of information objects, originating from different interpretations of such objects through time and by type, the degree of uncertainty inherent in IR is decreased. Polyrepresentation and the use of cognitive overlaps are associated with, but not identical to, data fusion in IR. By explicitly incorporating all the cognitive structures participating in the interactive communication processes during IR, the cognitive theory provides a comprehensive view of these processes. It encompasses the ad hoc theories of text retrieval and IR techniques hitherto developed in mainstream retrieval research. It has elements in common with van Rijsbergen and Lalmas' logical uncertainty theory and may be regarded as compatible with that conception of IR. Epistemologically speaking, the theory views IR interaction as processes of cognition, potentially occurring in all the information processing components of IR, that may be applied, in particular, to the user in a situational context. The theory draws upon basic empirical results from information seeking investigations in the operational online environment, and from mainstream IR research on partial matching techniques and relevance feedback. 
By viewing users, source systems, intermediary mechanisms and information in a global context, the cognitive perspective attempts a comprehensive understanding of essential IR phenomena and concepts, such as the nature of information needs, cognitive inconsistency and retrieval overlaps, logical uncertainty, the concept of ‘document’, relevance measures and experimental settings. An inescapable consequence of this approach is to rely more on sociological and psychological investigative methods when evaluating systems and to view relevance in IR as situational, relative, partial, differentiated and non‐linear. The lack of consistency among authors, indexers, evaluators or users is of an identical cognitive nature. It is unavoidable, and indeed favourable to IR. In particular, for full‐text retrieval, alternative semantic entities, including Salton et al.'s ‘passage retrieval’, are proposed to replace the traditional document record as the basic retrieval entity. These empirically observed phenomena of inconsistency and of semantic entities and values associated with data interpretation support strongly a cognitive approach to IR and the logical use of polyrepresentation, cognitive overlaps, and both data fusion and data diffusion.

Details

Journal of Documentation, vol. 52 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 January 1983

Kevin P. Jones

Abstract

The Aslib Informatics Group and its predecessor the Co‐ordinate Indexing Group have made several attempts to understand the indexing process. This has been sought through seminars and indexing projects. The seminars produced some data on an ad hoc basis and although most have been assembled they have not been reported previously. More recently a formal project, involving sixteen volunteer indexers, has been organized around five short New Scientist articles and the data from this exercise form the major component in the present study. An attempt has been made to correlate indexer performance with the original texts. There appears to be evidence to support the assertion that the selection of index entries is related to the structure of the original texts, especially the frequency of individual words.

Details

Journal of Documentation, vol. 39 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 February 2005

Birger Hjørland

Abstract

Purpose

The purpose of this paper is to examine the importance and influence of the epistemologies: “empiricism”, “rationalism” and “positivism” in library and information science (LIS).

Design/methodology/approach

First, the paper outlines the historical development of these epistemologies by discussing and identifying their basic characteristics and by introducing the criticism that has been raised against them. Second, it examines their importance for and influence in LIS.

Findings

The findings of this paper are that it is not a trivial matter to define those epistemologies and to characterise their influence. Many different interpretations exist and there is no consensus regarding the current influence of positivism in LIS. Arguments are put forward that empiricism and positivism are still dominant within LIS and specific examples of the influence of positivism in LIS are provided. A specific analysis is made of the empiricist view of information seeking and it is shown that empiricism may be regarded as a normative theory of information seeking and knowledge organisation.

Originality/value

The paper discusses basic theoretical issues that are important for the further development of LIS as a scholarly field.

Details

Journal of Documentation, vol. 61 no. 1
Type: Research Article
ISSN: 0022-0418

Book part
Publication date: 30 August 2014

Yunseon Choi

Abstract

Purpose

This chapter aims to discuss the issues associated with social indexing as a solution to the challenges of current information organization systems by investigating the quality and efficacy of social indexing.

Design/methodology/approach

The chapter focuses on a study that compared indexing similarity between two professional groups and also compared social tagging with professional indexing. The study employed the method of the modified vector-based Indexing Consistency Density (ICD) with three different similarity measures: cosine similarity, dot product similarity, and Euclidean distance metric.
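The three similarity measures named above can be sketched on term-weight vectors. This is only an illustration of the standard textbook formulas for cosine similarity, dot product, and Euclidean distance; it is not the modified ICD method itself, whose details the abstract does not give. The vocabulary, the vectors, and the function names are hypothetical.

```python
import math


def dot_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine_similarity(u, v):
    """Dot product normalised by both vector lengths; 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(dot_product(u, u))
    norm_v = math.sqrt(dot_product(v, v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: similarity with an empty description is scored as 0.0.
        return 0.0
    return dot_product(u, v) / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors (smaller means more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Hypothetical term counts over a shared vocabulary:
# ["python", "tutorial", "programming", "web"]
group_a = [3, 1, 0, 2]  # e.g. how often each term was assigned by one group
group_b = [1, 0, 2, 2]  # the same terms as assigned by a second group

print(dot_product(group_a, group_b))         # 3*1 + 1*0 + 0*2 + 2*2 = 7
print(cosine_similarity(group_a, group_b))   # 7 / (sqrt(14) * 3)
print(euclidean_distance(group_a, group_b))  # sqrt(4 + 1 + 4 + 0) = 3.0
```

Cosine similarity and the dot product grow with agreement, while Euclidean distance shrinks, so the three values cannot be read on a common scale without further normalisation.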

Findings

The comparison of social indexing with professional indexing demonstrates that social tags describe resources more accurately and reflect more current terminology than controlled vocabulary. The characteristics of social tagging discussed in this chapter give a clearer understanding of the extent to which social indexing can be used to replace and improve upon professional indexing.

Research limitations/implications

As investment in professionally developed web directories diminishes, it becomes even more critical to understand the characteristics of social tagging and to obtain benefit from it. In future research, the examination of subjective tags needs to be conducted. A survey or user study on tagging behavior also would help to extend understanding of social indexing practices.

Details

New Directions in Information Organization
Type: Book
ISBN: 978-1-78190-559-3

Article
Publication date: 1 August 2001

Raija Lehtokangas and Kalervo Järvelin

Abstract

This article investigates how consistent different newspapers are in their choice of words when writing about the same news events. News articles on the same news events were taken from three Finnish newspapers and compared in regard to their central concepts and words representing the concepts in the news texts. Consistency figures were calculated for each set of three articles (the total number of sets was sixty). Inconsistency in words and concepts was found between news articles from different newspapers. The mean value of consistency calculated on the basis of words was 65 per cent; this however depended on the article length. For short news wires consistency was 83 per cent while for long articles it was only 47 per cent. At the concept level, consistency was considerably higher, ranging from 92 per cent to 97 per cent between short and long articles. The articles also represented three categories of topic (event, process and opinion). Statistically significant differences in consistency were found in regard to length but not in regard to the categories of topic. We argue that the expression inconsistency is a clear sign of a retrieval problem and that query expansion based on semantic relationships can significantly improve retrieval performance on free‐text sources.

Details

Journal of Documentation, vol. 57 no. 4
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 19 April 2011

Li‐Chen Tsai, Sheue‐Ling Hwang and Kuo‐Hao Tang

Abstract

Purpose

Expert and novice readers tag documents with different descriptions; this study is intended to discover which readers would generate the most reliable and most representative sets of tags.

Design/methodology/approach

One group of experts and one group of novices were recruited. These two groups were asked to provide tags for document bookmarks in a Mozilla Firefox browser. In the experimental analysis we defined two measures – similarity and relevance – to describe the differences between the two groups.

Findings

Tags chosen by experts yielded better similarity and relevance values in all analyses. Tags chosen by the expert group had higher commonality in pairwise similarity analysis; moreover, the relevance analysis showed that tags chosen by experts reflected better understanding of the content.

Originality/value

Tagging behavior has become highly popular on the web, and its study has commercial merit. Tags from experts represent the structure behind the knowledge involved; expert representation may be vastly more helpful than novice representation for promoting understanding of content in an era characterized by an explosion of information.

Details

Online Information Review, vol. 35 no. 2
Type: Research Article
ISSN: 1468-4527
