Search results

1 – 10 of 20
Article
Publication date: 1 January 1993

Ankie Visschedijk and Forbes Gibb

Abstract

This article reviews some of the more unconventional text retrieval systems, emphasising those which have been commercialised. These sophisticated systems improve on conventional retrieval by using either innovative software or hardware to increase retrieval speed or functionality, precision or recall. The software systems reviewed are: AIDA, CLARIT, Metamorph, SIMPR, STATUS/IQ, TCS, TINA and TOPIC. The hardware systems reviewed are: CAFS‐ISP, the Connection Machine, GESCAN, HSTS, MPP, TEXTRACT, TRW‐FDF and URSA.

Details

Online and CD-Rom Review, vol. 17 no. 1
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 1 June 1991

Alan F. Smeaton

Abstract

Current approaches to text retrieval based on indexing by words or index terms and on retrieving by specifying a Boolean combination of keywords are well known, as are their limitations. Statistical approaches to retrieval, as exemplified in commercial products like STATUS/IQ and Personal Librarian, are slightly better but still have their own weaknesses. Approaches to the indexing and retrieval of text based on techniques of automatic natural language processing (NLP) may soon start to realise their undoubted potential in terms of improving the quality and effectiveness of information retrieval. In this article we will explore what that potential is. We will divide information retrieval functionality into conceptual and traditional information retrieval and we will examine some of the current attempts at using various NLP techniques in both the indexing and retrieval operations.

Details

Online Review, vol. 15 no. 6
Type: Research Article
ISSN: 0309-314X

Article
Publication date: 1 April 1993

Abstract

In the third paragraph, the author states that ‘Conventional text retrieval systems suffer from a number of problems. First, indexing terms and / or classificators have normally to be assigned manually, which is a very time‐consuming process and can lead to severe problems with regard to inter‐indexer consistency.’ To what types of systems does this refer? From a content perspective it would appear to be addressing the problems of a keyword system, also referred to as a document coding system. Yet, they are referred to as ‘conventional text retrieval systems.’ Manual indexing is not a component of today's text retrieval system, elementary or advanced.

Details

Online and CD-Rom Review, vol. 17 no. 4
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 1 March 1996

Abstract

Text retrieval is not a new technology — it has been familiar to library and information professionals for many years. For example, among the vendors we will look at here, PLS dates from 1983 and Fulcrum from 1984 — children compared to, say, IBM and Microsoft, but venerable in the general terms of the IT industry. Recently, however, several companies offering text retrieval services have begun to raise their profile — so much so that Delphi Consulting reports the text retrieval market has recently broken the half billion dollar barrier. Many of these companies are gaining the financial clout to add features to their products, or diversify, or head off in a completely new direction. Here we give a round‐up of some of them.

Details

Online and CD-Rom Review, vol. 20 no. 3
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 1 April 1993

Clifford Harkness

Abstract

This paper will discuss the development of an information management system providing access to radio archive material. A formal television and radio archive agreement between the BBC in Northern Ireland and the Ulster Folk and Transport Museum in 1989 led to the decision to use STATUS for the management of the radio archive. An example of the record structure is given and some details of ways of searching.

Details

Program, vol. 27 no. 4
Type: Research Article
ISSN: 0033-0337

Article
Publication date: 1 May 1996

Abstract

UK body for Internet registration. A new national body in the UK responsible for registering Internet names has held its first meeting. Nominet UK is a not‐for‐profit company set up with the support of all sections of the UK Internet industry and which derives its authority from the Internet Assigned Numbers Authority. Until its creation Internet name registration was done on a voluntary basis by the UK Education & Research Networking Association but the Internet's increasing popularity, with 200 new registrations per week, put a strain on this arrangement.

Details

Online and CD-Rom Review, vol. 20 no. 5
Type: Research Article
ISSN: 1353-2642

Article
Publication date: 1 February 1993

Brian Vickery and Alina Vickery

Abstract

There is a huge amount of information and data stored in publicly available online databases that consist of large text files accessed by Boolean search techniques. It is widely held that less use is made of these databases than could or should be the case, and that one reason for this is that potential users find it difficult to identify which databases to search, to use the various command languages of the hosts and to construct the Boolean search statements required. This reasoning has stimulated a considerable amount of exploration and development work on the construction of search interfaces, to aid the inexperienced user to gain effective access to these databases. The aim of our paper is to review aspects of the design of such interfaces: to indicate the requirements that must be met if maximum aid is to be offered to the inexperienced searcher; to spell out the knowledge that must be incorporated in an interface if such aid is to be given; to describe some of the solutions that have been implemented in experimental and operational interfaces; and to discuss some of the problems encountered. The paper closes with an extensive bibliography of references relevant to online search aids, going well beyond the items explicitly mentioned in the text. An index to software appears after the bibliography at the end of the paper.

Details

Journal of Documentation, vol. 49 no. 2
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 January 1992

Richard L. Jones

Abstract

The AIDA project is a program of research being carried out by Computer Power in Canberra, Australia, in collaboration with the Australian Parliament. Its primary objective is to develop practical methods for carrying out document content analysis with minimal human intervention. Following a very successful independent assessment of the techniques, the first commercial‐strength tool has now been developed. It links the different AIDA analyses (point form summary, keywords, and so on) with the original document to form a “complete” hyperdocument. The different techniques employed by AIDA to achieve its results are described.

Details

Library Hi Tech, vol. 10 no. 1/2
Type: Research Article
ISSN: 0737-8831

Article
Publication date: 1 January 1991

E. Michael Keen

Abstract

Term position information, as provided in some Boolean systems in the form of field restriction and term proximity, is reviewed and its value assessed. Non‐Boolean retrieval in the form of the ranked output experiment has not so far used term position information but has concentrated on schemes of term weighting. The use of term proximity devices is proposed here by analogy with Boolean techniques and seven algorithms are devised to incorporate the ideas of sentence matching, proximate terms, term order specification and term distance computations. It is hypothesised that term position will act as a precision device. A new search experiment is then described in which a test collection is processed into sentences and then output ranking using term position is obtained. Results are given for five algorithms compared against quorum searching as the benchmark. The best result increased the precision ratio by 18% and used proximate matching term pairs in sentences plus a distance component.
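
The contrast this abstract draws between quorum searching and ranking by term position can be illustrated with a minimal sketch. It is not one of the seven algorithms from the paper, whose details the abstract does not give. It assumes that a quorum score is the count of distinct query terms found in a document, and that a proximity score sums, for each sentence and each pair of distinct query terms co-occurring in it, the inverse of the smallest word distance between the pair. All function names (`quorum_score`, `proximity_score`, `rank`) and the naive sentence splitter are hypothetical.

```python
import re
from itertools import combinations


def split_sentences(text):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercase alphanumeric tokens only.
    return re.findall(r"[a-z0-9]+", text.lower())


def quorum_score(query_terms, text):
    # Coordination-level match: how many distinct query terms occur in the document.
    tokens = set(tokenize(text))
    return sum(1 for term in set(query_terms) if term in tokens)


def proximity_score(query_terms, text):
    # For each sentence, reward every pair of distinct query terms that
    # co-occur in it, weighted by the inverse of their smallest word distance.
    terms = sorted(set(query_terms))
    score = 0.0
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        positions = {t: [i for i, tok in enumerate(tokens) if tok == t] for t in terms}
        for a, b in combinations(terms, 2):
            if positions[a] and positions[b]:
                gap = min(abs(i - j) for i in positions[a] for j in positions[b])
                score += 1.0 / gap  # gap >= 1 because a and b are different terms
    return score


def rank(query, docs):
    # Order by quorum score first, then use proximity to separate ties.
    terms = tokenize(query)
    scored = [
        (quorum_score(terms, body), proximity_score(terms, body), name)
        for name, body in docs.items()
    ]
    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [name for _, _, name in scored]
```

In this sketch, two documents that both contain every query term tie on the quorum score, and the one in which the terms sit close together inside a single sentence ranks first. That is the sense in which term position might act as a precision device.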

Details

Journal of Documentation, vol. 47 no. 1
Type: Research Article
ISSN: 0022-0418

Article
Publication date: 1 January 1996

Peter Ingwersen

Abstract

The objective of the paper is to amalgamate theories of text retrieval from various research traditions into a cognitive theory for information retrieval interaction. Set in a cognitive framework, the paper outlines the concept of polyrepresentation applied to both the user's cognitive space and the information space of IR systems. The concept seeks to represent the current user's information need, problem state, and domain work task or interest in a structure of causality. Further, it implies that we should apply different methods of representation and a variety of IR techniques of different cognitive and functional origin simultaneously to each semantic full‐text entity in the information space. The cognitive differences imply that by applying cognitive overlaps of information objects, originating from different interpretations of such objects through time and by type, the degree of uncertainty inherent in IR is decreased. Polyrepresentation and the use of cognitive overlaps are associated with, but not identical to, data fusion in IR. By explicitly incorporating all the cognitive structures participating in the interactive communication processes during IR, the cognitive theory provides a comprehensive view of these processes. It encompasses the ad hoc theories of text retrieval and IR techniques hitherto developed in mainstream retrieval research. It has elements in common with van Rijsbergen and Lalmas' logical uncertainty theory and may be regarded as compatible with that conception of IR. Epistemologically speaking, the theory views IR interaction as processes of cognition, potentially occurring in all the information processing components of IR, that may be applied, in particular, to the user in a situational context. The theory draws upon basic empirical results from information seeking investigations in the operational online environment, and from mainstream IR research on partial matching techniques and relevance feedback. 
By viewing users, source systems, intermediary mechanisms and information in a global context, the cognitive perspective attempts a comprehensive understanding of essential IR phenomena and concepts, such as the nature of information needs, cognitive inconsistency and retrieval overlaps, logical uncertainty, the concept of ‘document’, relevance measures and experimental settings. An inescapable consequence of this approach is to rely more on sociological and psychological investigative methods when evaluating systems and to view relevance in IR as situational, relative, partial, differentiated and non‐linear. The lack of consistency among authors, indexers, evaluators or users is of an identical cognitive nature. It is unavoidable, and indeed favourable to IR. In particular, for full‐text retrieval, alternative semantic entities, including Salton et al.'s ‘passage retrieval’, are proposed to replace the traditional document record as the basic retrieval entity. These empirically observed phenomena of inconsistency and of semantic entities and values associated with data interpretation support strongly a cognitive approach to IR and the logical use of polyrepresentation, cognitive overlaps, and both data fusion and data diffusion.

Details

Journal of Documentation, vol. 52 no. 1
Type: Research Article
ISSN: 0022-0418
