Information Representation and Retrieval in the Digital Age

Gobinda Chowdhury (University of Strathclyde)

Online Information Review

ISSN: 1468-4527

Article publication date: 1 October 2004




Chowdhury, G. (2004), "Information Representation and Retrieval in the Digital Age", Online Information Review, Vol. 28 No. 5, pp. 377-378.



Emerald Group Publishing Limited

Copyright © 2004, Emerald Group Publishing Limited

Information retrieval has over the years been described by various synonyms, including information storage and retrieval, information organisation and retrieval, information processing and retrieval, and so on. Heting Chu, the author of this book, prefers the phrase “information representation and retrieval”. The book has 12 chapters, each followed by a list of references, and it ends with a subject index. There are four figures and 13 tables.

Chapter 1 begins with a brief historical account of the field of information retrieval. Although many prominent names appear there, the account is not exhaustive, and some important names, for example Ranganathan, are missing. An overview of the concept of information retrieval is then provided. Chapter 2 discusses the various approaches to information representation, including indexing, summarisation and categorisation. Chapter 3 discusses relatively recent concepts in information representation, such as metadata, along with related tools and techniques for representing textual and multimedia information. The discussions are simple, and therefore easy to understand, but at times are too brief and thus fail to highlight the complexity of the topics discussed. Chapter 4 discusses the issues of natural language vs controlled vocabularies in the context of indexing and representation of information.

While Chapters 2‐4 cover different aspects of document representation, Chapter 5 covers the retrieval aspects: retrieval techniques and query representation. It discusses various search techniques such as Boolean searching, truncation, proximity and field searching. While the discussions here are quite simple and easy to follow, they could be enhanced significantly with examples drawn from real‐life information retrieval systems. The second part of Chapter 5 discusses search strategy, describing the various stages of formulating a search. Again, appropriate examples would be useful. Some topics (e.g. automatic query formulation and modification) are discussed too briefly. Chapter 6 discusses different techniques used for gaining access to information, for example searching, browsing and a combination of both. Again, one may feel the lack of real‐life examples illustrating each approach.

Chapter 7 discusses classical information retrieval models such as the Boolean, vector‐space and probabilistic retrieval models. The models are discussed well in simple terms, dwelling on the strengths and weaknesses of each model. Surprisingly, no mathematical formulae are used in this chapter, which, though it makes the chapter easy for non‐technical readers, hides the complexities of the IR models. Chapter 8 discusses the features, as well as comparisons, of four different types of IR systems: online IR systems, CD‐ROMs, OPACs, and Internet IR. This chapter, especially the sections related to Internet IR, is well written with good examples. Chapter 9 covers topics related to multilingual and multimedia IR. The author traces the history of multilingual IR from TREC‐4 onwards, though it would be useful for readers to know that linguistic approaches to IR have a much longer history, dating back almost four decades. The sections on multimedia and sound IR are written well in simple terms. In Chapter 10 the author covers the user aspects of IR and discusses the cognitive models, although only Ingwersen is discussed here. The later part of this chapter discusses user interface issues, although the author here has preferred the phrase “user‐system interactions” over “user interfaces”.

Chapter 11 discusses various aspects of IR evaluation, including the evaluation parameters and criteria for evaluation of various types of IR systems such as online databases, OPACs and Internet IR systems. It then discusses two series of retrieval evaluation experiments: the Cranfield tests, which were the first formal IR evaluation experiments, and the most recent set of experiments, reported under the TREC series. This is a well‐written chapter and is followed by a long list of good references. The title of Chapter 12 is a bit ambitious, or misleading, in that it gives the impression that the chapter covers various AI applications in IR, though in reality it covers only some NLP and knowledge‐based approaches to IR such as intelligent agents. This chapter is too short to cover such a huge area of applications and research.

Overall, the book covers a broad spectrum of IR, and the discussions are presented in very simple and non‐technical terms, though by doing so, the author has hidden the real complexities of some of the topics concerned. Perhaps the book could be made more attractive by using some figures illustrating the discussions, but then the overall length of a textbook is always a constraint for every author. As the author states in the preface, this book is designed for newcomers to the field, and I feel that it will be a good starting point for information science students and information professionals who are new to the field.
