Facilitating Access to the Web of Data: A Guide for Librarians

Robin Yeates (E‐library Systems Officer, London Borough of Barnet Libraries, London, UK)

Program: electronic library and information systems

ISSN: 0033-0337

Article publication date: 20 April 2012

Citation

Yeates, R. (2012), "Facilitating Access to the Web of Data: A Guide for Librarians", Program: electronic library and information systems, Vol. 46 No. 2, pp. 283-285. https://doi.org/10.1108/00330331211221918

Publisher: Emerald Group Publishing Limited

Copyright © 2012, Emerald Group Publishing Limited


Helping users turn questions into answers is one thing that librarians have done well. Until recently they have relied mainly on printed works, extended mostly by looking things up in web‐based representations of printed sources. David Stuart, a researcher with a PhD in information science, thinks that librarians now need to move on to include the use of “data” in all forms, and especially data that may be widely scattered and can be accessed and manipulated in powerful ways to answer new questions that were not considered by the publishers of the information.

The introduction of this brief but dense book explains the title and the avoidance there of the terms semantic web, linked data or open data, which are all relevant concepts. The data being referred to in the book is “structured data in a machine‐readable format that is being made publicly available online by individuals and organizations from every sector of society”. Library and information professionals have a key role in realizing the value in this “web of data”. The author explicitly addresses the point that this is not intended as a technical guide on how to publish or use such a web of data; instead it is a guide for these professionals to the issues that surround it. This is a more than usually thoughtful introduction, considering details in enough depth to be of interest to teachers, learners, managers and practitioners. Headings develop a conceptual framework for understanding the book, progressing from “The changing role of the librarian in the world of the web”, and “The librarian and Web 2.0” to “The librarian and the web of data”.

The work is not full of bullet points, but rather is a well‐argued text with suitable examples and references, mainly to web‐based resources. The seven chapters are structured to address Dominique Foray's (Foray, 2004) four conditions that contribute to an effective knowledge economy: “the size of the community, the cost of sharing the knowledge, the clarity of what gets shared and the cultural norms of the community”.

The growth in open science and open government and other sectors is examined in the first chapter, suggesting a role for the library and information professional. This is a general discussion, before other chapters move to the more specific formats, tools and approaches. The semantic web concept, where data can be understood by machines, is clarified and the author explains that this is not the same as artificial intelligence, but requires “human intervention and guidance”.

The chapter on data silos is particularly insightful. Many librarians are aware of the concept of data residing in places where it cannot easily be accessed by potential users in conjunction with data elsewhere. The author here brings the notion and impact right up to date by highlighting Google Docs and similar cloud services as potentially being data silos. This may challenge many people's idea that placing a text, spreadsheet or presentation on the cloud necessarily means that the included data is fully accessible to people and computers and can be reused easily. An examination in a little detail of application programming interfaces (APIs) shows this not to be true, because such approaches do not necessarily use widely accepted standards. There is a risk of libraries and individuals publishing openly, but sub‐optimally, because they do not consider the wider environment when doing so. The author discusses advantages and disadvantages of such an approach.

Chapter 4 looks at the original vision of the World Wide Web Consortium for the semantic web, RDF and ontologies that can relate data to produce information. This is a very clear and useful history of Berners‐Lee's notion of a five‐star open data scheme using RDF triples to express meaning in machine‐readable ways. The progress, or lack of it, over the last decade is considered. This is the core of the innovations introduced in the book. So many specific ideas have arisen during the period that readers from outside the field could easily be put off attempting to understand them. Here, things such as SPARQL, OWL and URIs are presented in an exemplary way, with a clear logical progression but always with comments on the wider picture and how it relates to library professionals. The technical material concludes in the following chapter on embedded semantics, which provides an entry‐level, hands‐on approach to engagement with linked data that could be inspiring for many information professionals of a slightly more technical bent. However, the emphasis is always on the fact that we cannot understand every technical detail, but we must understand the implications for the LIS profession.
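For readers who have not met the idea before, the notion of a triple can be sketched in a few lines of Python. This is an illustration only and is not taken from the book: the example.org URIs and the `match` helper are hypothetical, while the Dublin Core and FOAF property URIs are real vocabulary terms. Each statement is a (subject, predicate, object) triple, and a query is a pattern in which any position may be left open, which is roughly what a single SPARQL triple pattern does.

```python
from typing import List, Optional, Tuple

# A triple is (subject, predicate, object); plain strings stand in for URIs and literals.
Triple = Tuple[str, str, str]

# Real vocabulary terms (Dublin Core and FOAF).
DC_TITLE = "http://purl.org/dc/terms/title"
DC_CREATOR = "http://purl.org/dc/terms/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical identifiers, invented for this sketch.
BOOK = "http://example.org/book/web-of-data"
AUTHOR = "http://example.org/person/david-stuart"

TRIPLES: List[Triple] = [
    (BOOK, DC_TITLE, "Facilitating Access to the Web of Data"),
    (BOOK, DC_CREATOR, AUTHOR),
    (AUTHOR, FOAF_NAME, "David Stuart"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return [
        t
        for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# "Who created the book?" expressed as a pattern with the object left open.
creators = [o for (_, _, o) in match(TRIPLES, subject=BOOK, predicate=DC_CREATOR)]
```

Because the author is identified by a URI rather than a name string, a second pattern on that URI can retrieve the name, and it is this chaining of statements across sources that lets scattered data be related.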

It is natural, then, that the following chapter looks at “The library and the web of data”, a phrase which by now means something specific. We have a well‐founded reworking of Ranganathan's Laws, enhanced by replacing books with data.

Finally the effects of embracing or ignoring the web of data are discussed and the book concludes with nine practical steps to becoming a data librarian.

There are a couple of quibbles about the book. The quality of the printed screenshots is rather poor, and some of the illustrations, such as the Linked Open Data cloud (figure 4.2), are illegible, although they refer to URLs for further exploration: perhaps it would have been better to leave the screenshots out altogether from printed versions of the book. Inevitably, some of the comments on the tools mentioned may already be out of date, although the principles remain true. Semantic Radar software is not available for the current version of Firefox (9.0.1), for example, and it is not clear how to proceed with more up‐to‐date software.

None of this detracts significantly from the book. The issue of professional boundaries is not shirked, nor is the presentation of examples of specific standards that may or may not date quickly. If readers wish to compare the points made in this review with those made by other reviewers about the same book, they will no longer despair of an instant solution. If staff in a public library wonder how they can help their users make use of public responses to Freedom of Information requests to improve public services, they need look no further than this volume for inspiration (and the mentioned website: www.whatdotheyknow.com). Historical precedents suggest that the amount of technical innovation within many public libraries may be declining, while outsourcing and purchasing of “data silos” grows. LIS professionals should use this eminently readable and inspirational book to engage with the web of data and prepare better for cultural changes within and outside libraries.

References

Foray, D. (2004), The Economics of Knowledge, MIT Press, Cambridge, MA.
