Emerald Group Publishing Limited
Copyright © 2000, MCB UP Limited
LITA Pre-Conference at ALA: No Digital Library Bible in the Big Picture
The LITA pre-conference at the 2000 ALA Annual Conference, "Digital Libraries and Digital Imaging: The Big Picture," was held in Chicago on 7 July 2000, and drew a diverse audience of nearly 100 participants from university and research libraries, public libraries, corporate libraries, museums, cultural heritage initiatives, and vendors. The six presenters all have extensive experience working with digital images and digital libraries and their individual presentations were all excellent and informative. However, the conference as a whole suffered from a lack of focus and a lack of clearly defined goals.
According to the conference Web site, the conference was to be "an opportunity for thoughtful discussion on the meaning and cultural impact of digital libraries." This broad and little addressed topic would be a fascinating subject of discussion, but unfortunately the discussion never got there. The conference instead focused on a number of specific issues related to digital libraries and digital images. Some of the topics included user satisfaction, staffing and administrative concerns, delivery of images from diverse collections, metadata standards, and image longevity.
These all are important issues for creators of digital libraries, but the coverage in this conference was too broad and haphazard, with not enough systematic attention to each issue. In the end, one was left somewhat intimidated at the vast number of factors to be aware of when creating a digital library, but with little concrete information on how to go about beginning to create such a library. And it was this practical guidance that much of the audience seemed to want.
Bernard Reilly, Director of Research and Access at the Chicago Historical Society, opened the program with a discussion of the professional and administrative challenges of using digital technology in a museum and special collections setting. Factors such as the cost of digital technology, intellectual property issues, and the unsettled nature of digital media all affect the types of skills that are demanded of staff as organizations shift more of their activities to the digital environment. Reilly identified several issues of importance for managers of such organizations.
First is the convergence of library activities and resources, which is transforming the work of administering library special collections. This convergence can be seen in a number of areas. There is the convergence of text, image, and sound media, all being transformed to digital files and managed together. Then there is the convergence of archiving and interpreting. Transforming material to digital form is really the creation and presentation of a new resource, which requires libraries to think even more diligently about the meaning and context of the material they are presenting. Another convergence is that of commercial and non-profit activities, which in turn leads to a convergence of a library's overall activities: marketing, finance, outreach, technical services, public services, and others. In this new environment teamwork is essential. Librarianship, said Reilly, is moving from a staid to a high-risk profession, one that must embrace new technologies or fall behind.
A second issue discussed by Reilly was the costs of working in the digital environment. He noted, as did others throughout the conference, that the cost of creating and maintaining digital resources is very high, especially when one considers that it does not replace but adds to the cost of running the library's traditional activities and functions. These costs include hardware and software, ongoing maintenance of systems, and staff development and retention. In the library world in particular, salaries are too low for technically skilled people, making staff loyalty more difficult to maintain.
With these convergences and costs comes a need for new skill sets. Reilly noted several new skills that are now required of special collections professionals. These include familiarity with digital and other new media technology, an understanding of intellectual property issues and other legal considerations, and communication skills. This last is particularly important, as librarians must be able to negotiate with donors and with vendors, and must go beyond merely presenting material to actually engaging the broader audience of users in a variety of situations.
An Image Database in Action
John Weise, Coordinator of Image Services at the University of Michigan's Digital Library Production Service (DLPS), took the stage next with what was probably the most technical and immediately practical presentation of the conference. Weise demonstrated the DLPS Image Services "federating model" for organizing and providing access to images from disparate sources. Images are located in many places on the University of Michigan campus: in libraries, museums, archives, special projects, and other collections. The repositories or departments where these images are held may have limited resources to manage the images, and even fewer resources to provide online access to them. The role of DLPS Image Services is to leverage resources to facilitate the task of getting these images online, and to "federate the content for centralized discovery," that is, to gather the images together so that they can be searched in one place.
The DLPS system currently gives users three important tools: the ability to search a single collection of images, the ability to search across multiple collections, and an image viewing functionality that allows users to zoom in and out of images and determine the size of the images. In the near future DLPS plans to expand the system by adding a function to allow users to create and use "personal collections", adding a results sorting capability, and enhancing the authentication/authorization mechanisms.
After a brief demonstration of how the system works for the user, Weise talked about some of the administrative and technical issues, which seemed of great interest to the audience. Using a single transformation process for all collections, data about a collection are transformed from a database format to SGML. One SGML DTD is used for all the collections. The DTD is written in such a way that information and description about an object can be preserved in as much detail as the original institution desires, while at the same time allowing for all information to be mapped to a "Dublin Core-like" structure to enable cross-collection searching.
The audience was very interested in the Image Services system and, while time allowed, asked numerous questions about exactly how the transformation process worked.
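The idea of keeping full local detail while mapping to a shared structure can be sketched in a few lines of Python. This is an illustration only and not the DLPS implementation, which used SGML and a DTD: the collection names, the local field names, and the `to_common` and `search_all` helpers are all hypothetical.

```python
# Hypothetical per-collection maps from local field names to a shared,
# Dublin Core-like vocabulary. None of these names come from DLPS.
FIELD_MAPS = {
    "museum": {"object_name": "title", "maker": "creator", "made": "date"},
    "archive": {"caption": "title", "photographer": "creator", "year": "date"},
}


def to_common(collection, record):
    """Wrap a local record and add a Dublin Core-like view of it.

    The original fields are kept untouched under "local", so no detail
    supplied by the owning institution is lost.
    """
    mapping = FIELD_MAPS[collection]
    common = {mapping[k]: v for k, v in record.items() if k in mapping}
    return {"collection": collection, "local": dict(record), "common": common}


def search_all(records, field, term):
    """Case-insensitive substring search on one common field, across collections."""
    term = term.lower()
    return [r for r in records if term in r["common"].get(field, "").lower()]


records = [
    to_common("museum", {"object_name": "Bronze Lamp", "maker": "Unknown", "gallery": "B2"}),
    to_common("archive", {"caption": "Street lamp at dusk", "photographer": "J. Doe", "year": "1931"}),
]
hits = search_all(records, "title", "lamp")
print([h["collection"] for h in hits])
```

A single search on the common "title" field reaches both collections, even though one calls the field "object_name" and the other "caption". Fields with no common equivalent, such as the hypothetical "gallery", remain available in the local record.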
The User's Perspective
Christie Stephenson, Weise's colleague and the Assistant Head of the University of Michigan DLPS, went next with a presentation on "the user's perspective." Her presentation focused on three questions that a user might ask: Can I find what I need? Can I trust what I find? Can I use what I find?
Can I Find What I Need?
Many Web search engines now offer the capability to search for images, but still do not provide satisfactory results. One problem, which Beth Sandore discussed later in her presentation, is that the metadata used to describe an image, and used by search engines to find an image, do not always sufficiently correspond to the way we think about an image. We may think about an image in terms of tone, color, emotional impact, or numerous other criteria that are not described in the metadata. The role of the eye in determining the relevance of an image is very important, and image search engines should err on the side of recall (giving more images to the user) rather than precision (trying to return precisely what the user wants), and then allow the user to quickly and easily scan through images to determine what is relevant.
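The trade-off between recall and precision can be made concrete with a small calculation. The following is a minimal sketch using the standard definitions of the two measures; the image identifiers and result sets are invented for illustration.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for one search.

    precision = share of returned images that are relevant
    recall    = share of relevant images that were returned
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: four images in the collection are actually relevant.
relevant = {"img1", "img2", "img3", "img4"}

# A narrow search returns only sure matches; a broad one returns more images.
narrow = ["img1", "img2"]
broad = ["img1", "img2", "img3", "img4", "img5", "img6", "img7", "img8"]

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```

The broad search shows the user four irrelevant images, but it misses nothing; the narrow search is clean but hides half of the relevant images. Because irrelevant thumbnails can be dismissed at a glance, the broad result is arguably the better one for image searching.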
Another problem with searching for images on the Web is that the images may not even be available to search engines. Most image metadata are stored in databases that are not available to Webcrawlers or search engines. One remedy for this is "metadata harvesting," which involves making some metadata in internal databases available at a level more accessible to search engines. However, there is a tension here between increased access and decreased context. This "flattening" of data would increase accessibility; but by allowing users to jump directly to an image located deep within a database, it could also break the link between the data and the larger collection. The "decontextualization" of data affects the question of trust and authenticity, which was Stephenson's next topic.
Can I Trust What I Find?
Users must be able to be sure that the information they are using is accurate, authoritative, and authentic. The specific requirements will vary according to the user and what the material will be used for, but at the heart of this issue lies the institution's commitment to the persistence of the material, technical matters that affect quality and longevity of the material, and the preservation of the intellectual context of the original material.
Can I Use What I Find?
What users do with the images is another question that has not been examined thoroughly. Stephenson pointed out that users will increasingly expect more sophisticated desktop tools to manage (access, find, use, share) remote resources and objects, and to help them make effective use of collections. Image providers are currently experimenting with tools and functionality such as drag-and-drop, annotation, personal collection management, "shopping cart," and "light tray." Technical and licensing issues can also have an effect on the usability of images. Images that, due to licensing restrictions, are available only in thumbnail size may be useless for some purposes.
Beth Sandore, Coordinator of Imaging Projects at the University of Illinois at Urbana-Champaign, also addressed issues of searching for and using images. Like Stephenson, she pointed out that we think about visual images in a less structured way than text, and she described some new developments in image manipulation and searching tools. Viewing tools such as MrSID and LookSee provide zoom and pan functions, and allow the user to view an image at different resolutions. A number of new search tools allow users to search based on characteristics of the image itself, rather than searching through a text description of an image. QBIC allows users to enter data about how much of each color they want, and where the colors should be located within the image. QBIC will then return images that fit those parameters. BlobWorld finds images that match shapes and patterns chosen by the user. Both BlobWorld and QBIC seem to have some success at finding images that are similar to other images, at least in terms of color pattern or shape, but these tools are still at an early stage of development. It is also not clear if this type of matching is the most useful way for users to search for images. In many cases it is probably not.
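Searching by color content can be illustrated with a toy example. The sketch below is not QBIC's actual algorithm: it assumes each image has been reduced to a non-empty list of named pixel colors, it compares only the proportion of each color, and it ignores where the colors sit within the image. The image names, colors, and helper functions are all invented.

```python
from collections import Counter


def color_fractions(pixels):
    """Return the share of the image occupied by each color."""
    counts = Counter(pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}


def distance(query, fractions):
    """Sum of absolute differences between requested and actual color shares."""
    colors = set(query) | set(fractions)
    return sum(abs(query.get(c, 0.0) - fractions.get(c, 0.0)) for c in colors)


def rank(query, images):
    """Return image names ordered from closest to furthest from the query."""
    scored = [(distance(query, color_fractions(px)), name) for name, px in images.items()]
    return [name for _, name in sorted(scored)]


# Invented images, each reduced to ten named "pixels".
images = {
    "sunset": ["red"] * 6 + ["orange"] * 3 + ["blue"] * 1,
    "ocean": ["blue"] * 8 + ["white"] * 2,
    "forest": ["green"] * 7 + ["blue"] * 3,
}

# The user asks for an image that is mostly blue with some white.
query = {"blue": 0.8, "white": 0.2}
ranked = rank(query, images)
print(ranked)  # ['ocean', 'forest', 'sunset']
```

The ranking is purely a matter of color proportions, which shows both the appeal and the limitation of the approach: the query finds the ocean picture, but it would rank a blue-and-white teapot just as highly.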
Sandore then spoke about users and uses of digital images. Bringing together data from a number of user studies (a Museum Educational Site Licensing study of instructors and students in particular), Sandore summarized some of the findings. Many of the findings will come as no surprise, although they still may not always be fully considered when designing an image database. Users are impatient with long download waits, they want quick navigation within a resource, and they want both image and text together. Users also said they were content with standard 72dpi resolution images, although Sandore suggested user expectations in this regard would increase as the technical capabilities of monitors and display units increased.
Some basic suggestions for designing an image database can be made from these findings: have a simple interface which minimizes scrolling and jumping, flatten database hierarchies to allow quicker navigation within the database, have previews and overview screens to aid navigation, and provide for a variety of search functions based on the domain and type of material.
Sandore concluded by stressing the importance of conducting large scale user studies (as opposed to using focus groups which provide less comprehensive results) to find out how resources are being used, how they are affecting research, and what tools will be most effective for users.
The Commercial Perspective
The next presentation was from Cindy Miller of Endeavor Information Systems. Miller demonstrated Endeavor's latest product, ENCompass, a system intended to allow libraries to easily build, manage and search across disparate digital collections, and which is able to handle all types of content in an integrated way, rather than having many types of collections (images, text, finding aids) that must all be searched separately.
The strength of Miller's presentation was her clear explanation of how ENCompass conceives of and manages digital objects. ENCompass treats each digital item, regardless of format (such as a single image or a section of text), as a digital object which can be stored in any location. Users of the system (in this case curators, librarians, or others building a digital collection for an institution) can combine objects of their choice into containers. Each container is then also treated as a separate object, which can then be linked to or included in still larger containers. Objects can be in many containers at once, allowing many collections to be built for different purposes, but all based on the same master digital file.
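The object and container model described above lends itself to a short sketch. The following Python is an illustration of the general idea only and says nothing about how ENCompass is actually built; the class names, identifiers, and file locations are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A single digital item of any format, stored in any location."""
    identifier: str
    location: str  # hypothetical path to the one master file


@dataclass
class Container:
    """A named grouping that is itself an object and can be nested."""
    identifier: str
    members: list = field(default_factory=list)

    def add(self, item):
        """Add a DigitalObject or another Container by reference, not by copy."""
        self.members.append(item)

    def objects(self):
        """Yield every DigitalObject reachable from here, descending into containers."""
        for member in self.members:
            if isinstance(member, Container):
                yield from member.objects()
            else:
                yield member


photo = DigitalObject("photo-1", "server-a/photo-1.tif")
letter = DigitalObject("letter-1", "server-b/letter-1.sgm")

exhibit = Container("exhibit")
exhibit.add(photo)

teaching = Container("teaching")
teaching.add(photo)
teaching.add(letter)

portal = Container("portal")  # a larger container built from the other two
portal.add(exhibit)
portal.add(teaching)

print([obj.identifier for obj in portal.objects()])
```

The photograph appears in both the exhibit and the teaching collection, yet there is only one `DigitalObject` behind it, so a change to the master file's location is made once and is seen everywhere the object is used.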
Throughout her presentation, Miller stressed the importance of standards such as Dublin Core, without which systems like ENCompass cannot work. While much work has been done in developing standards for descriptive metadata (describing the original object), there remains a great need for developing standards for structural and technical metadata relating to the digital files.
At the time of her presentation, ENCompass was still in development and not fully functional, so Miller's demonstration of the product was less satisfying than her explanation of it, and it was hard to get a sense of what the system would actually be like to use. Endeavor is working with Cornell University and over the next year plans to roll out ENCompass and develop additional functionality for the system.
Digital Libraries 101
Champion imager Howard Besser of UCLA concluded the day with a wide-ranging two-hour presentation titled "Creating Working Digital Libraries." Although provocative, entertaining, and knowledgeable, Besser tried to cover too much ground too quickly, and as a listener it was difficult to get a solid grasp of any one of the topics he discussed. It was like taking a semester-long course in digital libraries condensed into two hours. Besser discussed topics such as the developmental stages of building a digital library (first experiment with methods, then build real operational systems, then build interoperable operational systems), the need for five types of metadata standards to allow interoperability (descriptive, administrative, structural, discovery, and terms and conditions), file formats and longevity issues, best practices for managing digital projects (including scanning best practices, and workflow and management issues), persistent IDs and URLs, and changes in intellectual property laws, among other topics.
He concluded with "some wild musings" to the effect that there will be a movement towards packages and containers of metadata and away from MARC. Packages such as Warwick Framework and the Dublin Core with qualifiers are modular, community-based metadata structures, which "allow one community to express important nuances and qualifications, while still making the basic importance available to communities with simple needs." For example, the library community can record such information as title, alternate title, and transliterated title, yet all these will be found under a simple Web search under "title."
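The way qualified metadata can still serve simple searches can be shown with a small sketch. It assumes a hypothetical dotted naming scheme in which a qualifier follows the base element ("title.alternative"); the record, the scheme, and the `simplify` and `search` helpers are illustrative and are not drawn from any Dublin Core specification or from Besser's talk.

```python
def simplify(record):
    """Collapse qualified elements onto their base element.

    "title.alternative" and "title.transliterated" both become plain "title",
    so a simple search sees every variant without knowing the qualifiers.
    """
    simple = {}
    for name, value in record.items():
        base = name.split(".", 1)[0]
        simple.setdefault(base, []).append(value)
    return simple


def search(records, element, term):
    """Return records whose simplified element contains the term (case-insensitive)."""
    term = term.lower()
    return [
        r for r in records
        if any(term in v.lower() for v in simplify(r).get(element, []))
    ]


# A record as a library might describe it, with qualified title elements.
record = {
    "title": "Война и мир",
    "title.alternative": "War and Peace",
    "title.transliterated": "Voina i mir",
    "creator": "Tolstoy, Leo",
}

print(simplify(record)["title"])
print(len(search([record], "title", "peace")))
```

The library keeps its three distinct kinds of title, while a user who knows only the word "title" still finds the record through any of them.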
A Digital Library Bible?
The conference as a whole, and Besser's talk in itself, did indeed address "the big picture." Yet it was still unclear what the audience was supposed to take away from the conference. During the discussion and open question time after lunch, it became clear that many in the audience wanted concrete, practical information and guidelines for creating a digital collection. One audience member actually asked if there was a digital library "Bible" that would contain such information. Unfortunately, as the presenters made clear, no such Digital Library Bible exists. The specifics for creating digital collections and the decisions that must be made are highly dependent on the unique circumstances of each organization: its financial and technical resources, its staff expertise, its mission, the type of material with which it is working, and the needs and skills of its users. What may work for one organization may be entirely inappropriate or impossible for another. But a set of practical guidelines and questions to ask would clearly be useful, and the presenters did cite a number of recent or upcoming publications that try to provide some guidance in this area. For those completely new to the world of digital libraries, the LITA pre-conference touched on many of the issues they need to consider, but failed to explore them in a systematic manner. For those who are already familiar with digital library issues, there were a number of interesting moments, but the conference really did not present anything new. Each presentation was interesting in and of itself, but none of them could cover any given issue comprehensively, and taken as a whole the conference ended up feeling too haphazard and unfocused to provide a meaningful examination of any given issue.
Sandore, B., "Findings of the Instructor/Student Survey", http://www.getty.edu/museum/mesl/reports/mesl_ddi_98/p5_04.1_sand.html
Some titles mentioned were Kenney, A. (2000), Moving Theory into Practice: Digital Imaging for Libraries and Archives, Research Libraries Group, Mountain View, CA; Besser, H. (1995), Introduction to Imaging: Issues in Constructing an Image Database, Getty Art History Information Program, Santa Monica, CA; and Guides to Quality in Visual Resource Imaging (2000), Research Libraries Group, at http://www.rlg.org/visguides/.
Brian Rosenblum is Digital Projects Librarian, Digital Library Program Development, University of Michigan, Ann Arbor, Michigan. firstname.lastname@example.org