Emerald Group Publishing Limited
Copyright © 2000, MCB UP Limited
SLA Report: Jean Piety
Metadata for Libraries: The OCLC CORC Environment
In his introduction on the electronic information age, Bill Carney recounted episodes with his children. One had no idea what a catalog card was and the other insisted on using the Web, not books, for a report on ancient Egypt. He wanted her to look in books; she felt that if it is not online, it is not current. He quoted phrases used by the younger generation like "Do you Yahoo!?" and "My parents are so 404" (disconnected). The communication gap continues with the Web, for he finds the Web world full of challenges due to rapid growth, print versus electronic issues, and rising expectations. Solutions to these challenges lie in accessibility to Web resources.
Libraries have met these challenges by loading their collections into online public access catalogs (OPACs). Problems include extensive duplication of effort between libraries and within libraries, currency of entries, broken links, changing content, and new pages. The lack of integration with local resources continues to exist.
OCLC Cooperative Online Resource Catalog (CORC) gives librarians the chance to select quality electronic resources needed locally, to describe those resources using MARC or Dublin Core, to create access to those resources through local OPACs via CORC records, and to share efforts globally. He explained that the CORC creed covers cooperation, acts as a source for authoritative metadata, provides durable records in multiple formats, is standards-based, and offers leverage for unique collections.
The CORC founders' phase included more than 450 libraries of all types. Users included technical services, public service, subject bibliographers, collection development staff, and patrons. CORC provides the automated tools for record creation, classification, subject heading assignment, URL maintenance, authority control, and Pathfinder creation. It means "less elbow grease required" for the subscriber.
Mr Carney defines metadata as "structured data about data". Types include MARC 21, Dublin Core, the Text Encoding Initiative (TEI), the Government Information Locator Service (GILS), and Encoded Archival Description (EAD). Special librarians are using CORC for a resource catalog, authorities, Pathfinders, and WebDewey. CORC handled 235,144 records by 7 June 2000. He described Dublin Core and its 15 elements for simple resource description. Dublin Core is designed to support interoperability and can be translated into 26 languages. Entries cataloged in Dublin Core go into OCLC's WorldCat.
He used the Museum of HP Calculators (www.hpmuseum.org) to show that, by entering the URL (uniform resource locator), he could create a new record, using fields as in MARC cataloging, but including the data from the URL. CORC links from the LC authority file back into the record. Another CORC capability is automated URL maintenance: entries are scanned regularly and holding libraries are alerted, so maintenance is shared. Some examples were Ice Treasures of the Incas (a National Geographic site at www.nationalgeographic.com/mummy/), the state of Connecticut town profiles (one site is http://www.hickoryhill.com/Connecticut/Towns/), and CISTI, the Canada Institute for Scientific and Technical Information (www.cisti.nrc.ca/cisti/cisti_e.shtml).
CORC gives you the option of using your own server or keeping everything on the OCLC server. Benefits include integrated Web and physical resources, making local resources available to locals, bringing global resources to local patrons, automatic creation and maintenance, applying authority control to Web resources, and being Web-based. System requirements included OCLC cataloging service PRISM authorization, a standard Web browser, and an Internet connection. All is available to OCLC catalog members as of July 2000. The Web site is www.oclc.org/oclc/corc/
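To make the 15-element Dublin Core description concrete, here is a minimal Python sketch of a record for the calculator museum site used in the demonstration. It is not CORC's implementation: the helper names (make_record, to_meta_tags) are hypothetical, the field values are illustrative rather than the record shown in the session, and the "DC." prefix on the meta tag names is one common convention for embedding Dublin Core in Web pages.

```python
from html import escape

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**fields):
    """Return a dict holding only recognised Dublin Core elements.

    Hypothetical helper: every element is optional, and unknown
    names are rejected rather than silently kept.
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    return {name: fields[name] for name in DC_ELEMENTS if name in fields}


def to_meta_tags(record):
    """Render a record as HTML <meta> tags, one per element."""
    return "\n".join(
        f'<meta name="DC.{name}" content="{escape(value, quote=True)}">'
        for name, value in record.items()
    )


# Illustrative values only; not the record created during the talk.
record = make_record(
    title="The Museum of HP Calculators",
    identifier="http://www.hpmuseum.org",
    format="text/html",
    language="en",
)
print(to_meta_tags(record))
```

Because every element is optional and none is tied to a particular library system, the same small record can be embedded in a Web page or mapped to a fuller MARC description.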
As a case study of CORC in action, Suzanne Pilsk, a cataloger from the Smithsonian Institution, demonstrated some applications of CORC. The Smithsonian Institution Research Information System (SIRIS) has participated in a CORC project since 1 April 1999. SIRIS is the OPAC to the research resources held by the Institution's libraries, archives, and research units. The committee in SIRIS developed and redeveloped procedures, with the only constant being "Draft". Starting with home pages for the horticultural library, the committee cataloged the digitized records in MARC, and CORC transferred the content to Dublin Core using similar fields. The team was able to cut and paste, then had a Webmaster embed the results. The project brought in staff reference librarians. Other branches and Web sites became involved, including the Stitch in Time project that covers sewing machine trade literature and digitized trade literature. The librarians had started the descriptions in Microsoft Access, and CORC transferred the records from the Access program. Images are in JPEG (Joint Photographic Experts Group) format. Her descriptions and examples showed practical applications of CORC. For more information on the SIRIS project, contact the speaker at email@example.com
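The MARC-to-Dublin-Core transfer described above is, at heart, a field mapping (often called a crosswalk). The Python sketch below shows the idea under stated assumptions: the mapping table is a small, simplified subset of commonly used MARC-to-Dublin-Core correspondences, not CORC's actual conversion rules, and the sample record and its URL are hypothetical.

```python
# Simplified subset of a MARC-to-Dublin-Core crosswalk:
# (MARC tag, subfield code) -> Dublin Core element.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("520", "a"): "description",
    ("650", "a"): "subject",
    ("856", "u"): "identifier",
}


def marc_to_dc(marc_fields):
    """Convert (tag, subfield, value) triples to a Dublin Core dict.

    Dublin Core elements are repeatable, so values are collected in
    lists. Fields without an entry in CROSSWALK are skipped.
    """
    dc = {}
    for tag, subfield, value in marc_fields:
        element = CROSSWALK.get((tag, subfield))
        if element is not None:
            dc.setdefault(element, []).append(value)
    return dc


# Hypothetical record, loosely modelled on a trade literature entry.
marc = [
    ("245", "a", "Sewing machine trade literature"),
    ("650", "a", "Sewing machines"),
    ("650", "a", "Trade catalogs"),
    ("856", "u", "http://example.org/trade-literature"),
    ("300", "a", "1 online resource"),  # not in this subset, so skipped
]

dc = marc_to_dc(marc)
for element, values in dc.items():
    print(element, "=", "; ".join(values))
```

A table-driven mapping like this is why "similar fields" make the transfer straightforward: adding a correspondence means adding one line to the table rather than changing the conversion logic.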
The moderator, Hope Tillman, described what she felt was dead at her library at Babson College and challenged each of us to find similar items at our libraries. Her list included microfiche, microfilm, full text on cartridge, videocassettes, films, lantern slides, and aluminum disk recordings that require a cactus needle. Thus she set the stage for the three speakers to tabulate or explain what they wished were dead technologies.
Stephen Abrams, from IHS Micromedia LTD, started with graphics that refused to fit the screen. He noted that dot-coms are ill, that there is acquisition fever, balloon metaphors, venture capital, and domain hijackers. Bandwidth is another problem, for someone just made the speed of light go faster than the speed of light. Wireless, personal devices, and keyboards are gone. To free technology, we are giving up privacy for profiling, advertising, territory inclusions, and control. Soon sites will include advantage.com and iWon.com. On the dark side of free technology lie contacts for spammers and mergers like an AOL/Time Warner megamonopoly. For the solution to the Microsoft case, he suggests that the company move to Canada and merge with Thomson; then there would be no competition and we could have downloadable Molsons.
He continued with plausible patents, granting IBM one for paragraphs, Northern Light for foldering, Priceline for Internet bartering, Amazon for one-click ordering, and Powerize for its stickiness model. To relate to the library field, he said that there is patent potential in the business protocol of interlibrary loans, the display of catalog records, the MARC record, and the online book reserve. He felt that we are going from just-in-time to just-for-you and asked if we had really looked at Windows 2000, which, on day 12, will remap your desktop based on your behaviors. He concluded that Adobe Acrobat and PDF (portable document format) were out.
Walt Howe, Internet consultant for Delphi Forums, opened by saying neckties were dead and we did not even know they were sick. He expounded on paper going out and being replaced with e-books. He described the Rocket eBook as an example, for it is readable under all light conditions and in bed, is comfortable to hold, has changeable fonts, and is loadable on demand. Another example is the SLA Information Technology Division newsletter, which is available only on the Web, saving postage. Digital cameras are replacing film. Filmless movies are multiplying and being distributed electronically. The floppy disk is near death, but how can we read yesterday's media tomorrow? Optical formats are changing, but what about archiving? Are we really back to paper? Or would engraving be better?
DVD may replace videotape. A number of schools are rushing into distance learning, but they must not repeat videotape program nightmares that were a one-way medium and not interactive. The rush to distance learning will leave a lot of dead technology along the way. The medium used in distance learning will provide much to talk about in future sessions.
Richard Hulser, from IBM, wrapped up the session by itemizing his choices for dead technologies, which included Internet search engines, access to bibliographic anything, and dot-com companies. Bytes per second will become megabytes, then gigabytes, next zettabytes, and finally yottabytes or exabytes. PCs and Apples by 2003 will be more diverse, more like pagers. Portals are alive, graphic interface dead.
Statements from the audience covered thoughts on interlibrary loans and undergraduate college libraries. Were they dead? The discussion focused on definitions and processes in interlibrary loan procedures, but no conclusions were reached.
The Engineering Division has sponsored the Roundtable for several years. The forum gives librarians and vendors the chance to learn about industrial standards and to share concerns with each other. This year the chair, Karen Kreisman Reczek, from ACTS Testing Labs, moderated a panel discussion among three librarians involved in standards. Karen McKinney is the knowledge management coordinator at Caterpillar; JoAnne Overman is at the National Center for Standards and Certification Information of the National Institute of Standards and Technology (NIST); and Janice Erickson is an information specialist with the Pacific Northwest National Laboratory operated by Battelle for the US Department of Energy.
After a brief description of the roles that each one plays in standards, Mrs Reczek asked them what sources they use. The answers ranged from Perinorm, the European database, to people, and included vendors like Information Handling Services (IHS) and ILI. High on the list of answers was NSSN (National Standards Service Network), found at http://www.nssn.org. Also useful is ASSIST, the database for the US Department of Defense, at http://astimage.daps.dla.mil/. Even news alert services were mentioned, without citing specific names.
The panel found the question on evaluating resources hard to answer, for different sources perform different functions. For example, Global Engineering Documents is helpful in identifying and delivering standards. The Department of Energy Technical Standards site (http://www.eh.doe.gov/techstds/) helps with others. Sources are hard to evaluate, for it depends on what you are looking for, and often all sources are checked. Some subscription sites may be more current than others; examples included IHS and ILI. Having the Code of Federal Regulations online helps; it is found at http://www.access.gpo.gov/nara/cfr. Downtime and outdated items reduce the value of ASSIST. What would be nice is one search engine. Both free and fee sources are searched.
The next question dealt with the quality of the data. Format hinders quality. Raster-scanned images make it difficult to get a good graphic. All sources could be improved by obliterating typos, making procurement less obscure, and improving the update cycle. Currency and accuracy were echoed by the third panelist, with pricing an issue. McKinney made the audience envious, for NSSN customized its database for Caterpillar with direct links to societies like the American Gear Manufacturers Association (AGMA) and the American Society for Testing and Materials (ASTM). Some panelists had Web subscriptions; some used a variety of vendors. NIST keeps bibliographic information on foreign standards; e-mail is firstname.lastname@example.org. Delivery led to a discussion on the value of vendors, for turnaround time was important. The engineer may want the library to download a standard, but neither has time to tie up the equipment. Price, availability, and update services were also factors in judging a vendor. One feature of the NSSN Web site at http://www.nssn.org is the search link at the bottom, which is valuable since the American National Standards Institute (ANSI) no longer publishes a hard copy of its catalog.
Other questions dealt with printing format and copyright. In most cases hard copy or PDF format was preferred. Contracts take too long. Some facilities do not have carte blanche for Internet access. With PDF comes the capability to cut and paste.
The session closed with the question, how are standards evolving? Comments included better pricing structure for standards developing organizations (SDOs) and more customizing; for example, Caterpillar is charged the same price for SAE ground vehicle standards as SAE aerospace standards but quantity in the sets is not the same. Some prefer to pay by the document rather than subscribe to the entire run, known as pay by the drink. Others wanted more than one-stop shopping, especially with online links, and more filters; for example the automotive series on the IHS Worldwide Standards Service (WWSS) could be filtered out in many searches. Downloading and archiving are problems. There is a movement toward desktop delivery of electronic products.
Some suggestions from the audience were:
Use a book jobber as a source for procuring a standard.
Change the pricing structure to include five simultaneous users regardless of where located.
Have the vendors provide the capability to add statistics to justify use of a subscription at a site.
As always the roundtable ended with as many questions as at the beginning of the session.
Dublin Core, Metadata and More: The Annual SLA Technical Standards Update
Marge Hlava, chair of the Committee, gave a fast-paced, mind-boggling report on the activities of the committee for the last year. The report also served as an introduction to worldwide current technical standards activities. The charge of the committee is to read current drafts of technical standards that are up for review. Notification for most of the work is by e-mail. The e-mail list includes the current and past committee members and interested parties. Work on this committee is dedicated to technical standards, for another SLA committee deals with ethics.
Hlava described how parts of the puzzle come together in the standards process. The main players in the 1960s to 1980s centered on publishers, telecommunications, computer hardware and software. With the rise of the Internet in the 1990s, groups became more complicated, but all needed standards. Publishers branched from primary into secondary, like Chemical Abstracts Service and library OPACs (online public access catalogs). Add distributors who are vendors, too, like Ebsco and Dialog. Associations involved in standards making include Special Libraries Association (SLA) and American Society for Information Science (ASIS). Examples of bodies that develop standards are National Information Standards Organization (NISO), American National Standards Institute (ANSI), and International Organization for Standardization (ISO). New players, known as quasi standards organizations, are formed through commercial meetings, study groups, and interested alliances. Companies like OCLC have become primary and secondary publishers. New architecture is rising, like digital knowledge architecture and multimedia databases. Standards are needed for all of the above to work.
What are standards? They are mutually agreed on and consensus-voted documents. The process in standards making starts with identifying the need, then moves through drafting the standard, providing time for comments and resolution, and voting approval by members. Recently the index standard (NISO TR-02) failed, for the American Society of Indexers, among others, did not approve the draft. Official standards organizations include international bodies like ISO, national bodies like ANSI, and subject-area bodies like NISO. Voting members include groups like SLA's Technical Standards Committee. For an excellent description of standards, go to the ISO Web site at http://www.iso.ch/. The welcome page is a good introductory page that leads to more descriptive content by clicking on the appropriate box. NISO is found at http://www.niso.org. This organization develops ANSI-approved standards in information services, such as Z39.50. The NISO secretariat has one vote to ISO through ANSI. It also reports to ISO through the technical committee TC46, Information and documentation. Under development are standards for the talking book and the book item and component identifier (BICI).
She sped through a discussion of information identifiers, with issues covering fast track and internationalization. The USA is no longer in the forefront of standardization. ISO adopted the International Standard Book Number (ISBN) and the International Standard Serial Number (ISSN), two standards that originated through a US committee. A non-US compliant example is the thesaurus standard that is British. Some are international only, such as Unicode and the International Standard Audiovisual Number (ISAN). Power quasi standards are appearing; these include PDF (portable document format) and W3C. At the moment Transmission Control Protocol/Internet Protocol (TCP/IP) is not a standard, but a request for comments (RFC). Others are agreed practices, like the "b" in Dialog searches.
The physical network involves standards. Issues include cabling, digital line speeds, Ethernet, and leased or T1 lines. Protocols involve mutually agreed bits. They are controlled by comment, but few are standards. Examples are RFC 791, the DARPA Internet Program Protocol Specification, and TCP/IP, the Unix protocol of choice. Other examples include applications such as mail, telnet, and the WWW; network protocols such as HTTP; networks with IP numbers assigned by InterNIC; and domain names. The body developing the standard is responsible for the maintenance of that standard. For example, Bowker maintains the ISBN.
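The phrase "mutually agreed bits" can be illustrated with the fixed 20-byte header that RFC 791 defines for an Internet Protocol (version 4) packet. The Python sketch below unpacks that header; the function name and the sample packet bytes are made up for illustration, and options beyond the first 20 bytes are ignored.

```python
import ipaddress
import struct


def parse_ipv4_header(data: bytes) -> dict:
    """Decode the fixed 20-byte IPv4 header laid out in RFC 791."""
    if len(data) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    # "!" selects network (big-endian) byte order.
    (ver_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,                  # high 4 bits
        "header_length": (ver_ihl & 0x0F) * 4,    # low 4 bits, in 32-bit words
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,                     # 6 means TCP
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }


# Made-up sample header; addresses come from a documentation-only range.
sample = bytes([
    0x45, 0x00, 0x00, 0x28,   # version 4, header length 5 words, total length 40
    0x00, 0x01, 0x00, 0x00,   # identification, flags and fragment offset
    0x40, 0x06, 0x00, 0x00,   # TTL 64, protocol 6 (TCP), checksum left at zero
    192, 0, 2, 1,             # source address
    192, 0, 2, 2,             # destination address
])

header = parse_ipv4_header(sample)
print(header)
```

Every implementation that reads these bytes the same way can exchange packets, which is the practical effect of agreeing on a protocol whether or not a formal standards body has voted on it.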
Standards for primary publishers involve committees like the Book Industry Systems Advisory Committee (BISAC) and the digital object identifiers (DOI). DOI handles syntax, numbering, and deposited metadata. Standards for secondary publishers should control data about data. What is needed is the capability to cut and paste (called scraping and desalting).
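As an illustration of the syntax side of the DOI, the Python sketch below splits an identifier into its prefix and suffix. It assumes the basic form of a DOI: a prefix beginning with "10." that identifies the registrant, a slash, and a suffix chosen by the registrant. The regular expression is a simplification for demonstration, not the full syntax rules, and the function name is hypothetical.

```python
import re

# Simplified pattern: numeric prefix starting with "10.", then "/", then
# a suffix with no whitespace. Real suffix rules are more permissive.
DOI_PATTERN = re.compile(r"^(10\.\d+(?:\.\d+)*)/(\S+)$")


def split_doi(doi: str):
    """Return (prefix, suffix) for a DOI string, or raise ValueError."""
    match = DOI_PATTERN.match(doi.strip())
    if match is None:
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return match.group(1), match.group(2)


# Sample identifier used purely as an example.
prefix, suffix = split_doi("10.1000/182")
print(prefix, suffix)
```

The split matters because the two halves are managed differently: the prefix is assigned centrally, while the publisher controls the suffix and the metadata deposited with it.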
Why do quasi groups exist? Because the normal standards pace is too slow. With the rise of global commerce, national models do not work. Unfortunately the USA is losing the lead in developing standards. There are costs to participation. Membership in standards developing bodies ranges from $1,500 to $10,000. The member agency gets, reads, and comments on a standard.
What is new? ISAN is the audiovisual number, similar to the ISBN. ISWC is the work code that involves authors and composers. There is work on electronic thesauri, e-books, e-journals, and e-commerce. There are ANSI standards out for review in these areas. Journal linking calls for XPointer/XLink, DOIs, PII, SICI and other variants. The standards process involves skills that mirror those of library and information science. Besides research, there are production skills and technical service roles.
Steve Schultz, from the American Institute of Aeronautics and Astronautics (AIAA), demonstrated a practical application of the management of standards. He organized a preferred collection that included a subset of all standards in the aerospace industry. The value-added features were the abstracts of standards, the alerting services, the customizing, and the pay-per-drink concept. The benefits covered savings on time and budget. He developed a market-driven product from an industry-initiated project by obliterating redundant, out-of-date, cancelled, and non-representative standards, while filling in the gaps. Through this project he identified the universe of standards by obtaining all in the field and gaining a buy-in with standards developing bodies. He established the review criteria through the general requirements and classifications. He developed the review structure in a hierarchical mode by using a task group of team leaders. After assembling the collection, he suggested developing a prototype that could be beta-tested before the initial release. He concluded that the product would benefit end-users, librarians, standards developing organizations, and industry as a whole. Although the product he described encompassed the aerospace field, it could serve as a model for most fields. For more information on the projected release date of this product, contact the Association at http://www.AIAA.org
Jean Z. Piety is Head, Science and Technology Department, Cleveland Public Library, Cleveland, Ohio. Jean.Piety@cpl.org