Free or fee? The Web versus Internet commercial subscription services

Education + Training

ISSN: 0040-0912

Article publication date: 1 November 1999


Citation

Thackray, J. (1999), "Free or fee? The Web versus Internet commercial subscription services", Education + Training, Vol. 41 No. 8. https://doi.org/10.1108/et.1999.00441hag.001

Publisher: Emerald Group Publishing Limited

Copyright © 1999, MCB UP Limited



Can the Internet be a useful information resource for the practising manager and professional? Undoubtedly the Internet is a huge repository of data. Leaving aside that which is held in newsgroup archives and other pre-Web formats, the World Wide Web (Web) alone is now considered to amount to 800 million plus pages of data (Lawrence and Lee, 1999). Why then, with this huge data resource, is its usefulness even questionable? What issues exist for managers and professionals in making effective use of this data?

Finding useful materials easily amongst any 800 million pages of data is likely to be difficult and, in the absence of the sophisticated, universal classification systems which apply in library archives, a nightmare. Generating large numbers of data "hits" is all too easy. Finding valid information quickly, much less so. Three basic search options are open to the researcher: directories, such as Yahoo! (http://www.yahoo.co.uk/); search engines, such as AltaVista (http://www.altavista.com/) and Google (http://www.google.com/); and specialised commercial Internet information services, such as Northern Light (http://www.northernlight.com/) and HR-Expert (www.hr-expert.com).

Directories are multi-dimensional lists of sites built and maintained by people. Web site addresses are placed in subject-based cascading tree structures. Information can be found by moving down the branches or through keyword searches which look within these lists. Directories are relatively easy to use but cover limited amounts of what is available and can be subject to classification problems. What you want may be there but if it is not classified where you logically think it should be, then it may be lost to you.
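The directory structure described above can be sketched in a few lines of Python. This is only an illustration: the category names, the example.org addresses and the function names are all invented, and no real directory is claimed to work this way internally.

```python
# A toy subject directory: nested categories with site lists at the leaves.
# All category names and URLs are invented for illustration.
directory = {
    "Business": {
        "Human Resources": {
            "Harassment": ["http://www.example.org/harassment-guide"],
            "Training": ["http://www.example.org/management-development"],
        },
    },
    "Education": {
        "Bullying": ["http://www.example.org/school-bullying"],
    },
}

def browse(tree, path):
    """Move down the branches named in `path` and return what is found there."""
    node = tree
    for name in path:
        node = node[name]
    return node

def keyword_search(tree, keyword, trail=()):
    """Return the category paths whose label contains the keyword."""
    matches = []
    for name, child in tree.items():
        here = trail + (name,)
        if keyword.lower() in name.lower():
            matches.append(" > ".join(here))
        if isinstance(child, dict):
            # Keep looking further down this branch.
            matches.extend(keyword_search(child, keyword, here))
    return matches

print(browse(directory, ["Business", "Human Resources", "Harassment"]))
print(keyword_search(directory, "bullying"))
```

In this toy directory a keyword search for "bullying" finds only the Education branch, because the workplace material has been filed under "Harassment". That is the classification problem in miniature.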

Search engines are built by software robots or spiders. These constantly trawl the Web referencing new and changed pages. AltaVista uses a program called Scooter, which each day merges ten million new or changed pages into its database of Web pages. Search engines can be extremely useful and effective in practised hands - see Gresham, K., "Surfing with a purpose: process and strategy put to the test", Educom Review September/October 1998 http://www.educause.edu/ir/library/html/erm9851.html for a useful introduction. However, effective usage requires patience and practice on the part of researchers, yet most of the practising managers I meet as students seem to prefer to waste time ploughing through huge lists of "hits", rather than expend time learning search skills.

However, the performance of search engines should not be exaggerated, even in skilled hands. Managers want relevancy not volume. Some search engines seek to filter and rank hits by closeness of match to the search string used. Others rank sites by the number of links which point to them from other sites, a sort of Web citation index. However, a major review of search engine performance (Internet World, May 1996) concluded:

The most striking conclusion we drew from our tests was that all these engines had a long way to go before they could be relied on to deliver consistently accurate search findings. Each one delivered a high proportion of irrelevant information when challenged beyond a simple search on a well-represented topic.

Three years is a long time in Internet terms and filtering techniques will have improved. However, the best filtering algorithms cannot filter what they cannot find and even the major search engines and directories only index a fraction of the Web's 800 million pages: AltaVista (15.5 per cent); HotBot (11.3 per cent); Yahoo! (7.4 per cent); Excite (5.6 per cent) (Lawrence and Lee, 1999). They may also be denied access to the most useful sites. The "spiders" used to index pages consume considerable bandwidth and many commercial sites now practise "infestation" control by using software to ban spiders from their sites.
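To put those coverage figures in perspective, the short Python sketch below converts each percentage into an approximate page count, taking the 800 million total and the percentages quoted above as given. The function and variable names are illustrative only, and the results are rough estimates derived by simple arithmetic rather than figures reported by Lawrence and Lee.

```python
# Figures quoted in the text (Lawrence and Lee, 1999).
TOTAL_WEB_PAGES = 800_000_000

COVERAGE_PER_CENT = {
    "AltaVista": 15.5,
    "HotBot": 11.3,
    "Yahoo!": 7.4,
    "Excite": 5.6,
}

def estimated_pages(per_cent, total=TOTAL_WEB_PAGES):
    """Approximate number of pages indexed for a given coverage percentage."""
    return round(total * per_cent / 100)

for name, per_cent in COVERAGE_PER_CENT.items():
    indexed = estimated_pages(per_cent)
    missed = TOTAL_WEB_PAGES - indexed
    print(f"{name}: about {indexed / 1e6:.1f} million pages indexed, "
          f"{missed / 1e6:.1f} million not indexed")
```

On these figures even the largest index leaves well over 600 million pages unsearched. The percentages cannot simply be added together, since different engines index many of the same pages.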

There remains a final problem: validity or quality of content. Robert Wilensky's (University of California) comment - "We've heard that a million monkeys at a million keyboards could produce the Complete Works of Shakespeare; now, thanks to the Internet, we know this is not true" - encapsulates the lack of control over content that is the Internet; anyone can publish anything. The citation ranking approach mentioned above may help here but it is a little rough and ready. Within the boundaries of their own expertise, practising managers and professionals are capable of judging the quality of the Web's free resources but it might not be seen as a cost-effective use of their time. The commercial model is to pay. HR professionals (or their organisations) might sensibly prefer to purchase speedy access to quality controlled materials from an Internet commercial subscription service. Two alternatives would appear to be available in this respect: services which essentially add value to the existing Web database; and services which eschew Web data altogether and deliver proprietary information using the Internet and Web browser technologies as a delivery medium.

Northern Light is a generic example of the former and it adds value in two ways: by adding 5,400 full text (digitised print) sources to the Web; and by categorising what it finds into custom search folders. Its service offers free access to the Web plus account or pay-as-you-go access to its "special collection" text sources. As one of the larger indexes of the Web (16 per cent) it would appear to offer something over and above the traditional free search engines.

HR-Expert is an example of the second category: a proprietary content, annual subscription service for the HR profession. It offers content specifically created for the subject area, plus access to articles and news clippings. Additionally, subscribers receive very regular news updates by automated e-mail. In essence it is a development of the traditional loose-leaf insert service but now offered over the Internet. Its content divides very roughly into two sections. First, "The Expert", which contains the specific content, in the form of practical, prescriptive advice useful to working managers: documents, forms, policies, procedures, checklists, etc. Second, "The Research Centre", which contains commentary materials in the form of articles and news clippings. Further restriction is allowed within these two sections. Returned references are also organised most effectively to assist the user, through a dual sorting by "Category/Section" (Discipline and Grievance, Employee Rights, Reward Management etc.), and by "Type" (Practical Guide, Fast Track, Policies, Procedures, Checklists etc.). This "Type" classification would appear to be particularly useful to the practising manager needing "prescriptive" assistance on problem areas.

On the assumption that commercial subscription services offer ease of use plus quality controlled information in return for a fee, do these two examples outperform searching the Web's free resources? Whilst a proper trial was beyond the scope of this column, this contention was tested by the author and three of his Postgraduate Diploma in Personnel Management students who were researching topics for their professional level Management Report assignments. The two topic areas were: "Harassment and bullying at work" and "Reflective practice in management development".

Using simple but sensible search strings with a variety of search engines and directories produced large numbers of hits. These tools were also sufficiently sophisticated to place the most obviously relevant hits at the top of the lists. For example, using the simple search string "harassment bullying work", it was soon clear when hits were appearing related to harassment and bullying in school. Multiple referencing of the different pages of large sites was an annoyance and very many of the same references were returned by different search engines. Directly useful links were returned but serendipity seemed to be the major force at work here. In the main, initial hits tended to offer useful background or the possibility of directly relevant information down the line through the following of links. Many links led only to commercial sites offering a little information as marketing "come-ons" for reports, services or training. Only occasionally were references returned to substantive pieces from clearly authoritative sources. The majority of hits were on US-based sites, not necessarily an issue with many research topics, but obviously a problem with UK/European research topics or those with legislative connotations. This is avoidable since some search engines are dedicated to geographical areas and the larger general ones have methods for restricting hits to domains (e.g., ".uk") but this takes us away from simple usage. Overall, the student researchers were not disappointed with the returns, since they came away with considerable amounts of potentially useful data to gut, and each topic area produced one major find. However, the manager with a deadline requiring directly relevant information on a topic might be better advised to consult "Croner" before the Web.
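The domain restriction mentioned above (for example, limiting hits to ".uk") comes down to checking the host name of each returned address. The Python sketch below shows the idea under stated assumptions: the addresses are invented, the function name is hypothetical, and real search engines expose this through their own query options rather than through code like this.

```python
from urllib.parse import urlsplit

def restrict_to_domain(urls, suffix):
    """Keep only the URLs whose host name ends with the given domain suffix."""
    suffix = suffix.lower().lstrip(".")
    kept = []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        # Match the suffix on a label boundary, so ".uk" does not match "notuk".
        if host == suffix or host.endswith("." + suffix):
            kept.append(url)
    return kept

# Invented addresses standing in for a list of search hits.
hits = [
    "http://www.example.co.uk/harassment-policy.html",
    "http://www.example.com/bullying.html",
    "http://www.example.ac.uk/research/bullying.html",
]

print(restrict_to_domain(hits, ".uk"))
```

A filter of this kind removes the US-based hits at a stroke, but it also discards any UK-relevant material that happens to sit on a ".com" site.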

Did the commercial Internet-based subscription services outperform the Web? Northern Light, with its combination of a large Web index plus additional proprietary content, was explored first. It returned a large number of Web references, many of which, not unexpectedly, duplicated material already found and overall the same general issues with Web results were apparent. However, two distinct benefits over the generality of Web search tools soon manifested themselves. First, annoying multiple hits on the same site were not returned. Very usefully, a single reference was returned with the option of opening up other pages on the site which matched the search string. Second, returns were classified into folders, allowing the user to differentiate and speed up his/her search. For example, ignoring those filed under "Commercial sites" might eliminate those offering little more than marketing "puffs", whilst going directly to "Employment law" could take the manager to the nub of the issue. Searches using Northern Light invariably returned some non-free, "Special collection" items. Users could view an abstract plus information on the material: source, author, publication date, size, cost etc. A money-back guarantee is offered if the material turns out not to be useful. However, the two topic queries supplied only returned a total of nine special collection items: five US; three UK; and one EU. Somewhat limited even if the quality is high (which could not be ascertained as none were purchased).
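The first of those benefits, collapsing several hits on one site into a single reference, can be illustrated with a minimal Python sketch. How Northern Light implemented this is not known here; grouping by host name is an assumption, and the addresses and function name are invented.

```python
from urllib.parse import urlsplit

def group_hits_by_site(urls):
    """Collapse hits into one entry per site, keeping the other pages as extras."""
    groups = {}  # host name -> list of URLs, in the order first seen
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        groups.setdefault(host, []).append(url)
    # The first hit for each site is shown; the rest can be opened on request.
    return [(pages[0], pages[1:]) for pages in groups.values()]

# Invented addresses standing in for a list of search hits.
hits = [
    "http://www.example.org/harassment/policy.html",
    "http://www.example.com/bullying-at-work.html",
    "http://www.example.org/harassment/checklist.html",
    "http://www.example.org/harassment/faq.html",
]

for first, others in group_hits_by_site(hits):
    print(first, f"(+{len(others)} more pages on this site)")
```

Four raw hits become two references, with the additional pages on the first site held back until the user asks for them.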

The first three hits provided by "The Expert" database gave clear and authoritative guidance on sexual, racial and general harassment/bullying. Opening the "Type" classification category "Policies" offered as the first hit a "model policy on harassment at work". Impressive but not perfect, since looking under "Checklists" returned 27 hits, none of which seemed particularly relevant to the original search question. Not a major issue since returns are given a percentage relevance score and when the 50 per cent point is reached, hits tend to be somewhat peripheral and could be ignored. One has to ask: could not a cut-off point be enforced (or allowed) below which hits are not returned? Interrogation of the database of articles produced an impressive list and those with high percentage relevance scores looked to offer quality commentary on the subject area. Articles are available both online in HTML format and as Adobe PDF documents for original quality viewing and printing. HR-Expert did much less well when interrogated on "reflective practice" as a method in management development. In fact "The Expert" failed to return a single hit. The "Research Centre" database performed marginally better, delivering 14 articles (rather fewer than the 2,000 plus of the previous search). Does this suggest the service is more geared to the management side of HR rather than the development side or was the service just unlucky in terms of our second topic? The quality of information found did, however, again seem to be high.
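The cut-off suggested above, below which hits are not returned, amounts to a simple filter on the relevance score. The Python sketch below uses invented titles and scores and a hypothetical function name; it does not describe how HR-Expert works.

```python
def apply_cutoff(hits, minimum_score=50):
    """Drop hits whose percentage relevance score falls below the cut-off."""
    return [(title, score) for title, score in hits if score >= minimum_score]

# Invented results: (title, percentage relevance score).
hits = [
    ("Model policy on harassment at work", 96),
    ("Checklist: handling a bullying complaint", 71),
    ("Checklist: office relocation", 48),
    ("Checklist: company car scheme", 22),
]

for title, score in apply_cutoff(hits):
    print(f"{score:3d}%  {title}")
```

Making the threshold a parameter reflects the "or allowed" in the question: a user could lower it when a topic is thinly covered and raise it when returns are plentiful.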

It would be wrong to draw very firm conclusions from the simple tests above but some comments are possible. The Web as a free resource performed as expected. Simple but logical search strings return material useful to the topic area but time and effort are generally required to find the specific information required, and judgement on validity is placed squarely upon the user. Northern Light certainly assisted in making speedier decisions on which of the Web's returns were worth accessing through its placing of these in folders. However, its US bias and the generic nature of the data held in its special collection make the "paid-for" part of its service of less obvious value for the HR manager. HR-Expert performed excellently on one of the topic areas: literally minutes only were required to generate authoritative prescriptive and commentary data. Its relatively poor performance on the second topic can be viewed from two perspectives. First, those subscribing to fee-paying services should be clear as to what is on offer and what they are purchasing, and companies providing these will need to respond quickly to fill any gaps if they wish to retain subscribers. Second, it was useful to know quickly what little was available on the topic from the service. One could go elsewhere without wasting time. One of the annoying aspects of simple Web searching is that one is seldom presented with no/few returns and time has to be wasted discovering nothing useful is available.

John Thackray, Internet editor

Reference

Lawrence, S. and Lee, C.L. (1999), "Assembly of information on the Web", Nature, Vol. 400, p. 107.
