Effective techniques for automatic extraction of Web publications

A.C.M. Fong (A.C.M. Fong works at the Institute of Information and Mathematical Sciences of Massey University, Auckland, New Zealand.)
S.C. Hui (S.C. Hui is an Associate Professor at the School of Computer Engineering at Nanyang Technological University, Singapore.)
H.L. Vu (H.L. Vu is a Research Student at the School of Computer Engineering at Nanyang Technological University, Singapore.)

Online Information Review

ISSN: 1468-4527

Publication date: 1 February 2002

Abstract

Research organisations and individual researchers increasingly choose to share their research findings by providing lists of their published works on the World Wide Web. To facilitate the exchange of ideas, the lists often include links to published papers in portable document format (PDF) or PostScript (PS) format. Generally, these publication Web sites are updated regularly to include new works. Manual monitoring of relevant Web sites is tedious, yet commercial search engines and information monitoring systems are ineffective in finding and tracking scholarly publications. This paper analyses the characteristics of publication index pages and describes effective automatic extraction techniques that the authors have developed. The authors’ techniques combine lexical and syntactic analyses with heuristics. The proposed techniques have been implemented and tested on more than 14,000 Web pages, achieving consistently high success rates of around 90 percent.
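
As a rough illustration of the kind of page the abstract describes, the sketch below scans a publication index page for links to PDF or PostScript files and keeps each link's anchor text. It is not the authors' method: it stands in for only the simplest lexical step, and the names `PaperLinkParser`, `extract_paper_links` and `PAPER_EXTENSIONS`, as well as the choice of file extensions, are assumptions made for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of file types that signal a downloadable paper.
PAPER_EXTENSIONS = (".pdf", ".ps", ".ps.gz")


class PaperLinkParser(HTMLParser):
    """Collect (url, anchor text) pairs for links that look like papers."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # absolute URL of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the index page's URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Heuristic: decide by the file extension in the URL path.
            path = urlparse(self._href).path.lower()
            if path.endswith(PAPER_EXTENSIONS):
                title = " ".join("".join(self._text).split())
                self.links.append((self._href, title))
            self._href = None


def extract_paper_links(html, base_url):
    """Return the paper-like links found in one index page."""
    parser = PaperLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A real system of the kind the abstract outlines would need much more than this, for example associating each link with the surrounding citation text, which is where syntactic analysis and further heuristics would come in.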

Keywords

  • Internet
  • Research
  • Electronic publishing
  • Content analysis

Citation

Fong, A.C.M., Hui, S.C. and Vu, H.L. (2002), "Effective techniques for automatic extraction of Web publications", Online Information Review, Vol. 26 No. 1, pp. 4-18. https://doi.org/10.1108/14684520210418347

Publisher: MCB UP Ltd

Copyright © 2002, MCB UP Limited
