TY  - JOUR
AB  - Purpose – The existence and continued growth of the invisible web creates a major challenge for search engines that are attempting to organize all of the material on the web into a form that is easily retrieved by all users. The purpose of this paper is to identify the challenges and problems underlying existing work in this area. Design/methodology/approach – A discussion based on a short survey of prior work, including automated discovery of invisible web site search interfaces, automated classification of invisible web sites, label assignment and form filling, information extraction from the resulting pages, learning the query language of the search interface, building a content summary for an invisible web site, selecting proper databases, integrating invisible web‐search interfaces, and assessing the performance of an invisible web site. Findings – Existing technologies and tools for indexing the invisible web follow one of two strategies: indexing the web site interface or examining a portion of the contents of an invisible web site and indexing the results. Originality/value – The paper is of value to those involved with information management.
VL  - 29
IS  - 3
SN  - 1468-4527
DO  - 10.1108/14684520510607579
UR  - https://doi.org/10.1108/14684520510607579
AU  - Ru, Yanbo
AU  - Horowitz, Ellis
PY  - 2005
Y1  - 2005/01/01
TI  - Indexing the invisible web: a survey
T2  - Online Information Review
PB  - Emerald Group Publishing Limited
SP  - 249
EP  - 265
Y2  - 2024/04/25
ER  -