Automated classification of HTML forms on e‐commerce web sites

Yanbo Ru (Department of Computer Science, University of Southern California, Los Angeles, California, USA)
Ellis Horowitz (Department of Computer Science, University of Southern California, Los Angeles, California, USA)

Online Information Review

ISSN: 1468-4527

Article publication date: 14 August 2007

Abstract

Purpose

Most e‐commerce web sites use HTML forms for user authentication, new user registration, newsletter subscription, and searching for products and services. The purpose of this paper is to present a method for automated classification of HTML forms. Such classification is important for search engine applications, e.g. Yahoo Shopping and Google's Froogle, because it can be used to improve the quality of the index and the accuracy of search results.

Design/methodology/approach

Describes a technique for classifying HTML forms based on their features. Develops algorithms for automatic feature generation of HTML forms and a neural network to classify them.
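
As a rough illustration of what automatic feature generation for a form could look like, the sketch below counts a few control types inside the first form on a page and turns the counts into a fixed-order vector that a classifier could consume. The feature names and the helpers `extract_form_features` and `to_vector` are assumptions made for this example only; the abstract does not list the features or the network architecture the authors used.

```python
from html.parser import HTMLParser

# Hypothetical feature set, chosen only for illustration.
FEATURE_NAMES = [
    "text_inputs",
    "password_inputs",
    "checkboxes",
    "selects",
    "textareas",
]


class FormFeatureExtractor(HTMLParser):
    """Counts simple control types inside the first <form> element."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.done = False
        self.features = {name: 0 for name in FEATURE_NAMES}

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "form":
            self.in_form = True
            return
        if not self.in_form:
            return
        if tag == "input":
            # HTML treats a missing type attribute as "text".
            input_type = (dict(attrs).get("type") or "text").lower()
            if input_type == "text":
                self.features["text_inputs"] += 1
            elif input_type == "password":
                self.features["password_inputs"] += 1
            elif input_type == "checkbox":
                self.features["checkboxes"] += 1
        elif tag == "select":
            self.features["selects"] += 1
        elif tag == "textarea":
            self.features["textareas"] += 1

    def handle_endtag(self, tag):
        if tag == "form" and self.in_form:
            # Stop after the first form so counts are not mixed across forms.
            self.in_form = False
            self.done = True


def extract_form_features(html):
    """Return a dict of feature counts for the first form in the HTML string."""
    parser = FormFeatureExtractor()
    parser.feed(html)
    return parser.features


def to_vector(features):
    """Order the counts consistently so they can be fed to a classifier."""
    return [features[name] for name in FEATURE_NAMES]


if __name__ == "__main__":
    sample = (
        '<form action="/login">'
        '<input type="text" name="user">'
        '<input type="password" name="pass">'
        '<input type="checkbox" name="remember">'
        '<input type="submit" value="Sign in">'
        "</form>"
    )
    print(to_vector(extract_form_features(sample)))
```

A form with a password field and few other controls would plausibly be a login form, while many text fields plus a password field might suggest registration. In a setup like the one the paper describes, vectors of this kind would be passed to a trained neural network instead of hand-written rules.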

Findings

The authors tested their classifier on an e‐commerce data set and a randomly retrieved data set and achieved accuracies of 94.7 and 93.9 per cent, respectively. Experimental results show that the classifier is effective and efficient on both test beds, suggesting that it is a promising general-purpose method.

Originality/value

The paper is of value to those involved with information management and e‐commerce.

Citation

Ru, Y. and Horowitz, E. (2007), "Automated classification of HTML forms on e‐commerce web sites", Online Information Review, Vol. 31 No. 4, pp. 451-466. https://doi.org/10.1108/14684520710780412

Publisher

Emerald Group Publishing Limited

Copyright © 2007, Emerald Group Publishing Limited
