Applications of n‐grams in textual information systems

Alexander M. Robertson (Department of Computer Science, University of Sheffield, Western Bank, Sheffield, S10 2TN)
Peter Willett (Department of Information Studies, University of Sheffield, Western Bank, Sheffield, S10 2TN)

Journal of Documentation

ISSN: 0022-0418

Article publication date: 1 March 1998

Abstract

This paper provides an introduction to the use of n‐grams in textual information systems, where an n‐gram is a string of n, usually adjacent, characters extracted from a section of continuous text. Applications that can be implemented efficiently and effectively using sets of n‐grams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary look‐up, text compression, and language identification.
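The definition above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `char_ngrams` and `dice_similarity` are hypothetical, and the Dice coefficient is used here only as one common way to compare n-gram sets, for example when ranking candidate corrections for a misspelled word. Only adjacent characters are considered.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return all adjacent character n-grams of `text`, in order of occurrence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A string shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def dice_similarity(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient over the sets of n-grams of two strings (assumed measure)."""
    grams_a = set(char_ngrams(a, n))
    grams_b = set(char_ngrams(b, n))
    if not grams_a and not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))
```

For instance, `char_ngrams("text", 2)` returns `['te', 'ex', 'xt']`. A misspelling such as "retreival" still shares most of its bigrams with "retrieval", so `dice_similarity` scores the pair well above an unrelated word.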

Citation

Robertson, A.M. and Willett, P. (1998), "Applications of n‐grams in textual information systems", Journal of Documentation, Vol. 54 No. 1, pp. 48-67. https://doi.org/10.1108/EUM0000000007161

Publisher

MCB UP Ltd

Copyright © 1998, MCB UP Limited
