The purpose of this paper is to investigate the subject-related search behavior of institutional repository (IR) users as a means of estimating the potential impact of applying a controlled subject vocabulary to an IR.
Google Analytics data were used to record cases where users arrived at an IR item page from an external web search and subsequently downloaded content. Search queries were compared against the Faceted Application of Subject Terminology (FAST) schema to determine the topical nature of the queries. Queries were also compared against the item’s metadata values for title and subject using approximate string matching to determine the alignment of the queries with current metadata values.
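The approximate string matching step can be illustrated with a minimal sketch. It assumes a simple character-level similarity ratio from Python's standard `difflib` module and an illustrative threshold of 0.8. The abstract does not specify the matching algorithm or cutoff the authors used, so the function names, the threshold, and the sample strings below are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and extra whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def best_match(query: str, candidates: list[str]) -> tuple[str, float]:
    """Return the candidate most similar to the query and its score.

    Returns ("", 0.0) when there are no candidates.
    """
    best = ("", 0.0)
    for candidate in candidates:
        score = similarity(query, candidate)
        if score > best[1]:
            best = (candidate, score)
    return best


def matches(query: str, candidates: list[str], threshold: float = 0.8) -> bool:
    """True if any candidate meets the (assumed) similarity threshold."""
    return best_match(query, candidates)[1] >= threshold


if __name__ == "__main__":
    # Hypothetical search query and item metadata, for illustration only.
    query = "climatic change"
    title = ["Regional water management under uncertainty"]
    subjects = ["Environmental policy", "Climatic changes"]

    print("title match:  ", matches(query, title))
    print("subject match:", matches(query, subjects))
    print("closest subject:", best_match(query, subjects))
```

Running the same comparison against a list of controlled vocabulary headings in place of `subjects` would give a rough analogue of the comparison against FAST, again under these assumptions.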
A substantial portion of successful user search queries to an IR appears to be topical in nature. User search queries matched values from FAST at a higher rate than they matched existing subject metadata. Increased attention to subject description in IR records may provide an opportunity to improve the search visibility of the content.
The study is limited to a single IR. Google Analytics does not provide comprehensive search query data.
The study presents a novel method for analyzing user search behavior to assist IR managers in determining whether to invest in applying controlled subject vocabularies to IR content.
Hanrath, S. and Radio, E. (2017), "User search terms and controlled subject vocabularies in an institutional repository", Library Hi Tech, Vol. 35 No. 3, pp. 360-367. https://doi.org/10.1108/LHT-11-2016-0133
Emerald Publishing Limited
Copyright © 2017, Emerald Publishing Limited