The purpose of this paper is to assess the reliability of numerical ratings of hotels calculated by three sentiment analysis algorithms.
More than one million reviews and numerical ratings of hotels in seven cities in four countries were extracted from the TripAdvisor web site. Reviews were classified as positive or negative using three sentiment analysis tools. The percentage of positive reviews was used to predict numerical ratings, which were then compared with the actual ratings.
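The prediction step described above can be illustrated with a minimal sketch. It assumes a simple linear mapping from the share of positive reviews onto a 1-5 rating scale, which is an assumption made here for illustration and not necessarily the mapping the authors used. The function names, the hotel identifiers, and the toy data are likewise hypothetical.

```python
from statistics import mean


def predict_rating(labels, low=1.0, high=5.0):
    """Map the share of positive reviews linearly onto [low, high].

    Assumption: a linear mapping on a 1-5 scale; the paper's exact
    mapping is not given in the abstract.
    """
    if not labels:
        raise ValueError("need at least one classified review")
    share_positive = sum(1 for label in labels if label == "positive") / len(labels)
    return low + (high - low) * share_positive


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual ratings."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


# Toy data (made up): sentiment labels per hotel and the hotel's actual rating.
hotels = {
    "hotel_a": (["positive", "positive", "positive", "negative"], 4.0),
    "hotel_b": (["positive", "negative"], 3.5),
}

predicted = [predict_rating(labels) for labels, _ in hotels.values()]
actual = [rating for _, rating in hotels.values()]
error = mean_absolute_error(predicted, actual)
```

In this sketch, a hotel with three positive reviews out of four receives a predicted rating of 4.0, and the comparison with actual ratings is summarized as a mean absolute error; other agreement measures, such as a correlation coefficient, could be substituted.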
All tools classified reviews as positive or negative in a way that correlated positively with numerical ratings. More complex algorithms worked better, yet predicted ratings showed reasonable agreement with actual ratings for most cities. Predictions for a hotel were less reliable when based on less than 50-60 percent of its available reviews.
These results confirm that sentiment analysis can be used to transform unstructured qualitative data on user opinion into quantitative ratings. Current tools may be useful for summarizing the opinions expressed in user reviews of products and services on web sites that do not require users to post numerical ratings, such as traveler forums. Such summaries may be valuable not just to potential users but also to service and product providers, and they offer validation and benchmarking for future improvement of opinion mining and prediction techniques.
This work assesses the correlation between sentiment analysis of hotel reviews and the hotels' actual ratings. It also evaluates the reliability of sentiment analysis results calculated by three different algorithms.
López Barbosa, R., Sánchez-Alonso, S. and Sicilia-Urban, M. (2015), "Evaluating hotels rating prediction based on sentiment analysis services", Aslib Journal of Information Management, Vol. 67 No. 4, pp. 392-407. https://doi.org/10.1108/AJIM-01-2015-0004
Emerald Group Publishing Limited
Copyright © 2015, Emerald Group Publishing Limited