Prediction of polarities of online hotel reviews: an improved stacked decision tree (ISD) approach

Shrawan Kumar Trivedi (Department of Management, Rajiv Gandhi Institute of Petroleum Technology, Jais, Amethi, India)
Amrinder Singh (Department of Accounting and Finance, Indian Institute of Management Sirmaur, Paonta Sahib, India)
Somesh Kumar Malhotra (Department of Electronics and Communication, University Institute of Engineering and Technology, Chhatrapati Shahu Ji Maharaj University, Kanpur, India)

Global Knowledge, Memory and Communication

ISSN: 2514-9342

Article publication date: 4 April 2022

Issue publication date: 20 November 2023

Abstract

Purpose

Hotels need to predict whether consumers liked their stay and to address the aspects that customers did not like. Many customers leave a review after staying in a hotel, mostly on the website used to book it. These reviews are valuable data that can be analyzed to provide better hotel services. The purpose of this study is to use machine learning techniques to analyze such review data and determine the sentiment polarity of consumers.

Design/methodology/approach

The data comprise reviews given by hotel customers on the Tripadvisor website, made publicly available through Kaggle. Out of 10,000 reviews in the data, a sample of 3,000 negative polarity reviews (customers with bad experiences in the hotel) and 3,000 positive polarity reviews (customers with good experiences in the hotel) is taken to prepare the data set. A two-stage feature selection was applied, first a greedy selection method and then a wrapper method, to generate the 37 most relevant features. An improved stacked decision tree (ISD) classifier is built and then compared with state-of-the-art machine learning algorithms. All the tests are done using RStudio.
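The balanced sampling step described above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the study; the function name `balanced_sample`, the `(text, label)` tuple layout and the toy corpus are all assumptions made for the example, not details from the paper.

```python
import random
from collections import Counter, defaultdict

def balanced_sample(reviews, per_class, seed=0):
    """Draw `per_class` reviews from each polarity label, without replacement."""
    by_label = defaultdict(list)
    for text, label in reviews:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < per_class:
            raise ValueError(f"only {len(pool)} reviews with label {label!r}")
        sample.extend(rng.sample(pool, per_class))
    rng.shuffle(sample)  # mix the classes so later splits are not ordered by label
    return sample

# Hypothetical toy corpus standing in for the 10,000 reviews:
# 6 positive and 4 negative entries.
toy = [(f"review {i}", "positive" if i < 6 else "negative") for i in range(10)]
subset = balanced_sample(toy, per_class=3)
print(sorted(Counter(label for _, label in subset).items()))
# [('negative', 3), ('positive', 3)]
```

With the real corpus, `per_class=3000` would give the 6,000-review data set described in the abstract.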

Findings

In an in-depth evaluation, the new model performed satisfactorily overall when predicting whether customers' experiences in the hotel were positive or negative: it reached 80.77% accuracy with a 50–50 split, 80.74% accuracy with a 66–34 split and 80.25% accuracy with an 80–20 split.
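To make the three evaluation settings concrete, the short Python sketch below computes the partition sizes for a 6,000-review data set and shows how accuracy is calculated. It assumes that the first number in each split is the training share, which the abstract does not state explicitly; `split_sizes` and `accuracy` are illustrative helper names.

```python
def split_sizes(n, train_pct):
    """Return (train, test) sizes for an n-item data set at a given training percentage."""
    train = n * train_pct // 100
    return train, n - train

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

for pct in (50, 66, 80):
    train, test = split_sizes(6000, pct)
    print(f"{pct}-{100 - pct} split: {train} training, {test} test reviews")
# 50-50 split: 3000 training, 3000 test reviews
# 66-34 split: 3960 training, 2040 test reviews
# 80-20 split: 4800 training, 1200 test reviews
```

Under that assumption, an accuracy of 80.25% on the 80–20 split would correspond to 963 of the 1,200 test reviews being classified correctly.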

Research limitations/implications

This research showcases how the polarity of potentially popular reviews can be predicted. From the authors' perspective, this can help the hotel industry take corrective measures for the betterment of business and promote useful positive reviews. The study also has some limitations: only English reviews are considered, and the data are restricted to the Tripadvisor website, although new data may be generated to test the credibility of the model. Only aspect-based sentiment classification is considered in this study.

Originality/value

A stacking machine learning technique is proposed. First, state-of-the-art classifiers are tested on the given data; then, the three best-performing classifiers (decision tree C5.0, random forest and support vector machine) are stacked to create the ISD classifier.
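The general idea of stacking can be sketched in a few lines of Python. The abstract does not say which meta-learner the ISD classifier uses or how the stack is trained, so everything here is an assumption for illustration: the toy `Stump` learners merely stand in for C5.0, random forest and the support vector machine, and `TableMeta` is a deliberately simple meta-learner that learns from the base learners' predictions on held-out data.

```python
from collections import Counter, defaultdict

class Stump:
    """Toy base learner: thresholds one feature midway between the class means."""
    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        pos = [row[self.feature] for row, label in zip(X, y) if label == 1]
        neg = [row[self.feature] for row, label in zip(X, y) if label == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        self.threshold = (mean_pos + mean_neg) / 2
        self.high_is_pos = mean_pos >= mean_neg
        return self

    def predict(self, X):
        return [int((row[self.feature] > self.threshold) == self.high_is_pos)
                for row in X]

class TableMeta:
    """Toy meta-learner: maps each tuple of base predictions to its most common label."""
    def fit(self, Z, y):
        votes = defaultdict(Counter)
        for z, label in zip(Z, y):
            votes[tuple(z)][label] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in votes.items()}
        self.default = Counter(y).most_common(1)[0][0]  # fallback for unseen tuples
        return self

    def predict(self, Z):
        return [self.table.get(tuple(z), self.default) for z in Z]

def base_predictions(base_learners, X):
    """One tuple of base-learner predictions per input row."""
    return list(zip(*(learner.predict(X) for learner in base_learners)))

def fit_stack(base_learners, meta, X_base, y_base, X_meta, y_meta):
    """Fit base learners on one partition, then the meta-learner on their held-out outputs."""
    for learner in base_learners:
        learner.fit(X_base, y_base)
    meta.fit(base_predictions(base_learners, X_meta), y_meta)

def predict_stack(base_learners, meta, X):
    return meta.predict(base_predictions(base_learners, X))

# Hypothetical numeric features; 1 = positive polarity, 0 = negative polarity.
X_base, y_base = [[1, 5, 0], [2, 6, 1], [8, 1, 9], [9, 2, 8]], [0, 0, 1, 1]
X_meta, y_meta = [[1, 6, 1], [9, 1, 8], [2, 5, 9], [8, 2, 0]], [0, 1, 0, 1]

learners = [Stump(0), Stump(1), Stump(2)]
meta = TableMeta()
fit_stack(learners, meta, X_base, y_base, X_meta, y_meta)
print(predict_stack(learners, meta, [[0, 7, 0], [9, 0, 9]]))  # [0, 1]
```

Training the meta-learner on a partition the base learners have not seen is a common design choice, since it keeps the meta-learner from simply trusting base models that have memorized their own training data.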

Citation

Trivedi, S.K., Singh, A. and Malhotra, S.K. (2023), "Prediction of polarities of online hotel reviews: an improved stacked decision tree (ISD) approach", Global Knowledge, Memory and Communication, Vol. 72 No. 8/9, pp. 765-778. https://doi.org/10.1108/GKMC-12-2021-0197

Publisher

Emerald Publishing Limited

Copyright © 2022, Emerald Publishing Limited
