
Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case

Biplab Bhattacharjee (Information Systems and Analytics Area, Indian Institute of Management, Shillong, India and Jindal Global Business School, O.P. Jindal Global University, Sonepat, India)
Kavya Unni (Department of Management, Amrita Vishwa Vidyapeetham − Amritapuri Campus, Amritapuri, India)
Maheshwar Pratap (Department of Management, Amrita Vishwa Vidyapeetham − Amritapuri Campus, Amritapuri, India)

Journal of Systems and Information Technology

ISSN: 1328-7265

Article publication date: 3 September 2024

Issue publication date: 15 November 2024

Abstract

Purpose

Product returns are a major challenge for e-businesses because they involve substantial logistical and operational costs, which makes it crucial to predict returns in advance. This study aims to evaluate different families of classifiers for predicting the chance of a product return and to further optimize the best-performing model.

Design/methodology/approach

An e-commerce data set with categorical attributes is used for this study. Chi-square-based feature selection yields a reduced feature set that serves as input for model building. Predictive models are built using individual classifiers, ensemble models and deep neural networks. For performance evaluation, a 75:25 train/test split and 10-fold cross-validation are used. To improve the predictive performance of the best-performing classifier, hyperparameter tuning is performed using several optimization methods: random search, grid search, the Bayesian approach and evolutionary models (genetic algorithm, differential evolution and particle swarm optimization).
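The feature-scoring and evaluation steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the column names (`payment`, `colour`, `returned`), the toy values and the helper functions are hypothetical, and the chi-square score is assumed to be the Pearson statistic on a feature-by-label contingency table, which the abstract does not specify.

```python
import random
from collections import Counter


def chi_square_score(feature, label):
    """Pearson chi-square statistic between two categorical sequences."""
    n = len(feature)
    observed = Counter(zip(feature, label))
    feature_totals = Counter(feature)
    label_totals = Counter(label)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            stat += (observed.get((f, l), 0) - expected) ** 2 / expected
    return stat


def holdout_split(n_rows, test_fraction=0.25, seed=0):
    """Shuffle row indices and return (train, test) for a 75:25 split."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_test = round(n_rows * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(n_rows, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


# Hypothetical toy data: 'payment' tracks the label exactly, 'colour' does not.
payment = ["cod", "cod", "card", "card", "cod", "cod", "card", "card"]
colour = ["red", "blue", "red", "blue", "red", "blue", "red", "blue"]
returned = [1, 1, 0, 0, 1, 1, 0, 0]

scores = {
    "payment": chi_square_score(payment, returned),
    "colour": chi_square_score(colour, returned),
}
# Keep the highest-scoring features; here only the top one is retained.
selected = sorted(scores, key=scores.get, reverse=True)[:1]

train_idx, test_idx = holdout_split(100)
folds = list(k_fold_indices(100, k=10))
```

A higher score indicates a stronger association between a feature and the return label, so ranking by score and keeping the top features gives a reduced feature set. In the holdout strategy the model is fitted once on `train_idx`; in 10-fold cross-validation it is fitted ten times, each time testing on a different tenth of the rows.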

Findings

A comparison of F1-scores revealed that the Bayesian approach outperformed all other optimization approaches in terms of accuracy. The predictive performance of the Bayesian-optimized model is then compared experimentally with that of the other classifiers. The Bayesian-optimized XGBoost model performed best, with accuracies of 77.80% and 70.35% for the holdout and 10-fold cross-validation methods, respectively.
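The findings report both F1-scores and accuracies, and the two metrics need not agree. The following sketch shows how each is computed from the same predictions. The labels and predictions are made-up toy values, and the assumption that returns (label 1) form the minority class is illustrative only; the abstract does not state the class balance.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 for the positive class) for two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical toy labels: 1 = order returned, 0 = order kept.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that never predicts a return still looks acceptable on accuracy.
never_returns = [0] * 10
acc_a, f1_a = accuracy_and_f1(y_true, never_returns)

# A model that finds two of the three returns, with one false alarm.
finds_returns = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]
acc_b, f1_b = accuracy_and_f1(y_true, finds_returns)
```

In this toy case the first model reaches 70% accuracy with an F1-score of zero, which is why F1 is a common selection criterion when the class of interest is rare.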

Research limitations/implications

Given the anonymized data, the effects of individual attributes on outcomes could not be investigated in detail. The Bayesian-optimized predictive model may be used in decision support systems, enabling real-time prediction of returns and the implementation of preventive measures.

Originality/value

There are very few reported studies on predicting the chance of order return in e-businesses. To the best of the authors’ knowledge, this study is the first to compare different optimization methods and classifiers, demonstrating the superiority of the Bayesian-optimized XGBoost classification model for returns prediction.

Citation

Bhattacharjee, B., Unni, K. and Pratap, M. (2024), "Bayesian-optimized extreme gradient boosting models for classification problems: an experimental analysis of product return case", Journal of Systems and Information Technology, Vol. 26 No. 4, pp. 495-527. https://doi.org/10.1108/JSIT-06-2020-0120

Publisher: Emerald Publishing Limited

Copyright © 2024, Emerald Publishing Limited
