A methodology for classification and validation of customer datasets
Journal of Business & Industrial Marketing
ISSN: 0885-8624
Article publication date: 28 September 2020
Issue publication date: 25 May 2021
Abstract
Purpose
The purpose of this paper is to develop a method to classify customers according to their value to an organization. This process is complicated by the disconnected nature of a customer record in an industry such as insurance. With large numbers of customers, it is of significant benefit to managers and company analysts to create a broad classification for all customers.
Design/methodology/approach
The initial step is to construct a full customer history and extract a feature set suited to customer lifetime value calculations. This feature set must then be validated to determine its ability to classify customers in broad terms.
Findings
The method successfully classifies customer data sets with an accuracy of 90%. This study also discovered that by examining the average value for key variables in each customer segment, an algorithm can label the group of clusters with an accuracy of 99.3%.
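As a rough illustration of the labelling idea in the findings, the following minimal sketch ranks clusters by the average of one key variable and assigns a broad label to each. It is not the authors' algorithm: the function `label_clusters`, the field names `cluster` and `annual_premium`, the label set and the sample records are all hypothetical and chosen only for the example.

```python
from collections import defaultdict
from statistics import mean


def label_clusters(records, key, labels=("low", "medium", "high")):
    """Rank clusters by the mean of `key` and give each a broad label.

    Hypothetical helper: the abstract does not specify the key variables
    or the labelling rule, so both are assumptions here.
    """
    by_cluster = defaultdict(list)
    for rec in records:
        by_cluster[rec["cluster"]].append(rec[key])

    # Order cluster ids from lowest to highest average value.
    ranked = sorted(by_cluster, key=lambda c: mean(by_cluster[c]))
    n = len(ranked)

    # Spread the ranked clusters evenly across the available labels.
    return {c: labels[i * len(labels) // n] for i, c in enumerate(ranked)}


# Invented sample data: three clusters with differing average premiums.
records = [
    {"cluster": 0, "annual_premium": 120.0},
    {"cluster": 0, "annual_premium": 150.0},
    {"cluster": 1, "annual_premium": 900.0},
    {"cluster": 1, "annual_premium": 1100.0},
    {"cluster": 2, "annual_premium": 400.0},
]
cluster_labels = label_clusters(records, "annual_premium")
```

In practice several key variables would likely be combined, and the label bands would be chosen to suit the business context.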
Research limitations/implications
In a real-world data set, some features are inevitably unavailable because they were never recorded. This can impair the algorithm’s ability to classify every customer well.
Originality/value
The authors believe this research makes a novel contribution: it automates the classification of customers and, in addition, provides both a high-level classification result (recall and precision identify the best cluster configuration) and detailed insights into how each customer is classified by two validation metrics. This supports managers in decisions on market spend for new and existing customers.
Acknowledgements
This research work was funded by Science Foundation Ireland under grant numbers: SFI/12/RC/2289 and SFI/12/RC/2289-P2.
The authors would also like to acknowledge the insightful feedback and level of detail provided by the anonymous reviewers.
Citation
Nie, D., Cappellari, P. and Roantree, M. (2021), "A methodology for classification and validation of customer datasets", Journal of Business & Industrial Marketing, Vol. 36 No. 5, pp. 821-833. https://doi.org/10.1108/JBIM-02-2020-0077
Publisher
Emerald Publishing Limited
Copyright © 2020, Emerald Publishing Limited