Gradient boosting learning for fraudulent publisher detection in online advertising
Data Technologies and Applications
ISSN: 2514-9288
Article publication date: 17 November 2020
Issue publication date: 12 April 2021
Abstract
Purpose
Analysis of publisher behavior plays a vital role in identifying fraudulent publishers in the pay-per-click model of online advertising. However, the vast amount of raw user click data, much of it with missing values, poses a challenge in analyzing the conduct of publishers. High-cardinality categorical attributes, which take many possible values, further aggravate the problem.
Design/methodology/approach
In this paper, gradient tree boosting (GTB) learning is used to address the challenges encountered in learning the publishers' behavior from raw user click data and effectively classifying fraudulent publishers.
Findings
The results demonstrate that GTB effectively classified fraudulent publishers and exhibited significantly improved performance compared with other learning methods in terms of average precision (60.5%), recall (57.8%) and F-measure (59.1%).
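As a quick consistency check on the reported figures, the sketch below computes the F-measure as the harmonic mean of precision and recall. It assumes the standard F1 definition; the function name `f_measure` is illustrative and not taken from the paper. An F-measure averaged over folds or classes need not equal the harmonic mean of the averaged precision and recall, so agreement here is a sanity check rather than a reproduction of the authors' computation.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the Findings, in percent.
reported_precision = 60.5
reported_recall = 57.8

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 1))  # 59.1
```

The result matches the reported F-measure of 59.1% to one decimal place.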
Originality/value
The experiments were conducted using a publicly available multiclass raw user click dataset and eight other imbalanced datasets to test GTB's generalization behavior, with training and testing done using 10-fold cross-validation. The performance of GTB was evaluated using average precision, recall and F-measure. The performance of GTB learning was also compared with eleven other state-of-the-art individual and ensemble classification models.
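To illustrate how 10-fold cross-validation partitions an imbalanced dataset, here is a minimal pure-Python sketch. It assumes a stratified split, which keeps class proportions similar in every fold; the abstract does not say whether the authors stratified. The function `stratified_k_fold` and the toy labels are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with similar class proportions per fold."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin across the folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the remaining folds form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical imbalanced toy labels: 90 majority-class and 10 minority-class samples.
labels = ["majority"] * 90 + ["minority"] * 10
splits = list(stratified_k_fold(labels, k=10))

for train_idx, test_idx in splits:
    minority_in_test = sum(labels[i] == "minority" for i in test_idx)
    print(len(train_idx), len(test_idx), minority_in_test)
```

In each of the ten iterations a classifier would be fitted on `train_idx` and scored on `test_idx`, and the per-fold precision, recall and F-measure would then be averaged.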
Acknowledgements
Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.
Conflict of interest: All authors declare that they have no conflict of interest.
Citation
Sisodia, D. and Sisodia, D.S. (2021), "Gradient boosting learning for fraudulent publisher detection in online advertising", Data Technologies and Applications, Vol. 55 No. 2, pp. 216-232. https://doi.org/10.1108/DTA-04-2020-0093
Publisher
Emerald Publishing Limited
Copyright © 2020, Emerald Publishing Limited