Gradient boosting learning for fraudulent publisher detection in online advertising

Deepti Sisodia (Department of Computer Science and Engineering, National Institute of Technology, Raipur, India)
Dilip Singh Sisodia (Department of Computer Science and Engineering, National Institute of Technology, Raipur, India)

Data Technologies and Applications

ISSN: 2514-9288

Article publication date: 17 November 2020

Issue publication date: 12 April 2021

Abstract

Purpose

Analysis of publishers' behavior plays a vital role in identifying fraudulent publishers in the pay-per-click model of online advertising. However, the vast amount of raw user click data with missing values poses a challenge in analyzing the conduct of publishers. The presence of high-cardinality categorical attributes, each with many possible values, further aggravates the issue.

Design/methodology/approach

In this paper, gradient tree boosting (GTB) learning is used to address the challenges encountered in learning the publishers' behavior from raw user click data and effectively classifying fraudulent publishers.
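
As a rough illustration of the idea behind gradient tree boosting, the sketch below fits a sequence of one-split regression trees (stumps) to the negative gradient of the log loss. It is a minimal binary, single-feature sketch under stated assumptions, not the authors' implementation: the toy data, the function names, the learning rate and the use of plain mean-residual leaf values are all hypothetical choices made for illustration, and the paper's task is multiclass.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, targets):
    """Fit a one-split regression tree by least squares.

    Assumes xs contains at least two distinct values.
    Returns (threshold, left_value, right_value).
    """
    best = None
    # Skip the largest value so both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def fit_gtb(xs, ys, n_rounds=50, learning_rate=0.3):
    """Boost stumps on the log-loss residuals y - p (the negative gradient).

    Assumes ys holds 0/1 labels and both classes are present.
    """
    prior = sum(ys) / len(ys)
    base_score = math.log(prior / (1.0 - prior))  # initial log-odds
    scores = [base_score] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        scores = [
            s + learning_rate * (left_value if x <= threshold else right_value)
            for x, s in zip(xs, scores)
        ]
    return base_score, learning_rate, stumps

def predict_proba(model, x):
    """Probability of the positive class for a single feature value."""
    base_score, learning_rate, stumps = model
    score = base_score
    for threshold, left_value, right_value in stumps:
        score += learning_rate * (left_value if x <= threshold else right_value)
    return sigmoid(score)

# Hypothetical toy data: one numeric feature per publisher, 1 = fraudulent.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_gtb(xs, ys)
predictions = [int(predict_proba(model, x) > 0.5) for x in xs]
```

Each round adds a small correction in the direction that reduces the loss, which is why many weak trees can combine into a strong classifier. Production libraries add deeper trees, multiclass losses, subsampling and regularization on top of this core loop.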

Findings

The results demonstrate that GTB effectively classified fraudulent publishers and exhibited significantly improved performance compared with the other learning methods in terms of average precision (60.5%), recall (57.8%) and f-measure (59.1%).
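
The three reported figures are consistent with the f-measure being the harmonic mean of precision and recall. The short check below shows the arithmetic; whether the authors derived the f-measure from the averaged precision and recall, or averaged per-class f-measures instead, is an assumption not settled by the abstract, and the function name is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values reported in the abstract, in percent.
reported_precision = 60.5
reported_recall = 57.8
harmonic_mean = f_measure(reported_precision, reported_recall)
rounded = round(harmonic_mean, 1)  # 59.1, matching the reported f-measure
```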

Originality/value

The experiments were conducted using a publicly available multiclass raw user click dataset and eight other imbalanced datasets to test GTB's generalizing behavior, while training and testing were done using 10-fold cross-validation. The performance of GTB was evaluated using average precision, recall and f-measure. The performance of GTB learning was also compared with eleven other state-of-the-art individual and ensemble classification models.
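
To make the evaluation measures concrete, the sketch below computes class-averaged (macro) precision, recall and f-measure for a multiclass prediction. Macro averaging is assumed here because it weights rare classes equally, which suits imbalanced data; the abstract does not state which averaging scheme the authors used, and the class labels and predictions are hypothetical.

```python
def macro_scores(y_true, y_pred):
    """Return (precision, recall, f_measure), each averaged over classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f_measures = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_score = 2 * precision * recall / (precision + recall)
        else:
            f_score = 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_measures.append(f_score)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f_measures) / n

# Hypothetical labels for six publishers; "ok" is the majority class.
y_true = ["ok", "ok", "ok", "ok", "observation", "fraud"]
y_pred = ["ok", "ok", "ok", "fraud", "observation", "fraud"]
avg_precision, avg_recall, avg_f_measure = macro_scores(y_true, y_pred)
```

In a 10-fold cross-validation setting, these scores would typically be computed on each held-out fold and then averaged across the ten folds.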

Acknowledgements

Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of interest: All authors declare that they have no conflict of interest.

Citation

Sisodia, D. and Sisodia, D.S. (2021), "Gradient boosting learning for fraudulent publisher detection in online advertising", Data Technologies and Applications, Vol. 55 No. 2, pp. 216-232. https://doi.org/10.1108/DTA-04-2020-0093

Publisher: Emerald Publishing Limited

Copyright © 2020, Emerald Publishing Limited