Inpainting forgery detection using hybrid generative/discriminative approach based on bounded generalized Gaussian mixture model

Abdullah Alharbi (College of Computers and Information Technology, Taif University, Taif, Saudi Arabia)
Wajdi Alhakami (College of Computers and Information Technology, Taif University, Taif, Saudi Arabia)
Sami Bourouis (College of Computers and Information Technology, Taif University, Taif, Saudi Arabia) (Université de Tunis El Manar, LR-SITI Laboratoire Signal, Image et Technologies de l'Information, Tunis, Tunisia)
Fatma Najar (Université de Tunis El Manar, LR-SITI Laboratoire Signal, Image et Technologies de l'Information, Tunis, Tunisia)
Nizar Bouguila (The Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, Canada)

Applied Computing and Informatics

ISSN: 2634-1964

Article publication date: 29 July 2020

Issue publication date: 5 January 2024


Abstract

We propose in this paper a novel, reliable detection method to recognize forged inpainted images. Detecting potential forgeries and authenticating the content of digital images is extremely challenging and important for many applications. The proposed approach involves developing new probabilistic support vector machine (SVM) kernels from a flexible generative statistical model named the "bounded generalized Gaussian mixture model". The developed learning framework has the advantage of properly combining the benefits of both discriminative and generative models and of including prior knowledge about the nature of the data. It can effectively recognize whether an image has been tampered with and distinguish forged from authentic images. The obtained results confirm that the developed framework performs well on numerous inpainted images.

Citation

Alharbi, A., Alhakami, W., Bourouis, S., Najar, F. and Bouguila, N. (2024), "Inpainting forgery detection using hybrid generative/discriminative approach based on bounded generalized Gaussian mixture model", Applied Computing and Informatics, Vol. 20 No. 1/2, pp. 89-104. https://doi.org/10.1016/j.aci.2019.12.001

Publisher: Emerald Publishing Limited

Copyright © 2019, Abdullah Alharbi, Wajdi Alhakami, Sami Bourouis, Fatma Najar and Nizar Bouguila

License

Published in Applied Computing and Informatics. Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) license. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this license may be seen at http://creativecommons.org/licences/by/4.0/legalcode


1. Introduction

With the advancements in computer-related technologies, more and more multimedia data are created in digital form, allowing easy control over the handling, collection, and storage of such data. These rapid technological advances have amplified the amount of multimedia data generated every day, and analyzing such huge volumes is generally known as the big data analysis problem. This is especially true for visual data (images and videos) generated on the Web and in social networks. Such multimedia data can be used, for example, to discover digital evidence before, during, or after a cyber-security attack has occurred [1]. This scenario is often called digital forensics. The challenging problem in this case is ensuring the completeness of the evidence, which is considered one of the high-priority principles that must be taken into account in any forensic investigation. Indeed, if some parts are deleted from an original image, it becomes very difficult to investigate it forensically. Furthermore, any missing data can lead to erroneous investigation conclusions and make the evidence lose its credibility in a court of law. Today, digital image forensics has become an emerging research field due to this large amount of generated digital image files [2,3]. In this context, digital images and videos can be used, for example, for inpainting purposes. Indeed, image (resp. video) inpainting [4–6] is one of the interesting techniques used for data restoration, as it can recover lost and deteriorated information. Inpainting is known as the problem of filling in missing content in an image in order to repair possible damage or manipulation. This research topic has a positive impact on a variety of applications such as restoring old corrupted films and movies, producing new stories, and improving the quality of noisy movies.

Existing inpainting techniques can be divided into two categories: interpolation-based methods and exemplar-based methods. The interpolation-based methods fill in the missing areas by interpolating the available data, mostly using non-linear partial differential equations (PDEs) [7]. The exemplar-based methods, on the other hand, are very similar to texture synthesis-based image inpainting methods, except that the texture (structure) sampling task is carried out by considering all the frames and also the moving objects, not only a single image as in the case of texture synthesis-based image inpainting techniques [4]. Criminisi's algorithm [4] is one of the best-known exemplar-based inpainting approaches. Nevertheless, image inpainting techniques might be exploited to alter and delete content for malicious motives and can be used to generate so-called forged images. Forgery aims at duplicating or hiding some parts of an image. Indeed, given recent technological progress, it has become extremely easy to manipulate digital images, and so it is difficult to guarantee their authenticity. In many cases, it is impossible for a viewer to judge the authenticity of a given image. Therefore, it is increasingly important to guarantee the integrity of the vast volumes of multimedia data before using them in many situations, such as courts of law.

Providing efficient methods for digital forgery detection was and still is a fundamental goal for researchers seeking to establish the authenticity and origin of digital multimedia content. Digital authenticity detection (e.g. inpainting, cloning, resampling, etc.) is becoming one of the attractive research areas in image (resp. video) processing [8], and this problem can be viewed as that of consistently distinguishing between forged and original images (resp. videos). Figure 1 illustrates an example of copy-move forgery.

2. Related works and motivations

Basically, there are two major approaches for protecting digital multimedia data against tampering: active [10] and passive [11] approaches. In the active case, the tampered region can be detected using watermarking techniques, while passive techniques have been developed in response to the limitations of watermarking.

In the recent past, some efforts have been devoted to this hot topic. Many of them are block-based methods that apply a feature extraction step to each block. Some effective block-based methods were developed to detect forgeries using robust features such as DCT, DWT, SVD (singular value decomposition), and PCA (principal component analysis) [12–15,11]. For instance, DWT and SVD are combined and matched to detect duplicated regions in [15]. DCT is also used to extract features and to improve the detection precision in [13]. Zernike moments, which are invariant to rotations, are applied in [16] to judge tampering. Other techniques exploit inconsistencies or unnaturally high coherence observed in an image to detect duplicated elements. For instance, in [17] the authors proposed a method based on inspecting lighting conditions. In another effort [14], the authors used the correlation characteristics between segments to detect duplicated regions. In [18], pattern noise is computed and then used to compare the studied image against a reference pattern. Other works proposed to handle local transformations using keypoint-based methods such as the scale-invariant feature transform (SIFT) features and descriptor algorithms, which are broadly applied in computer vision applications [19–21]. These algorithms are based on local feature extraction and are generally less time-consuming. In [19], the authors proposed to apply the scale- and rotation-invariant SIFT in order to extract and match similar local features; an agglomerative hierarchical clustering is then performed to identify multiple cloned and forged regions. SIFT is also applied in [20] to determine keypoints and their visual features, which are used in conjunction with the RANSAC algorithm. An iterative detection algorithm was developed in [22] to localize duplicated regions based on both keypoint-based and block-based methods so as to enhance the expected results; furthermore, falsely matched regions are removed through a newly designed filtering algorithm. To improve copy-move forgery detection accuracy, some recent works incorporate a segmentation step [23,24]. Indeed, in [23] the computational complexity is reduced by segmenting a suspicious image into independent patches; more specifically, SIFT and SURF are employed to identify salient points, and the expectation–maximization (EM) algorithm is used to filter false patches and to increase the detection accuracy. In [9], interest points are extracted and clustered on the basis of some geometric constraints; a multi-scale analysis process is then utilized to examine the generated clusters and to detect duplicated regions. Even though most of the proposed methods are considered robust against blur and noise and can help in detecting forgery, most of them are computationally expensive, fail to handle images containing smooth and low-contrast regions, and may not be able to detect some complex attacks (e.g. inpainting forgery). Moreover, extending these methods to the case of video forgery detection is not straightforward.

Artificial intelligence and machine learning are growing areas of research that offer potential benefits for solving difficult problems such as forgery detection. Statistical approaches provide a formal way for image modelling and classification [25,26]. In particular, finite mixture models have attracted great interest among other approaches [25,27–29]. Recently, some developed mixtures were applied successfully to the forgery detection problem [21,30]. In this context, we address the problem of image forgery detection by investigating a recently developed mixture model named the finite bounded generalized Gaussian mixture model (BGGMM) [31,32]. The consideration of the BGGMM is encouraged by recently reported results showing that this model is more effective for data classification and modeling than conventional Gaussian mixtures [31,26]. Indeed, the BGGMM has the advantage of fitting data with different shapes defined within a bounded support while maintaining the goodness of fit. However, a crucial problem when deterministic algorithms (such as the EM algorithm) are used to learn generative mixture models (in particular the BGGMM) is their convergence to saddle points and their dependency on the initialization step [31,26,25].

3. A hybrid of bounded generalized Gaussian mixture model and SVM

To cope with the limitations of generative models, one can consider using discriminative models instead. Indeed, it has been shown in many contexts that discriminative classifiers (e.g. SVM) generally perform better than generative models [33–35]. However, traditional discriminative classifiers also fail to reach high performance for all possible applications. In particular, the main problem with standard SVM kernels is that they are unable to take the nature of the training data into account and therefore cannot properly handle data such as those involved in image forgery detection. For all these reasons, we propose in this manuscript to develop a more powerful generative-discriminative approach based on newly implemented kernels. The key idea is to combine the strengths of both generative and discriminative models in a single hybrid framework. In particular, we intend to develop more appropriate and flexible SVM kernels from the generative bounded generalized Gaussian mixture model (BGGMM) for tampering detection. The ultimate goal is to improve the SVM's capability in forgery detection, which is a challenging task, and to achieve better performance through this new hybrid learning approach. To the best of our knowledge, the generation of new SVM kernels from bounded generalized Gaussian mixtures and their application to the problem of image forgery detection have never been tackled before.

In the past, some standard SVM kernels (such as the linear, RBF, and polynomial kernels) have been proposed [36]; it should be noted that these kernels pay no attention to the intrinsic structure of the input data. As indicated in [37], if the selected kernel is constructed directly from the data, the process yields more effective classification performance. Among the most valuable probabilistic kernels in this context, we can cite the information-based kernel (also known as the Kullback-Leibler kernel), the Fisher kernel, and the Bhattacharyya-based kernel [37–39]. In the following sections we present our new probabilistic SVM kernels derived from the bounded generalized Gaussian mixture model to deal with the forgery detection challenge.

3.1 Bounded generalized Gaussian mixture model

Although conventional Gaussian and generalized Gaussian mixture models have previously achieved acceptable results for data modelling and classification [40,41], their distributions are not bounded, which limits their performance. Yet many real-life applications and data sources have a bounded support. Consequently, considering this constraint when building mixture models will enhance classification results and detection precision. In this work, we propose to investigate a flexible mixture model known as the bounded generalized Gaussian mixture model for forgery detection. This model was proposed earlier in [31] for image segmentation; in this paper we investigate the BGGMM in conjunction with discriminative approaches. The bounded generalized Gaussian distribution (BGGD) is considered among the recent successful distributions for machine learning and pattern recognition applications [31,26], and it generalizes several other statistical distributions such as the Gaussian, uniform, and Laplacian distributions. It has the advantage of better fitting different non-Gaussian shapes.

Let $\mathcal{X}=(X_1,\ldots,X_N)$ be a set of $N$ images. Each image $X_i$ is represented by a $d$-dimensional vector generated by a linear combination of $M$ bounded generalized Gaussian distributions as follows:

(1) $p(X_i|\Theta)=\sum_{j=1}^{M} p_j\,\psi(X_i|\mu_j,\sigma_j,\lambda_j)$

where $p_j$ is the mixing parameter that, for each component $j$, satisfies $0\le p_j\le 1$ and $\sum_{j=1}^{M}p_j=1$, and $\Theta=(\mu_j,\sigma_j,\lambda_j)$ is the set of parameters of component $j$, represented by a BGGD characterized by the mean parameter $\mu_j=(\mu_{j1},\ldots,\mu_{jd})$, the covariance matrix $\sigma_j=(\sigma_{j1},\ldots,\sigma_{jd})$, and the shape parameter $\lambda_j=(\lambda_{j1},\ldots,\lambda_{jd})$.

(2) $\psi(X_i|\mu_j,\sigma_j,\lambda_j)=\dfrac{p(X_i|\mu_j,\sigma_j,\lambda_j)\,H(X_i|\Omega_j)}{\int_{\delta_j} p(X|\mu_j,\sigma_j,\lambda_j)\,dX}$

where $p(X_i|\mu_j,\sigma_j,\lambda_j)$ is the probability density function of the generalized Gaussian distribution (GGD), given by:

(3) $p(X_i|\mu_j,\sigma_j,\lambda_j)=\prod_{k=1}^{d} F(\lambda_{jk})\exp\!\left(-G(\lambda_{jk})\left|\dfrac{X_{ik}-\mu_{jk}}{\sigma_{jk}}\right|^{\lambda_{jk}}\right)$

where

$F(\lambda_{jk})=\dfrac{\lambda_{jk}\left[\Gamma(3/\lambda_{jk})/\Gamma(1/\lambda_{jk})\right]^{1/2}}{2\,\sigma_{jk}\,\Gamma(1/\lambda_{jk})},\qquad G(\lambda_{jk})=\left[\dfrac{\Gamma(3/\lambda_{jk})}{\Gamma(1/\lambda_{jk})}\right]^{\lambda_{jk}/2},$

$\Gamma(\cdot)$ is the gamma function, $H(X_i|\Omega_j)$ is the indicator function, and $\delta_j$ is the bounded support region defined for each component $\Omega_j$:

(4) $H(X_i|\Omega_j)=\begin{cases}1 & \text{if } X_i\in\delta_j\\ 0 & \text{otherwise}\end{cases}$
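For illustration, the following Python sketch (our own, not code from the paper) evaluates the bounded generalized Gaussian density of Eq. (2) for a one-dimensional component: the unbounded GGD of Eq. (3) is truncated by the indicator of Eq. (4) and renormalized through a numerical approximation of the integral over the support region; the mixture of Eq. (1) is then a weighted sum of such components. The support bounds and parameter values below are placeholders.

```python
import numpy as np
from math import gamma

def ggd_pdf(x, mu, sigma, lam):
    """Unbounded generalized Gaussian density of Eq. (3), one dimension."""
    F = lam * np.sqrt(gamma(3.0 / lam) / gamma(1.0 / lam)) / (2.0 * sigma * gamma(1.0 / lam))
    G = (gamma(3.0 / lam) / gamma(1.0 / lam)) ** (lam / 2.0)
    return F * np.exp(-G * np.abs((x - mu) / sigma) ** lam)

def bounded_ggd_pdf(x, mu, sigma, lam, support=(-1.0, 1.0), grid=10_000):
    """Bounded GGD of Eq. (2): truncate to the support (indicator of Eq. (4)) and renormalize."""
    lo, hi = support
    x = np.asarray(x, dtype=float)
    inside = (x >= lo) & (x <= hi)                      # H(X | Omega_j)
    xs = np.linspace(lo, hi, grid)
    Z = np.trapz(ggd_pdf(xs, mu, sigma, lam), xs)       # numerical integral over delta_j
    return np.where(inside, ggd_pdf(x, mu, sigma, lam) / Z, 0.0)

# Mixture of Eq. (1): weighted sum of bounded components (placeholder parameter values).
weights = [0.6, 0.4]
params = [(0.0, 0.3, 2.0), (0.5, 0.2, 1.2)]             # (mu, sigma, lambda) per component
x = np.linspace(-1.0, 1.0, 5)
mixture = sum(w * bounded_ggd_pdf(x, *p) for w, p in zip(weights, params))
print(mixture)
```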

3.2 Generative model estimation

Given the set of images $\mathcal{X}$, the log-likelihood corresponding to a mixture of $M$ BGG distributions is written as:

(5) $\mathcal{L}(\mathcal{X}|\Theta)=\sum_{i=1}^{N}\log\sum_{j=1}^{M} p_j\,\psi(X_i|\mu_j,\sigma_j,\lambda_j)$

The parameters of the BGGMM are learned by maximizing this log-likelihood using the expectation-maximization algorithm (Algorithm 1). During the expectation step, the posterior probability (also known as the responsibility) is calculated as:

(6) $p(j|X_i)=\dfrac{p_j\,\psi(X_i|\mu_j,\sigma_j,\lambda_j)}{\sum_{m=1}^{M} p_m\,\psi(X_i|\mu_m,\sigma_m,\lambda_m)}$

Based on these responsibilities, the mixture parameters are determined in the maximization step and are updated according to:

(7) $\Theta^{(t+1)}=\arg\max_{\Theta}\mathcal{L}(\mathcal{X}|\Theta)$

Maximizing the above equation yields the following updates (a simplified code sketch is given after them):
  • Mixing weight estimation:

(8) $p_j^{(t+1)}=\dfrac{1}{N}\sum_{i=1}^{N} p(j|X_i)$

  • Mean estimation:

(9) $\hat{\mu}_{jk}=\dfrac{\sum_{i=1}^{N} p(j|X_i)\left(|X_{ik}-\mu_{jk}|^{\lambda_{jk}-2}X_{ik}+T_{jk}\right)}{\sum_{i=1}^{N} p(j|X_i)\,|X_{ik}-\mu_{jk}|^{\lambda_{jk}-2}}$

where:

(10) $T_{jk}=\dfrac{\sum_{m=1}^{M}\operatorname{sign}(\mu_{jk}-s_{mjk})\,|\mu_{jk}-s_{mjk}|^{\lambda_{jk}-1}\,H(s_{mjk}|\Omega_j)}{\sum_{m=1}^{M} H(s_{mjk}|\Omega_j)}$

  • Covariance matrix estimation:

(11) $\hat{\sigma}_{jk}=\left[\dfrac{\lambda_{jk}\,A(\lambda_{jk})\sum_{i=1}^{N} p(j|X_i)\,|X_{ik}-\mu_{jk}|^{\lambda_{jk}}}{\sum_{i=1}^{N} p(j|X_i)\,(1+Q_{jk})}\right]^{1/\lambda_{jk}}$

where

(12) $Q_{jk}=\dfrac{\sum_{m=1}^{M}\left(1+\lambda_{jk}\,A(\lambda_{jk})\,\dfrac{|s_{mjk}-\mu_{jk}|^{\lambda_{jk}}}{(\sigma_{jk})^{\lambda_{jk}}}\right)H(s_{mjk}|\Omega_j)}{\sum_{m=1}^{M} H(s_{mjk}|\Omega_j)}$

  • Shape parameter estimation:

(13) $\hat{\lambda}_{jk}=\lambda_{jk}-\left\{\dfrac{\partial^{2}\log p(\mathcal{X}|\Theta)}{\partial\lambda_{jk}^{2}}+\gamma\right\}^{-1}\dfrac{\partial\log p(\mathcal{X}|\Theta)}{\partial\lambda_{jk}}$

where $\gamma$ denotes a scaling factor. The derivatives $\partial^{2}\log p(\mathcal{X}|\Theta)/\partial\lambda_{jk}^{2}$ and $\partial\log p(\mathcal{X}|\Theta)/\partial\lambda_{jk}$ are detailed in [31].
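The sketch below illustrates, under simplifying assumptions, how such an EM loop can be organized: the E-step implements the responsibilities of Eq. (6) and the mixing-weight update of Eq. (8) exactly, while the remaining parameters are updated by direct numerical maximization of the weighted log-likelihood instead of the fixed-point updates of Eqs. (9)-(13); `pdf` is assumed to be a bounded GGD density function such as the one sketched in Section 3.1.

```python
import numpy as np
from scipy.optimize import minimize

def e_step(X, weights, params, pdf):
    """Responsibilities p(j|X_i) of Eq. (6); pdf(x, mu, sigma, lam) is the bounded GGD density."""
    R = np.column_stack([w * pdf(X, *theta) for w, theta in zip(weights, params)])
    return R / R.sum(axis=1, keepdims=True)

def m_step(X, R, params, pdf):
    """Simplified M-step: mixing weights via Eq. (8); (mu, sigma, lambda) by direct numerical
    maximization of the weighted log-likelihood instead of the fixed-point updates (9)-(13)."""
    weights = R.mean(axis=0)
    new_params = []
    for j, (mu0, sig0, lam0) in enumerate(params):
        def neg_q(vec):
            mu, log_sig, log_lam = vec
            p = pdf(X, mu, np.exp(log_sig), np.exp(log_lam))
            return -np.sum(R[:, j] * np.log(p + 1e-300))
        res = minimize(neg_q, [mu0, np.log(sig0), np.log(lam0)], method="Nelder-Mead")
        mu, log_sig, log_lam = res.x
        new_params.append((mu, np.exp(log_sig), np.exp(log_lam)))
    return weights, new_params

def fit_bggmm(X, weights, params, pdf, n_iter=50):
    """Alternate E and M steps for a fixed number of iterations."""
    for _ in range(n_iter):
        R = e_step(X, weights, params, pdf)
        weights, params = m_step(X, R, params, pdf)
    return weights, params
```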

3.3 Deriving SVM kernels from BGGMM

A crucial step for image forgery detection is visual feature extraction. In this work we use the SURF (Speeded-Up Robust Features) descriptor [42], which is robust and invariant to geometric transformations. The proposed methodology is as follows: first, visual features are extracted using the SURF detector; each image is encoded as a vector of SURF features and then modeled through the finite mixture model (BGGMM). Secondly, kernel matrices are constructed on the basis of the mixture models; the generated kernels are the Fisher kernel, the Bhattacharyya kernel, and probabilistic distance-based kernels between these mixture models. SVM-based classifiers are then built on top of these kernels. Thus, the main objective is to investigate the forgery detection results obtained with different SVM kernels while the BGGMM forms the core of these kernels. A flowchart summarizing the proposed method is depicted in Figure 2. In the next subsections, we present all necessary details regarding the different developed kernels.
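A hypothetical outline of this pipeline is sketched below; it assumes an OpenCV build that includes the contrib `xfeatures2d` module for SURF, and the helpers `fit_bggmm_on` and `fisher_kernel` are placeholders for the mixture-fitting and kernel functions described in this section.

```python
import numpy as np
import cv2
from sklearn.svm import SVC

def surf_descriptors(image_path, hessian_threshold=400):
    """Extract SURF descriptors (requires an opencv-contrib build with xfeatures2d)."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    surf = cv2.xfeatures2d.SURF_create(hessian_threshold)
    _, descriptors = surf.detectAndCompute(img, None)
    return descriptors                        # array of shape (n_keypoints, 64)

def build_kernel_matrix(mixtures, kernel_fn):
    """Gram matrix K[i, j] = kernel_fn(mixture_i, mixture_j) for a precomputed-kernel SVM."""
    n = len(mixtures)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = kernel_fn(mixtures[i], mixtures[j])
    return K

# Assumed workflow (placeholder names): fit_bggmm_on(desc) fits a BGGMM to one image's
# descriptors; fisher_kernel / kl_kernel / bhattacharyya_kernel compare two fitted mixtures.
# mixtures = [fit_bggmm_on(surf_descriptors(p)) for p in image_paths]
# K = build_kernel_matrix(mixtures, fisher_kernel)
# clf = SVC(kernel="precomputed").fit(K, labels)     # labels: 1 = forged, 0 = authentic
```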

3.3.1 Fisher kernels

The key intuition behind the Fisher kernel [37] is that it measures the similarity of two distributions (mixtures): similar distributions imply similar log-likelihood gradients in the parameter space of the mixture. Thus, in order to construct the Fisher kernel, we need to calculate the log-likelihood gradient for every distribution or mixture. In our case, we investigate the Fisher kernel for forgery detection since it is considered a generic bridge between discriminative methods and probabilistic generative models and thus combines the benefits of both. In the following we replace the standard SVM kernels by our own Fisher kernel based on the BGGMM; the required gradients are given below, followed by a short numerical sketch.

(14) $k(X,X')=U_X^{\mathsf{T}}(\Theta)\,\mathcal{I}^{-1}(\Theta)\,U_{X'}(\Theta)$

where $U_X(\Theta)=\nabla_{\Theta}\log p(X|\Theta)$ is the Fisher score, $\mathcal{I}(\Theta)$ is the Fisher information matrix, and $Z_{ij}$ below denotes the posterior probability that $X_i$ belongs to component $j$. The gradients of the log-likelihood $\mathcal{L}$ with respect to the BGGMM parameters are:

(15) $\dfrac{\partial\mathcal{L}}{\partial\mu_{jk}}=\dfrac{B(\lambda_{jk})\,\lambda_{jk}}{\sigma_{jk}^{\lambda_{jk}}}\sum_{i=1}^{N} Z_{ij}\,|X_{ik}-\mu_{jk}|^{\lambda_{jk}-2}\left[X_{ik}-\mu_{jk}+R_j\,|X_{ik}-\mu_{jk}|^{-(\lambda_{jk}-2)}\right]$

where

(16) $R_j=\dfrac{\sum_{m=1}^{M}\operatorname{sign}(\mu_{jk}-s_{mj})\,|s_{mj}-\mu_{jk}|^{\lambda_{jk}-1}\,H(s_{mj}|\Omega_j)}{\sum_{m=1}^{M} H(s_{mj}|\Omega_j)}$

(17) $\dfrac{\partial\mathcal{L}}{\partial\sigma_{jk}}=\sigma_{jk}^{-1}\sum_{i=1}^{N} Z_{ij}\left[-1+\dfrac{B(\lambda_{jk})\,\lambda_{jk}\,|X_{ik}-\mu_{jk}|^{\lambda_{jk}}}{\sigma_{jk}^{\lambda_{jk}}}-G_j\right]$

where

(18) $G_j=\dfrac{\sum_{m=1}^{M}\left(-1+\dfrac{\lambda_{jk}\,B(\lambda_{jk})\,|s_{mj}-\mu_{jk}|^{\lambda_{jk}}}{\sigma_{jk}^{\lambda_{jk}}}\right)H(s_{mj}|\Omega_j)}{\sum_{m=1}^{M} H(s_{mj}|\Omega_j)}$

(19) $\dfrac{\partial\mathcal{L}}{\partial\lambda_{jk}}=\sum_{i=1}^{N} Z_{ij}\left[f(X_{ik}|\mu_j,\sigma_j,\lambda_j)-\dfrac{\int_{\delta_j}\psi(X|\mu_j,\sigma_j,\lambda_j)\,f(X_k|\mu_j,\sigma_j,\lambda_j)\,dX}{\int_{\delta_j}\psi(X|\mu_j,\sigma_j,\lambda_j)\,dX}\right]$

where

(20) $f(X_{ik}|\mu_j,\sigma_j,\lambda_j)=\left[\dfrac{1}{\lambda_{jk}}+\dfrac{3\,\Psi(1/\lambda_{jk})-3\,\Psi(3/\lambda_{jk})}{2\,\lambda_{jk}^{2}}\right]B(\lambda_{jk})-B(\lambda_{jk})\left|\dfrac{X_{ik}-\mu_{jk}}{\sigma_{jk}}\right|^{\lambda_{jk}}\log\left|\dfrac{X_{ik}-\mu_{jk}}{\sigma_{jk}}\right|$

with $\Psi(\cdot)$ denoting the digamma function, and

(21) $\dfrac{\partial\mathcal{L}}{\partial p_j}=\sum_{i=1}^{N}\left[\dfrac{p(j|X_i)}{p_j}-\dfrac{p(1|X_i)}{p_1}\right],\qquad j=2,\ldots,M$
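The following sketch (an illustration, not the paper's implementation) computes Fisher scores numerically and plugs them into Eq. (14); the paper itself uses the analytical gradients of Eqs. (15)-(21), and the Fisher information matrix is replaced here by the identity when it is not estimated.

```python
import numpy as np

def log_likelihood(desc, weights, params, pdf):
    """Image log-likelihood under a fitted BGGMM (Eq. (5)), summed over SURF descriptors."""
    mix = sum(w * pdf(desc, *theta) for w, theta in zip(weights, params))
    return np.sum(np.log(mix + 1e-300))

def fisher_score(desc, weights, params, pdf, eps=1e-4):
    """Numerical gradient of the log-likelihood w.r.t. the flattened mixture parameters.
    Finite differences are used only to keep the sketch short; params is a list of
    (mu, sigma, lambda) tuples and pdf is the bounded GGD density."""
    m = len(weights)
    theta = np.concatenate([np.ravel(weights)] + [np.ravel(p) for p in params])

    def ll(vec):
        w = vec[:m]
        ps = [tuple(vec[m + 3 * j: m + 3 * (j + 1)]) for j in range(len(params))]
        return log_likelihood(desc, w, ps, pdf)

    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        step = np.zeros_like(theta)
        step[i] = eps
        grad[i] = (ll(theta + step) - ll(theta - step)) / (2.0 * eps)
    return grad

def fisher_kernel(score_x, score_y, fisher_info_inv=None):
    """k(X, X') = U_X^T I^{-1} U_X' (Eq. (14)); the identity is used when I is not estimated."""
    if fisher_info_inv is None:
        return float(score_x @ score_y)
    return float(score_x @ fisher_info_inv @ score_y)
```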

3.3.2 Information divergence kernels

In order to combine the best of both discriminative methods and generative mixture models, we derive here a kernel based on the Kullback-Leibler (KL) divergence [38] between BGGMM mixtures. The key idea consists of constructing kernels on the basis of information divergences, an interesting family of measures of dissimilarity between probability distributions with direct connections to many existing probabilistic kernels. In this work, we opt for a common metric derived from the Kullback-Leibler (KL) divergence, which is one of the essential quantities in machine learning. Hence, the kernel between two probability distributions $p(X|\Theta_1)$ and $q(X|\Theta_2)$ is given as:

(22) $k\big(p(X|\Theta_1),q(X|\Theta_2)\big)=e^{-A\,D\left(p(X|\Theta_1),\,q(X|\Theta_2)\right)}$

where the factor $A>0$ is used for numerical stability, and

(23) $D\big(p(X|\Theta_1),q(X|\Theta_2)\big)=\int_{\omega}\left[p(X|\Theta_1)\log\dfrac{p(X|\Theta_1)}{q(X|\Theta_2)}+q(X|\Theta_2)\log\dfrac{q(X|\Theta_2)}{p(X|\Theta_1)}\right]dX$

Since a closed-form expression of the KL divergence between BGGMMs is not available, we resort to numerical approximation techniques such as the well-known Monte Carlo method [43,44]:

(24) $k\big(p(X|\Theta_1),q(X|\Theta_2)\big)\approx\dfrac{1}{L}\sum_{i=1}^{L}\log\dfrac{p(X_i|\Theta_1)}{q(X_i|\Theta_2)}$

where $X_1,\ldots,X_L$ are samples drawn from $p(X|\Theta_1)$.
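A minimal sketch of this approximation is given below, assuming `p` and `q` are callables returning the BGGMM densities and `sample_p`/`sample_q` draw samples from the corresponding mixtures (both helpers are assumptions on our part):

```python
import numpy as np

def mc_kl(p, q, sample_p, n=2000, rng=None):
    """Monte Carlo estimate of KL(p || q) ~ (1/L) sum log p(X_i)/q(X_i), X_i ~ p (Eq. (24))."""
    rng = np.random.default_rng(rng)
    xs = sample_p(n, rng)
    return np.mean(np.log(p(xs) + 1e-300) - np.log(q(xs) + 1e-300))

def kl_kernel(p, q, sample_p, sample_q, A=1.0, n=2000, rng=None):
    """Information-divergence kernel of Eq. (22) built on the symmetrized divergence of Eq. (23)."""
    d = mc_kl(p, q, sample_p, n, rng) + mc_kl(q, p, sample_q, n, rng)
    return np.exp(-A * d)
```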

3.3.3 Bhattacharyya kernels

In this section, we opt for another distance called the "Bhattacharyya distance" [45], which allows us to measure the similarity between two probability density functions such as BGGMMs; we name the kernel derived from this distance the Bhattacharyya kernel. As a closed form of the Bhattacharyya kernel for the bounded mixture model (BGGMM) is intractable, we proceed as in the Kullback-Leibler (KL) case by approximating the kernel using the Monte Carlo simulation method [43,44]:

(25) $k_{12}(X_1,X_2)=\int p(X|\Theta_1)^{1/2}\,q(X|\Theta_2)^{1/2}\,dX$

(26) $k_{12}(X_1,X_2)\approx\dfrac{\beta}{N_1}\sum_{i=1}^{N_1}\dfrac{q^{1/2}(X_i|\Theta_2)}{Z_1\,p^{1/2}(X_i|\Theta_1)}+\dfrac{1-\beta}{N_2}\sum_{i=1}^{N_2}\dfrac{p^{1/2}(X_i|\Theta_1)}{Z_2\,q^{1/2}(X_i|\Theta_2)}$

where $\beta\in[0,1]$ and the normalizing factors $Z_1$, $Z_2$ are used for the densities $p$ and $q$.
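The following sketch illustrates one possible Monte Carlo estimator of Eqs. (25)-(26), again assuming density callables and samplers for the two mixtures; the identity E_p[sqrt(q/p)] = integral of sqrt(p q) underlies both terms.

```python
import numpy as np

def bhattacharyya_kernel(p, q, sample_p, sample_q, beta=0.5, n=2000, rng=None):
    """Monte Carlo Bhattacharyya kernel between two mixture densities (Eqs. (25)-(26)).
    Each term importance-samples the integral of sqrt(p q) using draws from one mixture."""
    rng = np.random.default_rng(rng)
    xp = sample_p(n, rng)                                   # X_i ~ p
    xq = sample_q(n, rng)                                   # X_i ~ q
    term_p = np.mean(np.sqrt(q(xp) / (p(xp) + 1e-300)))     # E_p[sqrt(q/p)]
    term_q = np.mean(np.sqrt(p(xq) / (q(xq) + 1e-300)))     # E_q[sqrt(p/q)]
    return beta * term_p + (1.0 - beta) * term_q
```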

4. Experimental results

We carried out experiments to assess the ability of the proposed approach on the forgery detection problem. We present here a comparative study with other well-known mixture models that are broadly applied in the context of data classification. Please note that the purpose here is to evaluate our model against comparable methods and not against all state-of-the-art algorithms, which is out of the scope of this manuscript. The problem of image forgery detection can be seen as the issue of identifying forged images among authentic ones; it can therefore be treated as a classification problem.

4.1 Synthetic dataset

First of all, we investigate the efficiency and flexibility of our generative BGGMM-based method. The evaluation is done by employing two synthetic datasets that we have constructed. The first one is drawn from a mixture of two bounded generalized Gaussian distributions (BGGD) with 3000 vectors in each cluster. The second dataset is drawn from a mixture of three BGGDs, also with 3000 vectors per component. Our main focus here is to study the BGGMM's estimation performance. Both the real and the estimated parameters found using the BGGMM are given in Tables 1 and 2. Based on these tables, we find that the estimated values of the different parameters are very close to the real ones. For example, the real and estimated values for the first cluster of synthetic dataset 1 are $(\mu=4,\ \sigma=0.1,\ \lambda=2)$ and $(\hat{\mu}=3.9,\ \hat{\sigma}=0.08,\ \hat{\lambda}=2.004)$, respectively. This experiment shows that the BGGMM can accurately estimate the mixture parameters and can therefore be used successfully for the forgery detection problem.
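For reference, a simple way to generate such synthetic data is rejection sampling against a uniform proposal on the bounded support, as in the hypothetical sketch below (the support intervals are arbitrary illustrative choices of ours; the component parameters are those of Table 1):

```python
import numpy as np
from math import gamma

def sample_bounded_ggd(n, mu, sigma, lam, support=(-1.0, 1.0), rng=None):
    """Rejection sampling from a bounded GGD: propose uniformly on the support and accept
    proportionally to the (unnormalized) GGD density of Eq. (3)."""
    rng = np.random.default_rng(rng)
    lo, hi = support
    G = (gamma(3.0 / lam) / gamma(1.0 / lam)) ** (lam / 2.0)
    density = lambda x: np.exp(-G * np.abs((x - mu) / sigma) ** lam)
    dmax = density(np.linspace(lo, hi, 10_000)).max()       # rejection envelope
    out = []
    while len(out) < n:
        x = rng.uniform(lo, hi, size=n)
        u = rng.uniform(0.0, dmax, size=n)
        out.extend(x[u < density(x)])
    return np.array(out[:n])

# Example: two clusters of 3000 vectors with the parameters of Table 1
# (the support intervals below are illustrative assumptions, not values from the paper).
data = np.concatenate([
    sample_bounded_ggd(3000, mu=4.0, sigma=0.1, lam=2.0, support=(3.0, 5.0)),
    sample_bounded_ggd(3000, mu=2.0, sigma=0.5, lam=1.0, support=(0.0, 4.0)),
])
```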

4.2 Forgery detection

We apply our approach to the challenging problem of image tampering detection. Our experiments are based on two public data sets: MICC-F220 and MICC-F2000 [19].

The first dataset, MICC-F220, is composed of 220 images: 110 forged and 110 original. These images include, for example, plants, artifacts, and animals. The second publicly available dataset, MICC-F2000, is composed of 2000 images, of which 700 are tampered and 1300 are authentic. In both datasets, the forged images are obtained by inserting a rectangular patch (copy-paste operation) into the original authentic images after applying attacks such as translation, scaling, and rotation. Sample images from these datasets are given in Figures 3 and 4. In our experiments, the computed SVM kernels are trained using 10-fold cross-validation. Moreover, different numbers of components are considered to identify the most suitable value, as illustrated in Figure 5.

Results and a comparative study with respect to other state-of-the-art methods on the MICC-F220 dataset are given in Table 3. Quantitative results are reported in terms of accuracy, TPR (true positive rate), and FPR (false positive rate) when applying the different kernels generated from the bounded generalized Gaussian mixture model. The accuracy metric represents the percentage of images correctly classified as normal or forged compared to the total number of images in the dataset. The metrics used are defined as:

(27) $\mathrm{TPR}=\dfrac{TP}{TP+FN},\qquad \mathrm{FPR}=\dfrac{FP}{TN+FP},\qquad \mathrm{Accuracy}=\dfrac{TP+TN}{TP+FP+TN+FN}$
where $TP$, $FN$, $FP$, $TN$ denote the numbers of true positives, false negatives, false positives and true negatives, respectively. The objective of this study is to evaluate the implemented probabilistic kernels and to compare them with some conventional generative mixture models. Very encouraging results are obtained with our approach (see Table 3). Our proposed approach requires $O(NK)$ operations per iteration, where $N$ is the number of input images and $K$ is the number of components in the mixture; the complexity of training the SVM is $O(N^3)$.
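A small sketch of this evaluation protocol, assuming a precomputed Gram matrix K between images and binary labels y (1 = forged, 0 = authentic), could look as follows:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def forgery_metrics(y_true, y_pred):
    """TPR, FPR and accuracy as in Eq. (27); label 1 = forged (positive), 0 = authentic."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), fp / (tn + fp), (tp + tn) / len(y_true)

def cross_validate_precomputed(K, y, n_splits=10):
    """10-fold cross-validation of an SVM with a precomputed kernel (Gram) matrix K."""
    y = np.asarray(y)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = []
    for train, test in folds.split(K, y):
        clf = SVC(kernel="precomputed").fit(K[np.ix_(train, train)], y[train])
        y_hat = clf.predict(K[np.ix_(test, train)])
        scores.append(forgery_metrics(y[test], y_hat))
    return np.mean(scores, axis=0)            # average (TPR, FPR, accuracy) over the folds
```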

The first main conclusion is that the proposed hybrid framework is able to increase the performance for all considered SVM kernels as compared to conventional generative models such as the Gaussian (GMM), generalized Gaussian (GGMM), and bounded generalized Gaussian (BGGMM) mixture models. In spite of the success of generative models, especially mixture models, for certain data classification problems, our current experiments expose the limits of purely generative models and indicate that hybrid models yield a substantial improvement in detection accuracy. Therefore, there is an interest in developing hybrid approaches in order to obtain better results. The second conclusion is that the Fisher kernel generated from the bounded generalized Gaussian mixture model provides the best result. It brings a large improvement over the other kernels in terms of accuracy, which can be explained by the fact that this kernel can be considered an extension of the popular bag-of-visual-words (BOV) representation and can be computed from much smaller vocabularies. It should also be noted that, among the generative models, the bounded generalized Gaussian mixture model (BGGMM) offers the best results.

To further deepen our analysis of the developed approach, we apply it to another dataset: MICC-F2000. As in the previous experiment, forgery detection is performed using the different methods reported in Table 4, and the best results are obtained by our hybrid method, named BGGMM-Fisher kernel (i.e. the BGGMM deployed with the Fisher kernel). It achieves the best accuracy, 81%, compared with the Gaussian, generalized Gaussian, bounded GGMM, BGGMM-Kullback-Leibler, and BGGMM-Bhattacharyya kernels. In general, the proposed framework yields good forgery detection results, as illustrated in Figure 6, which summarizes the comparative study between the different methods on both the MICC-F220 and MICC-F2000 datasets. According to this plot, it is clear that the proposed hybrid framework outperforms the generative statistical models on both datasets. This can be attributed to the complexity of the datasets, for which performance is improved by combining discriminative and generative approaches.

5. Conclusion

We have developed a novel hybrid statistical approach based on both the bounded generalized Gaussian mixture model (BGGMM) and SVM kernels. The experiments concerned a challenging application, namely image forgery detection. Our choice of the BGGMM is motivated by the fact that bounded models have better modeling capabilities than conventional unbounded Gaussian-based models, have the advantage of fitting data with different shapes, and allow avoiding under- and over-fitting. Therefore, they are considered more suitable for data classification and in particular for forgery detection. On the other hand, our solution involves the development of new probabilistic support vector machine (SVM) kernels from the BGGMM. Thus, the developed learning framework has the advantage of properly combining the benefits of both discriminative and generative models.

According to the experiments, we can conclude that the developed probabilistic kernels help achieve good performance and that they outperform purely generative models (such as GMM, GGMM and BGGMM) on the current problem. Future work could be devoted to extending the current framework by integrating a feature selection mechanism able to assign a weight to each feature, with the purpose of further improving the results. Moreover, it would be interesting to investigate and validate the proposed framework on other related applications such as multimedia segmentation and image database summarization. Another possible future direction is to evaluate other relevant local, and possibly global, visual features and descriptors in order to select the most suitable ones and improve the expected results. Finally, we plan to consider other interesting benchmark datasets, such as FRITH [49], to evaluate both the current and future works.

Figures

Figure 1. Inpainting copy-move forgery example (here the bird is fully hidden). Original image on the left and copy-move forgery on the right [9].

Figure 2. Flowchart of the proposed method.

Figure 3. Sample images from MICC-F220. The first row shows original authentic examples; the second and third rows show tampered examples.

Figure 4. Sample images from MICC-F2000. The first row shows original authentic examples; the second and third rows show tampered examples.

Figure 5. Accuracy rates using different numbers of components for MICC-F220 (Fisher kernel).

Figure 6. Comparative analysis of the results obtained with the different methods.

Table 1. Real and estimated parameters of synthetic dataset 1 using the BGGMM.

| Synthetic data | μ | σ | λ |
|---|---|---|---|
| X1 | 4 | 0.1 | 2 |
| X2 | 2 | 0.5 | 1 |

| Estimated data | μ̂ | σ̂ | λ̂ |
|---|---|---|---|
| X1 | 3.9 | 0.08 | 2.004 |
| X2 | 1.92 | 0.44 | 1.92 |

Table 2. Real and estimated parameters of synthetic dataset 2 using the BGGMM.

| Synthetic data | μ | σ | λ |
|---|---|---|---|
| X1 | 1 | 0.3 | 2 |
| X2 | 2 | 0.5 | 1 |
| X3 | 0.5 | 1 | 1.5 |

| Estimated data | μ̂ | σ̂ | λ̂ |
|---|---|---|---|
| X1 | 1.5 | 0.19 | 2.001 |
| X2 | 1.86 | 0.67 | 1.9 |
| X3 | 1.1 | 1.02 | 1.99 |

Table 3. Accuracies (%) for forgery detection using different approaches on the MICC-F220 dataset (only TPR and FPR are reported for the literature methods).

| Models | Accuracy (%) | TPR (%) | FPR (%) |
|---|---|---|---|
| Zernike [46] | – | 20.91 | 6.36 |
| GoDeep [9] | – | 45.45 | 41.82 |
| Zandi [22] | – | 78.18 | 48.18 |
| Li [47] | – | 70.91 | 17.27 |
| Cozzolino [48] | – | 84.55 | 17.27 |
| GMM | 50.45 | 54.55 | 40.00 |
| GGMM | 53.64 | 64.09 | 36.36 |
| BGGMM | 55.45 | 58.33 | 45.94 |
| BGGMM-Bhattacharyya kernel | 57.65 | 58.99 | 40.96 |
| BGGMM-Kullback-Leibler | 60.19 | 63.76 | 52.20 |
| BGGMM-Fisher kernel | 80.90 | 85.42 | 17.85 |

Table 4. Accuracies (%) for forgery detection using different approaches on the MICC-F2000 dataset.

| Models | Accuracy (%) |
|---|---|
| GMM | 55.65 |
| GGMM | 55.80 |
| BGGMM | 56.00 |
| BGGMM-Kullback-Leibler | 57.30 |
| BGGMM-Bhattacharyya kernel | 67.86 |
| BGGMM-Fisher kernel | 81.00 |

References

[1] A. Almomani, M. Alauthman, F. AlBalas, O.M. Dorgham, A. Obeidat, An online intrusion detection system to cloud computing based on neucube algorithms, IJCAC 8 (2018) 96–112.

[2] H. Vahdat-Nejad, S.O. Eilaki, S. Izadpanah, Towards a better understanding of ubiquitous cloud computing, Int. J. Cloud Appl. Comput. 8 (2018) 1–20.

[3] O. Dorgham, A. Almomani, M. Alauthman, F. Albalas, A. Obeidat, An online intrusion detection system to cloud computing based on neucube algorithms, Int. J. Cloud Appl. Comput. 8 (2018) 96–112.

[4] A. Criminisi, P. Pérez, K. Toyama, Region filling and object removal by exemplar-based image inpainting, IEEE Trans. Image Processing 13 (2004) 1200–1212.

[5] J. Wu, Q. Ruan, Object removal by cross isophotes exemplar-based inpainting, in: 18th International Conference on Pattern Recognition, Hong Kong, China, 2006, pp. 810–813.

[6] A. Wong, J. Orchard, A nonlocal-means approach to exemplar-based inpainting, in: Proceedings of the International Conference on Image Processing, San Diego, California, USA, 2006, pp. 2600–2603.

[7] B. Marcelo, S. Guillermo, C. Vincent, B. Coloma, Image inpainting, in: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques SIGGRAPH, 2000, pp. 417–424.

[8] G.K. Birajdar, V.H. Mankar, Digital image forgery detection using passive techniques: a survey, Digital Investigation 10 (2013) 226–245.

[9] E. Silva, T.J. de Carvalho, A. Ferreira, A. Rocha, Going deeper into copy-move forgery detection: exploring image telltales via multi-scale analysis and voting processes, J. Visual Commun. Image Represent. 29 (2015) 16–32.

[10] C. Lu, H.M. Liao, Structural digital signature for image authentication: an incidental distortion resistant scheme, IEEE Trans. Multimedia 5 (2003) 161–173.

[11] A.C. Popescu, H. Farid, Exposing digital forgeries by detecting traces of resampling, IEEE Trans. Signal Processing 53 (2005) 758–767.

[12] C.S. Prakash, A. Kumar, S. Maheshkar, V. Maheshkar, An integrated method of copy-move and splicing for image forgery detection, Multimedia Tools Appl. 77 (2018) 26939–26963.

[13] H. Wang, H. Wang, X. Sun, Q. Qian, A passive authentication scheme for copy-move forgery based on package clustering algorithm, Multimedia Tools Appl. 76 (2017) 12627–12644.

[14] A.J. Fridrich, B.D. Soukal, A.J. Lukas, Detection of copy-move forgery in digital images, in: Proceedings of Digital Forensic Research Workshop, 2003.

[15] G. Li, Q. Wu, D. Tu, S. Sun, A sorted neighborhood approach for detecting duplicated regions in image forgeries based on DWT and SVD, in: Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007, July 2–5, 2007, Beijing, China, pp. 1750–1753.

[16] S. Ryu, M. Lee, H. Lee, Detection of copy-rotate-move forgery using zernike moments, in: Information Hiding - 12th International Conference, IH 2010, Calgary, AB, Canada, June 28–30, 2010, Revised Selected Papers, pp. 51–65.

[17] M.K. Johnson, H. Farid, Exposing digital forgeries in complex lighting environments, IEEE Trans. Inf. Forensics Security 2 (2007) 450–461.

[18] J. Lukás, J.J. Fridrich, M. Goljan, Digital camera identification from sensor pattern noise, IEEE Trans. Information Forensics Security 1 (2006) 205–214.

[19] I. Amerini, L. Ballan, R. Caldelli, A.D. Bimbo, G. Serra, A sift-based forensic method for copy-move attack detection and transformation recovery, IEEE Trans. Inf. Forensics Security 6 (2011) 1099–1110.

[20] X. Pan, S. Lyu, Region duplication detection using image feature matching, IEEE Trans. Inf. Forensics Security 5 (2010) 857–867.

[21] S. Bourouis, M.A. Mashrgy, N. Bouguila, Bayesian learning of finite generalized inverted dirichlet mixtures: application to object classification and forgery detection, Expert Syst. Appl. 41 (2014) 2329–2336.

[22] M. Zandi, A.M. Aznaveh, A. Talebpour, Iterative copy-move forgery detection based on a new interest point detector, IEEE Trans. Inf. Forensics Security 11 (2016) 2499–2512.

[23] J. Li, X. Li, B. Yang, X. Sun, Segmentation-based image copy-move forgery detection scheme, IEEE Trans. Inf. Forensics Security 10 (2015) 507–518.

[24] C. Pun, X. Yuan, X. Bi, Image forgery detection using adaptive oversegmentation and feature point matching, IEEE Trans. Inf. Forensics Security 10 (2015) 1705–1716.

[25] G. McLachlan, D. Peel, Finite mixture models, Wiley and Sons, New York, 2000.

[26] I. Channoufi, S. Bourouis, N. Bouguila, K. Hamrouni, A flexible statistical model for image denoising, in: Image Analysis and Recognition – 15th International Conference, ICIAR 2018, Póvoa de Varzim, Portugal, June 27–29, 2018, Proceedings, pp. 30–38.

[27] F. Najar, S. Bourouis, N. Bouguila, S. Belghith, A fixed-point estimation algorithm for learning the multivariate GGMM: application to human action recognition, in: 2018 IEEE Canadian Conference on Electrical & Computer Engineering, CCECE 2018, Quebec, QC, Canada, May 13–16, 2018, pp. 1–4.

[28] I. Channoufi, S. Bourouis, N. Bouguila, K. Hamrouni, Color image segmentation with bounded generalized gaussian mixture model and feature selection, in: 4th International Conference on Advanced Technologies for Signal and Image Processing, ATSIP 2018, Sousse, Tunisia, March 21–24, 2018, pp. 1–6.

[29] F. Najar, S. Bourouis, A. Zaguia, N. Bouguila, S. Belghith, Unsupervised human action categorization using a riemannian averaged fixed-point learning of multivariate GGMM, in: Image Analysis and Recognition – 15th International Conference, ICIAR 2018, Póvoa de Varzim, Portugal, June 27–29, 2018, Proceedings, pp. 408–415.

[30] S. Bourouis, F.R. Al-Osaimi, N. Bouguila, H. Sallay, F.M. Aldosari, M.A. Mashrgy, Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted dirichlet mixtures, Soft Comput. 23 (2019) 5799–5813.

[31] T.M. Nguyen, Q.J. Wu, H. Zhang, Bounded generalized gaussian mixture model, Pattern Recogn. 47 (2014) 3132–3142.

[32] I. Channoufi, S. Bourouis, N. Bouguila, K. Hamrouni, Spatially constrained mixture model with feature selection for image and video segmentation, in: Image and Signal Processing – 8th International Conference, ICISP 2018, Cherbourg, France, July 2–4, 2018, Proceedings, pp. 36–44.

[33] V.N. Vapnik, V. Vapnik, Statistical learning theory, vol. 1, Wiley, New York, 1998.

[34] C.M. Bishop, Pattern recognition and machine learning, Springer, 2006.

[35] H. Sallay, S. Bourouis, Intrusion detection alert management for high-speed networks: current researches and applications, Security Commun. Networks 8 (2015) 4362–4372.

[36] V. Vapnik, The nature of statistical learning theory, Springer Science & Business Media (2013).

[37] T.S. Jaakkola, D. Haussler, Exploiting generative models in discriminative classifiers, in: Advances in Neural Information Processing Systems 11, [NIPS Conference, Denver, Colorado, USA, November 30 – December 5, 1998], pp. 487–493.

[38] P.J. Moreno, P.P. Ho, N. Vasconcelos, A kullback-leibler divergence based kernel for svm classification in multimedia applications, in: Advances in neural information processing systems, 2004, pp. 1385–1392.

[39] C.H. You, K. Lee, H. Li, GMM-SVM kernel with a bhattacharyya-based distance for speaker recognition, IEEE Trans. Audio Speech Language Process. 18 (2010) 1300–1312.

[40] X. Yang, S.M. Krishnan, Image segmentation using finite mixtures and spatial information, Image Vision Comput. 22 (2004) 735–745.

[41] M.S. Allili, N. Bouguila, D. Ziou, Finite general gaussian mixture modeling and application to image and video foreground segmentation, J. Electronic Imaging 17 (2008) 013005.

[42] H. Bay, A. Ess, T. Tuytelaars, L.V. Gool, Speeded-up robust features (surf), Comput. Vis. Image Underst. 110 (2008) 346–359.

[43] N. Bouguila, Deriving kernels from generalized dirichlet mixture models and applications, Inf. Process. Manage. 49 (2013) 123–137.

[44] A.B. Chan, N. Vasconcelos, P.J. Moreno, A family of probabilistic kernels based on information divergence, Univ. California, San Diego, CA, Tech. Rep. SVCL-TR-2004-1, 2004.

[45] T. Jebara, R. Kondor, Bhattacharyya and expected likelihood kernels, in: Learning Theory and Kernel Machines: in Proc. of the 16th Annual Conference on Learning Theory (COLT), Springer, 2003, pp. 57–71.

[46] V. Christlein, C. Riess, J. Jordan, C. Riess, E. Angelopoulou, An evaluation of popular copy-move forgery detection approaches, IEEE Trans. Inf. Forensics Security 7 (2012) 1841–1854.

[47] Y. Li, J. Zhou, A. Cheng, X. Liu, Y.Y. Tang, SIFT keypoint removal and injection via convex relaxation, IEEE Trans. Inf. Forensics Security 11 (2016) 1722–1735.

[48] D. Cozzolino, G. Poggi, L. Verdoliva, Efficient dense-field copy-move forgery detection, IEEE Trans. Inf. Forensics Security 10 (2015) 2284–2297.

[49] P. Rosin, FRITH Dataset, accessed December 1, 2019.

Acknowledgements

The authors would like to thank the Deanship of Scientific Research at Taif University, Kingdom of Saudi Arabia, for their funding support under Grant No. 1-439-6116.

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Publisher's note: The publisher wishes to inform readers that the article "Inpainting forgery detection using hybrid generative/discriminative approach based on bounded generalized Gaussian mixture model" was originally published by the previous publisher of Applied Computing and Informatics and the pagination of this article has been subsequently changed. There has been no change to the content of the article. This change was necessary for the journal to transition from the previous publisher to the new one. The publisher sincerely apologises for any inconvenience caused. To access and cite this article, please use Alharbi, A., Alhakami, W., Bourouis, S., Najar, F., Bouguila, N. (2019), "Inpainting forgery detection using hybrid generative/discriminative approach based on bounded generalized Gaussian mixture model", Applied Computing and Informatics, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1016/j.aci.2019.12.001. The original publication date for this paper was 19/12/2019.

Corresponding author

Abdullah Alharbi can be contacted at: amharbi@tu.edu.sa
