The application of statistical inversion theory provides a powerful approach for solving estimation problems including the ability for uncertainty quantification (UQ) by means of Markov chain Monte Carlo (MCMC) methods and Monte Carlo integration. This paper aims to analyze the application of a state reduction technique within different MCMC techniques to improve the computational efficiency and the tuning process of these algorithms.
A reduced state representation is constructed from a general prior distribution. For sampling the Metropolis Hastings (MH) Algorithm and the Gibbs sampler are used. Efficient proposal generation techniques and techniques for conditional sampling are proposed and evaluated for an exemplary inverse problem.
For the MH-algorithm, high acceptance rates can be obtained with a simple proposal kernel. For the Gibbs sampler, an efficient technique for conditional sampling was found. The state reduction scheme stabilizes the ill-posed inverse problem, allowing a solution without a dedicated prior distribution. The state reduction is suitable to represent general material distributions.
The state reduction scheme and the MCMC techniques can be applied in different imaging problems. The stabilizing nature of the state reduction improves the solution of ill-posed problems. The tuning of the MCMC methods is simplified.
The paper presents a method to improve the solution process of inverse problems within the Bayesian framework. The stabilization of the inverse problem due to the state reduction improves the solution. The approach simplifies the tuning of MCMC methods.
Neumayer, M., Suppan, T. and Bretterklieber, T. (2019), "Statistical solution of inverse problems using a state reduction", COMPEL - The international journal for computation and mathematics in electrical and electronic engineering, Vol. 38 No. 5, pp. 1521-1532. https://doi.org/10.1108/COMPEL-12-2018-0500
Emerald Publishing Limited
Copyright © 2019, Markus Neumayer, Thomas Suppan and Thomas Bretterklieber.
Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode
Inverse problems are referred to as highly dimensional and ill-posed estimation problems (Tarantola, 2004), where information about a state ϕ should be inferred from noisy measurements d̃∈ℝM. In electrical engineering the inverse problems of electrical impedance tomography (EIT) (Holder, 2005), electrical resistivity tomography (ERT) (Scott and McCann, 2005) and electrical capacitance tomography (ECT) (Jaworski and Dyakowski, 2001) form prominent inverse problems. In ECT the spatial dielectric material distribution has to be estimated from electrical measurements. A common approach is to use the discretization D : ϕ ↦ x, yielding to the state vector x ∈ ℝN, which has to be estimated. The ill-posed nature makes the solution of inverse problems a hard problem. A root cause for the ill-posed nature can be found by the fact, that for many ill-posed problems the number of unknowns N exceeds the number of measurements M by far. The stable solution of ill-posed inverse problems requires the incorporation of prior knowledge (Kaipio and Somersalo, 2005) or the application of regularization techniques (Vogel, 2002). The distinction between prior knowledge and regularization is made between the fields of statistical inversion theory and deterministic inversion theory.
Statistical inversion theory is based on a probabilistic modeling approach of the underlying measurement process (Watzenig and Fox, 2009). All variables are considered to be multivariate random variables, which are modeled by means of probability density functions (pdfs). Specifically, the prior distribution π(x) and the likelihood function π(d̃|x) have to be formed. The likelihood function π(d̃|x) is formed from a noise model and a model about the measurement process P:ϕ↦d̃, which we refer to as forward map F(x). The posterior distribution π(x|d̃)∝π(d̃|x)π(x) is formed using Bayes law. Given this formulation different estimation algorithms can be designed like the maximum a posteriori (MAP) estimator. The MAP estimate is given at the maximum of the posterior distribution. It can be computed by means of numerical optimization schemes. A review for different reconstruction methods is given in (Cui et al., 2016).
Figure 1 sketches the Bayesian scheme. The photographes included in Figure 1 depict an actual ECT reconstruction experiment. Given the true material distribution x in the image space, the measurement process provides the data d ∈ ℝM in the data space, which is corrupted by measurement noise leading to the data d̃. The gray ellipse around d̃ in Figure 1 illustrates the uncertainty of the measurement process in the data space. Subsequently also the reconstruction result exhibits a certain degree of uncertainty. This uncertainty is illustrated by the gray ellipse in the image space. For the characterization of the solution space and uncertainty quantification (UQ), estimators like the conditional mean xCM = E(X) or the variance ΣX = E ((X – μCM)(X – xCM)T) can be used, where E(·) denotes the expectation operator. Hereby the posteriori distribution π(x|d̃) has to be estimated (Fox and Nicholls, 1997). The evaluation of the expectation can be done by means of Monte Carlo integration:
Model reduction techniques are an approach to reduce the computational load (Kaipio and Somersalo, 2007). Hereby the computational expensive forward map is replaced by an approximated forward map (Lipponen et al., 2013). A different strategy is the reduction of the problem dimension by means of a state reduction scheme (Neumayer et al., 2017). The full state vector x is constructed by a projection PNR:xR ↦ x, where the reduced state vector xR∈ℝNR is of significant lower dimension than the full state vector x. In Neumayer et al. (2017), the construction of a linear state reduction scheme, i.e. x = PNRxR, using prior knowledge is presented.
In this paper, we present the use of a state reduction within MCMC methods. We show the versatile applicability of the state reduction scheme for general distributions and present the construction of an MH-algorithm and a Gibbs sampler. We further demonstrate the stabilizing effect of the state reduction, i.e. we will demonstrate the capability of image reconstruction without a dedicated prior distribution.
This paper is structured as follows. In Section 2, we briefly introduce the inverse problem of ECT and discuss the applied state reduction scheme. We show the general applicability of the technique and address the stabilizing properties of the state reduction. In Section 3, we show two MCMC algorithms and discuss the adoptions for the inverse problem of ECT. In Section 4, we present results of a numerical study.
2. Inverse problem of electrical capacitance tomography and state reduction
2.1 Electrical capacitance tomography
In this section, we briefly discuss the ECT measurement process, leading to the ECT forward map F, which simulates the measurement process. Figure 1 includes a sketch of an ECT sensor. The sensor consists of a nonconductive pipe, with Nelec electrodes mounted at the outer circumference. The sensor is shielded by a screen. The capacitances between the electrodes depend on the dielectric material distribution ϕ inside the pipe, which is referred to as region of interest Ω ROI. The capacitances between the electrodes can be determined by applying an AC signal to an electrode and measuring the displacement currents at the remaining electrodes. This process is repeated for all electrodes, leading to independent measurements. We denote the measurement process by P:ϕ ↦ d̃, where d̃∈ℝM are the noisy measurements in the data space. The simulation of the measurement process is based on solving the potential equation ∇ · (ε0εr∇V) = 0 for the corresponding boundary conditions and computing the capacitances. V is the scalar potential and ε0εr is the permittivity. In this work we will use the finite element method for the simulation. It is common to use the finite elements inside ΩROI for the discretization D: ϕ ↦ x. Calibration strategies are used to compensate model errors between the forward map F and the measurement process P.
2.2 Prior-based state reduction
This subsection briefly discusses the construction of the prior based state reduction of form x = PxR, where xR is the reduced state vector. Detailed information can be found in Neumayer et al. (2017). The construction of the state reduction is based on the concept of a sample based prior distribution, where samples for possible material distributions are generated by means of a random number generator. Figure 2 depicts exemplary samples. Hereby Gaussian bumps have been placed arbitrarily inside Ω ROI. Computing a sufficiently large number of these samples and storing them by means of the matrix Xpattern gives a sample based representation of the prior distribution. Then the matrix P can be constructed by , where the vectors ui are the column vectors of the matrix U from the singular value decomposition (SVD) U ΣVT = XPattern of the matrix XPattern. NR denotes the dimension of the state reduction. Using this scheme, the state reduction is typically capable to reduce the dimension of the estimation problem to about 10 per cent of the original size (Neumayer et al., 2017). This has been found by an analysis of error due to the approximation and by the trend of the singular values of the SVD. However, in a visual comparison often a smaller state reduction can be found to be sufficient.
2.3 State reduction for different distributions
A concern with respect to the application of the state reduction is the capability to represent material distributions, which differ from the prior distribution used for the creation of the state reduction. Figure 3 presents an exemplary study. Figure 3(a) shows four material distributions, which deviate from the prior distribution of Gaussian bumps. This holds particularly for the third and the fourth distribution, but all samples also have sharp boundaries, which is a significant deviation from the Gaussian bumps. Figure 3(b) depicts a constraint least squares approximation of the distribution using the state reduction constructed from Gaussian bumps. The state reduction is capable to represent the material distributions with sufficient accuracy. We hereby refer to the typical image quality of soft field tomography systems, e.g. see Figure 1. The dimensions of the full state vectors for this example is N = 606. The dimension of the reduced state vector is 6 per cent of the dimension of the full state vector. Hence, even a better reduction ratio is possible.
2.4 Stabilization of inverse problem
A remarkable feature of the state reduction is depicted in Figure 4(a). We computed the reduced representation for the samples of the matrix XPattern and formed a Gaussian summary statistics by means of the mean vector μXR and the covariance matrix . CΣXR is the Cholesky decomposition of ΣXR. Then we created random samples for the full state vector x by , where v is a random vector of corresponding dimension, which is drawn for a standard multivariate Gaussian distribution, i.e. v ∝N(0, I). Figure 4(a) depicts four samples produced by this scheme. We added a positive constant to fix the lower bound to 1 for each sample. For comparison Figure 4(a) depicts samples, which have been created by the same procedure, but for the full state vector. The samples depicted in Figure 4(a) show feasible material distributions, whereas the samples depicted in Figure 4(b) show no structure within the material distribution.
The state reduction scheme carries the intrinsic information about the material distribution. Hence, the application of the state reduction incorporates prior information, but by means of the matrix P instead of a dedicated prior π(PxR). Hence, the state reduction stabilizes the inverse problem. We will use this in our numerical study, where we do not maintain a dedicated prior distribution for the solution of the inverse problem.
3. Markov chain Monte Carlo methods for Bayesian inference
In this section, we briefly address two MCMC algorithms and discuss the adoptions for the inverse problem of ECT, which are the Metropolis Hastings (MH) algorithm (Hastings, 1970) and the Gibbs-sampler (Geman and Geman, 1984). The purpose of MCMC methods is to create samples of a target distribution π(x|d̃) by constructing a Markov chain X. Then the samples can be used to evaluate expectations by means of Monte Carlo integration given by equation (1).
3.1 Metropolis Hastings algorithm
The MH-algorithm is given by:
Pick the current state x = Xn of the Markov chain.
With proposal density q(x, x′) generate a new state x′.
Compute the acceptance probability
With probability α accept x′ and Xn+1 = x′, otherwise reject x′ and set Xn+1 = x.
The MH-algorithm is simple to implement, yet it is known to have drawbacks for large-scale inverse problems. For every proposal x′, the forward map has to be evaluated, which causes high computational costs, in particular when a large number of proposals are rejected. Researchers have stated acceptance rates of 20 to 30 per cent as typical values for large-scale inverse problems (Kaipio et al., 2000). For the proposal generation using the state reduction scheme, we maintain the column vectors of the matrix P, which leads to:
3.2 Gibbs sampler
The Gibbs sampler is given by:
Pick the current state x = Xn from the Markov chain.
For every element i of the vector x:
Set xi of x as independent variable, while all other elements are fixed; and
Draw from the conditional distribution.
Set Xn+1 = x.
While in the MH-algorithm proposal candidates can be rejected, the Gibbs sampler draws a sample from the conditional distribution in every iteration. Executing the Gibbs sampler over all elements of the state vector is referred to as a scan, which then produces a new sample for the Markov chain. Using the relation (2), the conditional distribution has to be evaluated for support points within the feasible range of a. This requires a simulation of the forward map for each support point, which results in the high computational costs of the algorithm. The application of the state reduction scheme enables a reduction of the computational costs by the same ration as the state reduction.
3.2.1 Sampling from the conditional distribution.
Given the non-parametric representation of the conditional distribution a sample has to be generated for the scaling variable a. Figure 5 depicts some ensembles of the logarithm of the conditional distribution for our later problem. Note, that the individual distributions have been scaled to be depicted within a single plot. For each trend Ncond. s. = 10 support points have been evaluated.
Drawing a sample from the non-parametric representation can be done by different schemes like inverse transform sampling, the MH-algorithm, or dedicated sampling techniques like rejection sampling (Richardson et al., 1996). The later schemes require additional computational effort. In inverse transform sampling the scaling parameter a is computed by solving:
For a, where u is drawn from the distribution u(0,1). The application of inverse transform sampling requires the evaluation of the integral of the conditional distribution. Considering the non-parametric description of the conditional distributions, the accurate evaluation of inverse transform sampling is critical. Considering the trends of the logarithm of the conditional distribution as depicted by the ensembles in Figure 5, we propose to use a quadratic approximation for the logarithm of the conditional distribution. The parameters for the quadratic approximation can be estimated by means of a least squares fit using the pseudo-inverse. Given the approximation, the relation:
4. Numerical study
In this section, we present reconstruction experiments for material distributions depicted in Figure 3(a). We simulate an ECT-sensor with Nelec = 16 electrodes, which leads to M = 120 independent measurements. For the reconstruction, we use a mesh with about N = 606 finite elements within the ROI. The state reduction is again of dimension NR = 36. The simulation data d̃ have been generated on a mesh with about N = 2000 finite elements within the ROI to avoid inverse crime data. The simulated data have been corrupted by additive i.i.d. white Gaussian noise. The signal to noise ratio (SNR) had a lower value of approximately 50 dB. The likelihood function is given by:
4.1 Reconstruction results
Figure 7 depicts the estimated mean and the standard deviation for the reconstruction experiments, which have been computed by means of Monte Carlo integration from the Markov chains. Data from a burn in phase have been neglected for the evaluation. The results depicted in Figure 7 are the results obtained for the Gibbs sampler. The results for the MH-algorithm show no difference. Figure 7(a) shows the estimated conditional mean; Figure 7(b) shows the uncertainty of the results by means of the standard deviation. All material distributions can be reconstructed using the proposed approach. The standard deviation gives information about the uncertainty of the estimated results. It can be seen that the uncertainty increases at the center of the pipe and the boundaries of the objects. This behavior corresponds to the soft field nature of the electric field used for sensing in ECT. The good reconstruction behavior proofs the stabilizing properties of the state reduction, as the inverse problem is solved without a dedicated prior.
4.2 Analysis of the Markov chains
In this subsection, we analyze the behavior of the Markov chains. Figure 8 depicts the logarithm of the posteriori distribution for the MH-algorithm and the Gibbs sampler. From the trends, it can be seen, that both algorithms reach convergence after a short burn in phase. This can be seen by the noise-like behavior of the signals. For the MH-algorithm, we obtained a typical acceptance rate in the range of 75 per cent. This is remarkable considering the simple proposal generator used for this reconstruction experiment.
For a more detailed analysis, we studied the correlation behavior of the Markov chains using the method presented in (Wolff, 2004). Figure 9 depicts an exemplary analysis result for the second reconstruction experiment, using the Gibbs sampler. We evaluated the integrated auto correlation time (IACT) τIACT, which is a measure for the number of iterations to obtain an independent sample. As depicted in Figure 9, the IACT for the Gibbs sampler is τIACT,Gibbs = 25. For the MH-algorithm, we obtained an IACT of τIACT,MH = 832. For the MH-algorithm the IACT corresponds to the number of forward map evaluations NF to obtain an independent sample. For the Gibbs-sampler the number of forward map evaluations NF is given by FF = NR · τIACT,Gibbs · Ncond.s, where Ncond.s. is the number of support points for the conditional sampling. The numbers reveal the computational costs of employing MCMC methods. Table I lists the IACTs for the different reconstruction experiments. Yet the good convergence of the chains indicates the benefit of the state reduction. Assuming the same IACT the computational costs for a Gibbs sampler, which operates on the full state space is increased by the ratio of N/NR. The computational overhead of both algorithms is minimal due to the simple sampling schemes, which holds in particular for the Gibbs sampler, where the inverse transform sampling is applied.
In this work, we demonstrated the application of a state reduction scheme within MCMC algorithms for the statistical solution of inverse problems. The state reduction is constructed by means of a sampling based prior. We demonstrated the versatility of the state reduction for different material distributions. We presented the implementation of a MH-algorithm and a Gibbs sampler using the state reduction and showed the successful reconstruction of different material distribution, including an uncertainty quantification. For the MH-algorithm, the state reduction allows the implementation of a simple proposal generator leading to high acceptance rates. For the Gibbs sampler, we presented an inverse transform sampling scheme to sample from the conditional distribution. A remarkable property of the state reduction is its stabilizing capability for the solution of ill-posed inverse problems, which we demonstrated by neglecting a dedicated prior in our reconstruction experiments. The results were presented for the inverse problem of ECT, yet the procedure can be applied for other inverse problems.
Evaluation or the IACT for the four reconstruction experiments
Brooks, S., Gelman, A., Jones, G. and Meng, X.L. (2011), Handbook of Markov Chain Monte Carlo, CRC press.
Cui, Z., Wang, Q., Xue, Q., Fan, W., Zhang, L., Cao, Z., Sun, B., Wang, H. and Yang, W. (2016), “A review on image reconstruction algorithms for electrical capacitance/resistance tomography”, Sensor Review, Vol. 36 No. 4, pp. 429-445.
Fox, C. and Nicholls, G. (1997), Sampling Conductivity Images via MCMC, University of Leeds, p. 91.
Geman, S. and Geman, D. (1984), “Stochastic relaxation, gibbs distributions, and the bayesian restoration of images”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-6 No. 6, pp. 721-741.
Hastings, W. (1970), “Monte carlo sampling using markov chains and their applications”, Biometrika, Vol. 57 No. 1, pp. 97-109.
Holder, D.S. (2005), Electrical Impedance Tomography: Methods, History and Applications, Institute of Physics Publishing, Philadelphia.
Jaworski, A.J. and Dyakowski, T. (2001), “Application of electrical capacitance tomography for measurement of gas-solids flow characteristics in a pneumatic conveying system”, Measurement Science and Technology, Vol. 12 No. 8, pp. 1109-1119.
Kaipio, J. and Somersalo, E. (2005), Statistical and Computational Inverse Problems, Vol. 160, Applied Mathematical Sciences. Springer.
Kaipio, J. and Somersalo, E. (2007), “Statistical inverse problems: discretization, model reduction and inverse crimes”, Journal of Computational and Applied Mathematics, Vol. 198 No. 2, pp. 493-504. available at: http://dx.doi.org/10.1016/j.cam.2005.09.027
Kaipio, J., V., Kolehmainen, E. and Somersalo, V.M. (2000), “Statistical inversion and Monte Carlo sampling methods in electrical impedance tomography”, Inverse Problems, Vol. 16 No. 5, p. 1487, available at: http://stacks.iop.org/0266-5611/16/i=5/a=321
Lipponen, A., Seppänen, A. and Kaipio, J.P. (2013), “Electrical impedance tomography imaging with reduced-order model based on proper orthogonal decomposition”, Journal of Electronic Imaging, Vol. 22 No. 2, p. 23008.
Neumayer, M., Bretterklieber, T., Flatscher, M. and Puttinger, S. (2017), “Pca based state reduction for inverse problems using prior information”, COMPEL - The International Journal for Computation and Mathematics in Electrical and Electronic Engineering, Vol. 36 No. 5, pp. 1430-1441.
Richardson, S., Spiegelhalter, D. and Gilks, W. (1996), Markov Chain Monte Carlo in Practice: Interdisciplinary Statistics (Chapman and Hall/CRC Interdisciplinary Statistics), 1st ed., Chapman and Hall/CRC.
Scott, D.M. and McCann, H. (2005), Process Imaging for Automatic Control (Electrical and Computer Enginee), CRC Press, Boca Raton, FL.
Tarantola, A. (2004), Inverse Problem Theory and Methods for Model Parameter Estimation, Society for Industrial and Applied Mathematics, Philadelphia, PA.
Vogel, C.R. (2002), Computational Methods for Inverse Problemss, Society for Industrial and Applied Mathematics (SIAM), Philadelphia.
Watzenig, D. and Fox, C. (2009), “A review of statistical modelling and inference for electrical capacitance tomography”, Measurement Science and Technology, Vol. 20 No. 5, pp. 1-22.
Wolff, U. (2004), “Monte carlo errors with less errors”, Computer Physics Communications, Vol. 156 No. 2, pp. 143-153. available at: www.sciencedirect.com/science/article/B6TJ5-4B3NPMC-3/2/94bd1b60aba9b7a9ea69ac39d7372fc5
This work is funded by the FFG Project (Bridge 1) TomoFlow under the FFG Project Number 6833795 in cooperation with voestalpine Stahl GmbH.