MCMC and GLMs for estimating regression parameters: Evidence from non-life Egyptian insurance sector

Purpose – The purpose of this study is to estimate linear regression parameters using two alternative techniques: the generalized linear model (GLM) and the Markov Chain Monte Carlo (MCMC) method. Design/methodology/approach – The authors adopted the incurred claims of the Egyptian non-life insurance market over a 10-year period as the dependent variable. MCMC uses Gibbs sampling to generate a sample from the posterior distribution of a linear regression in order to estimate the parameters of interest. The authors used R to estimate the parameters of the linear regression with both techniques. Findings – These procedures will guide decision-makers in estimating reserves and setting a proper investment strategy. Originality/value – The authors estimate the parameters of a linear regression model using the MCMC method in R. MCMC uses Gibbs sampling to generate a sample from the posterior distribution of a linear regression, and the estimated parameters are used to predict future claims. These procedures will guide decision-makers in estimating reserves and setting a proper investment strategy.


Introduction
Modeling random events is one of the most vital research areas in insurance and actuarial science. In insurance in particular, modeling and predicting the amount of claims is extremely important to both insurers and academics. The Bayesian approach is one of the best statistical methods for estimating outstanding claims. In Bayesian modeling, we distinguish between the observable quantities and the unknown parameters, which are treated as random variables. The Bayesian approach combines prior information with the given data to estimate the posterior distribution, which can then be used to describe the model parameters through the mean, median, percentiles, point estimates and credible intervals.
Markov Chain Monte Carlo (MCMC) simulation can be used within Bayesian statistics to estimate parameters that cannot be estimated by maximum likelihood estimation (MLE) or other statistical methods. MCMC is a technique for sampling from probability mass functions or density functions. Unlike MLE and the generalized method of moments, MCMC does not require an optimization algorithm, and it provides small-sample inference for the parameters. MCMC has been extended to fit nonlinear regression models; applying it here fills a gap in the literature on the non-life insurance market. In addition, MCMC uses Gibbs sampling to generate a sample from the posterior distribution of a linear regression in order to estimate the linear regression parameters.
Similarly, generalized linear models (GLMs) can be used when the residuals are not identically distributed and when the relationship is nonlinear; a GLM uses a transformation to linearize the regression. GLMs are considered an extension of the ordinary least squares method for cases where the variances are not equal (i.e. heteroscedastic models). The aim of this paper is to estimate the linear regression parameters using the MCMC and GLM methods for the incurred claims of the Egyptian non-life insurance market.
The data adopted in this research consist of ten years of incurred claims, from 2007/2008 to 2016/2017, for 22 Egyptian non-life insurance companies. Table I and Figure 1 describe the amount of incurred claims over this period for the Egyptian non-life insurance market. Claims increase over the period, but there is a drop between 2010/2011 and 2013/2014 due to the Egyptian revolution.
The remainder of this paper is organized as follows: Section 2 presents the literature review; Section 3 gives the methodology; Section 4 presents the models; Section 5 gives the empirical study; Section 6 concludes the paper.

Literature review
Alba (2008) used the Munich Chain Ladder (MCL) method to optimize paid and incurred claims via MCMC using WinBUGS software. The paper suggested several modifications to the MCL method and presented a Bayesian approach to it.
Jackie (2007) compared several stochastic reserving methods, such as MCMC and GLMs, by considering the structure of each model, its assumptions and its estimation. The paper applied these methods to claims data to estimate the outstanding claims, the risk margin for each individual accident and the aggregate risk margin. Pang et al. (2007) emphasized modeling loss distributions for insurance claims, using the Pareto distribution to calculate the probability of extreme claims. They used Bayesian and MCMC techniques to estimate the Pareto parameters. Scollnik (2001, 2004) reviewed several actuarial models that use the Bayesian method. He then implemented MCMC simulations for Bayesian estimation of reserves using BUGS (Bayesian inference Using Gibbs Sampling) software (e.g. WinBUGS).
Peremans et al. (2017) focused on claim reserving using GLMs on the chain ladder based on past claims, and used bootstrapping as an alternative technique to obtain inference. In addition, they estimated the distribution of risk measures using several bootstrap procedures. Boj and Costa (2017) estimated the parameters of the loss distribution and the prediction error by applying GLMs to the claim amounts of a chain ladder method. Furthermore, they used a parametric family to estimate the error distribution, and they assumed a Poisson distribution with a logarithmic link function, corresponding to the deterministic chain ladder method. Verdonck et al. (2009) illustrated how to forecast claim reserves using two methods: first, a robust chain ladder method that detects outliers; second, robust GLMs that estimate the claim reserve as if the data had no outliers. They concluded that the robust chain ladder method performs better than robust GLMs.
Carrato and Visintin (2019) introduced machine learning techniques to actuarial science as a new approach that predicts more accurately than traditional techniques. They focused on the elements of machine learning, rather than traditional forecasting techniques, for predicting property and casualty loss reserves. Ravenzwaaij et al. (2018) introduced MCMC methods as a technique for estimating posterior distributions and discussed the benefits and limitations of sampling with MCMC. Luoma et al. (2008) applied the Bayesian approach and a regression method to value American-style options, and used the MCMC method to estimate the model and parameter errors. They concluded that the proper choice of model is a vital issue in risk management. Hogg and Foreman (2018) used MCMC to estimate the density function of the posterior distribution, to fit models to data and to make probabilistic inferences. They illustrated the MCMC method and parameter estimation, and concluded that this method provides the best estimate. Zhang (2017) used Apache Spark across a cluster of computers to estimate distributions using the Bayesian approach, and applied a Bayesian hierarchical Tweedie model to big insurance claims data as a predictive model. Yu (2015) adopted a statistical model for health insurance claims in order to predict future claims. He used the generalized exponential growth model (GEGM) and estimated the parameters of the model using MCMC.
Lim (2011) applied the MCMC method to carry out Bayesian estimation of the parameters and prediction of reserves. He concluded that the MCMC method performs much better than classical methods (e.g. chain ladder and Bayesian over-dispersed Poisson model).

Data and methodology
The data adopted in this research consist of a 10-year time series of incurred claims for 22 Egyptian non-life insurance companies, as reported by the Financial Regulatory Authority (FRA) from 2007/2008 to 2016/2017. In this paper, we estimate the parameters of a linear regression model using the MCMC method in R. MCMC uses Gibbs sampling to generate a sample from the posterior distribution of a linear regression, and the estimated parameters are used to predict future claims. These procedures will guide decision-makers in estimating reserves and setting a proper investment strategy.

Linear regression
Consider the linear regression model:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
where $y$ is the dependent variable, $x$ the independent variable, $\beta_0$ and $\beta_1$ are the parameters of the model and $\varepsilon$ is white noise, $\varepsilon \sim N(0, \sigma^2)$.
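As an illustration of the linear regression model above, the following is a minimal Python sketch of the closed-form least-squares estimates of the intercept, the slope and the noise variance. The paper itself works in R; the function name `fit_simple_ols` and the synthetic data are assumptions made for illustration only and are not the Egyptian claims data.

```python
import random


def fit_simple_ols(x, y):
    """Closed-form least-squares estimates for y = b0 + b1 * x + noise."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope
    b0 = y_bar - b1 * x_bar             # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # unbiased noise variance
    return b0, b1, sigma2


# Synthetic illustration only (hypothetical values, not the paper's data):
# ten periods, true intercept 2.0, true slope 0.5, noise sd 0.1.
rng = random.Random(0)
x = list(range(1, 11))
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.1) for xi in x]
b0_hat, b1_hat, sigma2_hat = fit_simple_ols(x, y)
print(b0_hat, b1_hat, sigma2_hat)
```

With only ten observations, as in a 10-year series, the estimates can move noticeably from sample to sample, which is one motivation for also examining the full posterior distribution of the parameters.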

Generalized linear model
The GLM is built from two ingredients: a link function and a variance function. The link function relates the means of the observations to the predictors (linearization), while the variance function relates the means to the variances (Lindsey, 1997).
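The link function can be expressed by:
$$g(\mu_i) = \beta_0 + \beta_1 x_i$$
and the variance function by:
$$\operatorname{Var}(y_i) = \phi\, V(\mu_i)$$
where $\mu_i = E(y_i)$ and the dispersion parameter $\phi$ is a constant.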
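To make the roles of the link and variance functions concrete, the following is a minimal Python sketch of fitting a GLM by iteratively reweighted least squares (IRLS). The paper does not state which family or link it used, so the Poisson family with a log link (variance function equal to the mean, dispersion fixed at 1) is an assumption chosen purely for illustration, as are the function name `fit_poisson_glm` and the synthetic data.

```python
import math


def fit_poisson_glm(x, y, n_iter=25):
    """IRLS for a GLM with log link, log(mu_i) = b0 + b1 * x_i,
    and Poisson variance function V(mu) = mu (an illustrative assumption)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]          # linear predictor
        mu = [math.exp(e) for e in eta]           # inverse link
        # Working response and weights implied by the log link and V(mu) = mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares of z on x via the 2x2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx ** 2
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1


# Synthetic illustration only: noise-free means generated with b0 = 0.5, b1 = 0.2.
x = list(range(10))
y = [math.exp(0.5 + 0.2 * xi) for xi in x]
b0_hat, b1_hat = fit_poisson_glm(x, y)
print(b0_hat, b1_hat)
```

Choosing the identity link with a constant variance function would reduce the same procedure to ordinary least squares, which is the sense in which a GLM extends the linear regression model.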

Bayesian statistics
Bayesian analysis focuses on estimating the posterior distribution from the prior distribution and the likelihood function of the parameters, and then normalizing the resulting posterior distribution:
$$P(\theta \mid x) = \frac{P(x \mid \theta)\,P(\theta)}{\int P(x \mid \theta)\,P(\theta)\,d\theta}$$
Following Andrieu et al. (2003), let $X_t$ be the value of a random variable at time $t$; the possible values of $X$ form the state space. A stochastic process is a Markov process if the next state depends only on the current state:
$$P(X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \ldots, X_0 = i_0) = P(X_{t+1} = j \mid X_t = i)$$
That is, to predict the future value of the process we need only the current state. The probability of such a move is called the transition probability:
$$P(i, j) = P(i \to j) = P(X_{t+1} = j \mid X_t = i)$$
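The Markov property described above can be sketched in a few lines of Python. The two-state transition matrix `P` below is hypothetical and chosen only for illustration; the point is that each step draws the next state using the current state alone.

```python
import random

# Hypothetical transition matrix: P[i][j] is the probability of moving
# from state i to state j in one step (each row sums to 1).
P = [[0.9, 0.1],
     [0.5, 0.5]]


def simulate_chain(P, start, n_steps, seed=0):
    """Simulate a Markov chain; the next state depends only on the current one."""
    rng = random.Random(seed)
    states = list(range(len(P)))
    state = start
    path = [state]
    for _ in range(n_steps):
        state = rng.choices(states, weights=P[state])[0]
        path.append(state)
    return path


path = simulate_chain(P, start=0, n_steps=20000)
share_state_0 = path.count(0) / len(path)
print(share_state_0)
```

For this particular matrix, the long-run share of time spent in state 0 should settle near 5/6, the stationary probability implied by the two transition rows, regardless of the starting state.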

Monte Carlo
According to Andrieu et al. (2003), assume that $h(x)$ is a complex function and we need to evaluate the integral of $h(x)$:
$$\int h(x)\,dx$$
If $h(x)$ can be expressed as a function $f(x)$ multiplied by a probability density $p(x)$, then:
$$\int h(x)\,dx = \int f(x)\,p(x)\,dx = E_{p}[f(x)]$$
This is the expected value of $f(x)$ over the density function $p(x)$. If we draw a large number of observations $x_i$, $i = 1, 2, \ldots, n$, of a random variable with density function $p(x)$, then:
$$\int h(x)\,dx \approx \frac{1}{n}\sum_{i=1}^{n} f(x_i)$$
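The Monte Carlo approximation above can be sketched as follows. The helper name `monte_carlo_expectation` and the choice of a standard normal density with the function x squared are assumptions made for illustration; this case is convenient because the exact expectation is known to be 1.

```python
import random


def monte_carlo_expectation(f, sampler, n, seed=0):
    """Approximate the expected value of f under a density by averaging f over n draws."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n


# Illustration: the expected value of x^2 under a standard normal density is 1.
estimate = monte_carlo_expectation(
    f=lambda x: x * x,
    sampler=lambda rng: rng.gauss(0.0, 1.0),
    n=100_000,
)
print(estimate)
```

The error of the average shrinks roughly in proportion to one over the square root of the number of draws, so a larger number of draws gives a closer approximation.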

Markov chain Monte Carlo
In Bayesian statistics, the MCMC method is an iterative sampling technique that allows sampling from $P(\theta \mid x)$, and it is an effective approach for generating samples from the posterior distribution. The target density is the posterior density $p(\theta) = P(\theta \mid x)$, and MCMC can be applied when the posterior cannot be obtained in a tractable form. Suppose we seek the expectation
$$\mu = E_{p}[g(\theta)] = \int g(\theta)\,P(\theta \mid x)\,d\theta$$
This expectation can be approximated by averaging $g(\theta)$ over the sampled values, as in the Monte Carlo method above.
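As an illustration of how Gibbs sampling can draw from the posterior of a linear regression, the following is a minimal Python sketch. It is not the authors' R implementation: the priors (flat on the intercept and slope, and proportional to one over the variance for the noise variance), the function name `gibbs_linear_regression`, the iteration counts and the synthetic data are all assumptions made for illustration.

```python
import random


def gibbs_linear_regression(x, y, n_iter=5000, burn_in=1000, seed=0):
    """Gibbs sampler for y = b0 + b1 * x + noise, noise ~ N(0, sigma2).

    Assumed priors (illustrative): flat on b0 and b1, p(sigma2) proportional to 1/sigma2.
    """
    rng = random.Random(seed)
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    b0, b1, sigma2 = 0.0, 0.0, 1.0  # arbitrary starting values
    draws = []
    for it in range(n_iter):
        # b0 | b1, sigma2, y ~ Normal(mean(y - b1 * x), sigma2 / n)
        mean_b0 = sum(yi - b1 * xi for xi, yi in zip(x, y)) / n
        b0 = rng.gauss(mean_b0, (sigma2 / n) ** 0.5)
        # b1 | b0, sigma2, y ~ Normal(sum(x * (y - b0)) / sum(x^2), sigma2 / sum(x^2))
        mean_b1 = sum(xi * (yi - b0) for xi, yi in zip(x, y)) / sxx
        b1 = rng.gauss(mean_b1, (sigma2 / sxx) ** 0.5)
        # sigma2 | b0, b1, y ~ Inverse-Gamma(n / 2, SSR / 2):
        # draw the precision from a Gamma (gammavariate takes shape and scale) and invert it.
        ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
        precision = rng.gammavariate(n / 2.0, 2.0 / ssr)
        sigma2 = 1.0 / precision
        if it >= burn_in:  # discard early draws taken before the chain settles
            draws.append((b0, b1, sigma2))
    return draws


# Synthetic illustration only (hypothetical values, not the paper's data).
data_rng = random.Random(1)
x = list(range(1, 11))
y = [2.0 + 0.5 * xi + data_rng.gauss(0.0, 0.1) for xi in x]
draws = gibbs_linear_regression(x, y)
post_mean_b0 = sum(d[0] for d in draws) / len(draws)
post_mean_b1 = sum(d[1] for d in draws) / len(draws)
post_mean_sigma2 = sum(d[2] for d in draws) / len(draws)
print(post_mean_b0, post_mean_b1, post_mean_sigma2)
```

Each sweep updates one parameter at a time from its conditional distribution given the others, and the retained draws are then averaged exactly as in the Monte Carlo method. Because the intercept and slope are correlated when the regressor is not centered, successive draws are dependent, so more iterations are needed than the same number of independent draws would require.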

Descriptive statistics
In this section, we illustrate the statistical characteristics of the amount of incurred claims, as shown in Table II.
From Table II and Figure 2 we can see that the data are positively skewed and platykurtic relative to the normal curve. Table III presents the results of the GLM applied to incurred claims to estimate the linear regression parameters $\beta_0$ and $\beta_1$. The Akaike Information Criterion (AIC), a model selection criterion that measures the quality of the model, was found to be 285.36. In addition, Figure 3 visualizes the residuals of the model and shows the skewness of the data.
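The skewness and kurtosis readings used in the descriptive statistics can be sketched in Python as follows. The function name `shape_statistics` and the sample values are hypothetical and are not the figures in Table II; the sketch uses the moment-based definitions, which may differ slightly from the bias-corrected versions reported by some statistical packages.

```python
def shape_statistics(values):
    """Moment-based skewness and excess kurtosis of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis


# Hypothetical sample (NOT the paper's data), chosen to have a long right tail.
sample = [1.0, 1.2, 1.5, 1.9, 2.4, 3.1, 4.0, 5.6, 8.3, 9.0]
skewness, excess_kurtosis = shape_statistics(sample)
print(skewness, excess_kurtosis)
```

A positive skewness indicates a longer right tail, and a negative excess kurtosis indicates a platykurtic distribution, that is, one flatter than the normal curve.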

Markov chain Monte Carlo
In this section, we implement the MCMC method in R to estimate the parameters of the linear regression; we performed 1000 iterations of MCMC, which means we applied the …

Conclusion
MCMC simulation can be used within Bayesian statistics to estimate parameters. MCMC uses Gibbs sampling to generate a sample from the posterior distribution of a linear regression, and it was implemented here to estimate the linear regression parameters $\beta_0$, $\beta_1$ and $\sigma^2$. In this paper, we adopted the incurred claims of the Egyptian non-life insurance industry, as reported by the Financial Regulatory Authority (FRA) over the 10-year period from 2007/2008 to 2016/2017, as the dependent variable. We applied a GLM to estimate the regression parameters, and we performed 1000 iterations (i.e. 1000 Markov chains) of MCMC in R to estimate the linear regression parameters. MCMC performs many iterations of the chain when sampling, which yields more information for reaching the true values of the parameters. Moreover, these procedures will guide decision-makers in estimating reserves and setting a proper investment strategy.