Solar power generation forecasting using ensemble approach based on deep learning and statistical methods

Solar power forecasting will have a significant impact on the future of large-scale renewable energy plants. Predicting photovoltaic power generation depends heavily on climate conditions, which fluctuate over time. In this research, we propose a hybrid model that combines machine-learning methods with the Theta statistical method for more accurate prediction of future solar power generation from renewable energy plants. The machine-learning models include long short-term memory (LSTM), gated recurrent unit (GRU), AutoEncoder LSTM (Auto-LSTM) and a newly proposed Auto-GRU. To enhance the accuracy of the proposed Machine Learning and Statistical Hybrid Model (MLSHM), we employ two diversity techniques, i


Introduction
Photovoltaic (PV) technology has been one of the most common types of renewable energy technologies being pursued to fulfil the increasing electricity demand, and decreasing the
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Thanks are extended to the reviewers for their valuable comments which certainly improved the quality of the manuscript.
Publisher's note: The publisher wishes to inform readers that the article "Solar power generation forecasting using ensemble approach based on deep learning and statistical methods" was originally published by the previous publisher of Applied Computing and Informatics and the pagination of this article has been subsequently changed. There has been no change to the content of the article. This change was necessary for the journal to transition from the previous publisher to the new one. The publisher sincerely apologises for any inconvenience caused. To access and cite this article, please use AlKandari, M., Ahmad, I. (2019), "Solar power generation forecasting using ensemble approach based on deep learning and statistical methods", Applied Computing and Informatics, Vol. ahead-of-print No. ahead-of-print. https://10.1016/j.aci.2019.11.002. The original publication date for this paper was 06/11/2019. The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/2210-8327.htm
ML methods and statistical method achieved better accuracy than hybrid models that only combine machine learning models. The proposed MLSHM and Auto-GRU can be generalized to solve other time series problems such as financial markets, industrial markets, control engineering and astronomy.
The remainder of this paper is structured as follows. Section 2 presents the problem statement and the related work. Section 3 illustrates the methodology of ML models, the proposed Auto-GRU model, statistical model, and the proposed hybrid model. Section 4 shows the experimental results and performance validation of the proposed hybrid model. Finally, this research concludes in Section 5.

Problem statement and related work
The prediction problem can be formulated as follows. We are given a time series of historical weather data in the form $(x_1, x_2, \ldots, x_n)$, where $n$ represents the number of weather parameters. To predict the day-ahead solar PV power $\hat{y}$, a conventional approach is to define a mapping function $f$ between the historical weather data and the future solar PV power as follows:
$$\hat{y} = f(x_1, x_2, \ldots, x_n) \qquad (1)$$
Figure 1 shows a simple representation of the solar PV power prediction system with $n = 6$ weather parameters.
Numerous research studies have introduced ML algorithms as forecasting models in different applications related to the field of renewable energy. Several ML methodologies, such as support vector machine (SVM) [20], long short-term memory (LSTM) [21], and K-nearest neighbor (K-NN) [22], have been applied to predict solar irradiance, which can be considered the first step towards solar PV power forecasting. A gradient boosted regression tree (GBRT) model was developed by Persson et al. [3] to predict multi-site solar power generation on a forecast horizon of one to six hours ahead. The GBRT model was mainly designed for classification; however, it has been extended to regression. GBRT is an ML model that combines the output of many small regression trees of fixed size to generate a better result. Unlike conventional time series methods, their model has no updating procedure or recursive version as new observations arrive, which is a considerable limitation. The authors in [4] proposed a least absolute shrinkage and selection operator (LASSO) based forecasting model for solar power generation. The LASSO-based model assists in variable selection by minimizing the weights of less important variables and maximizing the sparsity of the overall coefficient vector. They compared the predicted solar power from their proposed algorithm with two representative schemes, SVM and a time-series based method known as the TLLE method. The results showed that the LASSO-based algorithm achieved more accurate forecasts of solar power than the representative schemes while using less training data. As a supplement to the algorithm proposed by Tang et al. [4], the authors in [5] integrated LASSO with LSTM as a forecasting model for solar intensity prediction. Their proposed model attained better performance in short-term solar intensity prediction than in long-term prediction.

Furthermore, a power forecasting model for solar power plants using an AutoEncoder (AE) and an LSTM neural network was developed in [6]. The authors used the encoding side of the AE to extract the most effective features for learning and attached it to the LSTM network. Accordingly, the LSTM uses the learned encoding as an input to predict the solar power generation of a PV plant. The model showed strong results compared to multilayer perceptrons (MLP), LSTM, deep belief networks, and AE. In contrast to [6], the authors in [7] combined the AE with LSTM into an augmented long short-term memory (A-LSTM) forecasting model. The algorithm was tested on different datasets, and it performed well on time series datasets. De et al. developed an LSTM-based model to predict PV power using a limited dataset [8]. Although the LSTM model produced accurate predictions, it was shown that increasing the amount of data and the number of features improves its performance. Wang et al. proposed a GRU-based short-term PV power forecasting algorithm [9]. The GRU model is used to reduce the long training time of the LSTM model as well as to improve the accuracy of the output. They concluded that GRU outperformed traditional models such as SVM, the autoregressive integrated moving average model (ARIMA), LSTM, and the back propagation neural network.
The authors in [23] surveyed state-of-the-art ML models used in different renewable energy systems. Moreover, they identified and classified many ML models applied in different energy applications and explored various research in different energy systems. Rana et al. presented a comprehensive comparison of various methods that are used for prediction purposes [24]. They explored six different methods: Neural Network (NN), SVM, K-NN, Multiple Linear Regression (MLR) and two persistence methods. The results showed that an ensemble of NNs is the most promising and accurate method compared to the other prediction methods. Statistical approaches have not gathered the attention of researchers as much as ML models, especially in solar PV power forecasting. As mentioned earlier, Makridakis et al. showed that the Theta method was the most accurate and simplest statistical method and performed particularly well in the M3 Competition [15].
The authors in [25] classified the ensemble methods into two categories: competitive ensemble forecasting and cooperative ensemble forecasting. Ensemble learning is a promising approach which has attracted a lot of attention in recent years. Ahmed et al. [14] proposed three different ensemble approaches to predict day-ahead solar power generation, namely: (i) linear, (ii) normal distribution, and (iii) normal distribution with additional features. They concluded that all the ensemble methods, when combined together, showed better performance than the individual ML models. Gigoni et al. compared several ML forecasting methodologies, e.g., K-NN, support vector regression (SVR), and quantile random forest, and evaluated their prediction accuracy in a solar PV power application [26]. The experimental results showed that aggregating the output of single prediction models surpassed all the ML models explored in their research under any weather condition. Feng et al. grouped the weather data using an hourly similarity-based approach and used the grouped data to train a two-layer ML hybrid model to predict one-hour-ahead solar irradiance [27]. Their results showed that the hybrid model performed better than any single ML model used in the hybrid model. Another study by Koprinska et al. presented static and dynamic approaches to ensemble the solar power predictions of NNs [28]. Their experiment showed that the ensemble approaches clearly outperformed bagging, boosting, random forest, and four single prediction models (NN, SVM, K-NN, and a persistence model).
Limited research can be found on ensemble statistical models for superior performance and accuracy. The authors in [16] explored eight ensemble techniques to combine the results of the six best models from each family of statistical models: SARIMA (36 models), ETS (30 models), MLP (1 model), STL decomposition (2 models), TBATS (72 models) and the Theta model (1 model). Although ensemble learning has shown highly efficient performance and accurate results in many papers, the results in [16] showed only a marginal enhancement over the best model, because the models tested produced highly correlated errors. This leads us to emphasize that diversity is the key to a major enhancement of ensemble methods. The authors in [29] presented a cluster-based approach applied to global solar radiation. Their approach then predicts the horizontal global solar radiation using a combination of two ML models: SVM and ANN. The results showed higher prediction accuracy compared to the conventional ANN and SVM.
In summary, we explored a large body of research in solar power forecasting that combines ML models to enhance prediction accuracy. Essentially all the research we explored confirmed that diversity is an essential and fundamental requirement for a powerful ensemble model. Moreover, it was found that most ensemble-based studies used conventional ensemble-ML methods such as bagging, AdaBoost, and stacked generalization to apply diversity, train the same ML models and aggregate the results into one complete model. However, this study incorporates the predictions resulting from ML models and the prediction resulting from the statistical model, which cannot be achieved by the conventional ML-based ensemble models. Furthermore, research that aggregates ML models and statistical models in solar PV power forecasting is nonexistent. Hence, this study could be considered the first of its kind in solar power forecasting, which adds value to this field of research.

Design methodology
In this section, proposed solutions to enhance and reinforce the performance of solar PV power forecasting algorithms are presented.

ML forecasting algorithms
In this section, we introduce the ML algorithms adopted in this study to build the hybrid model. We have chosen the most accurate ML models that have been applied in forecasting applications, namely LSTM [17], GRU [30], and Auto-LSTM [6]. Additionally, a new ML model (Auto-GRU) is proposed and added to the existing bundle of ML forecasting algorithms.
3.1.1 LSTM. Long short-term memory networks (LSTM) are special recurrent neural networks (RNNs) that were first introduced in [17]. An RNN is a neural network (NN) with recurrent connections between the neurons that enable it to learn from both the current and the previous information to find a better solution. However, when two cells in an RNN are far apart in the sequence, it is difficult to propagate useful information between them due to the vanishing and exploding gradient problems. The solution to this is a special type of neuron called a memory cell. Using these special neurons, the LSTM is able to store useful information over an arbitrary period of time. Moreover, LSTM cells have the ability to learn what data needs to be read, stored, and erased from the memory by adjusting three different controlling gates, namely the forget gate $f(t)$, the input gate $i(t)$, and the output gate $o(t)$, as shown in Figure 2.
The forget gate $f(t)$ decides whether to keep or discard the information from the cell state, as shown in Eq. (2). A logistic function generates a value between 0 and 1 for $f(t)$, where a value near 0 abandons and a value near 1 keeps the current cell state at time step $t$. Here $w$, $h$, $x$, and $b$ are the weight, output, input, and bias, respectively. The activation function is represented by $\sigma$.
On the other hand, the input gate controls what input values can be stored in the cell state, as shown in Eq. (4). Here $i(t)$ represents the signal (between 0 and 1) that controls the updating procedure, $g(t)$ is the new candidate value and $c(t)$ is the new state of the cell. Moreover, the output gate $o(t)$ is responsible for releasing the stored information to the next neurons, as shown in Eq. (6).
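For reference, the gate and cell-state updates described above can be written in the widely used standard LSTM form. This is a sketch under assumptions: the subscripted weights and biases ($w_f$, $b_f$, and so on) are naming conventions chosen to match the concatenation notation $[h(t-1), x(t)]$ of Eq. (5), and the equations are deliberately left unnumbered because their exact numbering in the paper is not confirmed here.

```latex
\begin{align*}
f(t) &= \sigma\left(w_f\,[h(t-1),\, x(t)] + b_f\right) && \text{forget gate} \\
i(t) &= \sigma\left(w_i\,[h(t-1),\, x(t)] + b_i\right) && \text{input gate} \\
c(t) &= f(t) * c(t-1) + i(t) * g(t)                   && \text{cell state update} \\
o(t) &= \sigma\left(w_o\,[h(t-1),\, x(t)] + b_o\right) && \text{output gate}
\end{align*}
```

Here $*$ denotes element-wise multiplication, as in Eq. (7), and $\sigma$ is the logistic sigmoid, so each gate takes values between 0 and 1.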

$$g(t) = \tanh\left(w_g\,[h(t-1),\, x(t)] + b_g\right) \qquad (5)$$
$$h(t) = o(t) * \tanh(c(t)) \qquad (7)$$
Hence, LSTM is a powerful ML model to capture long-term dependency as well as the non-linear relationship in a complex dataset.
3.1.2 GRU. The gated recurrent unit (GRU) is a special case of LSTM introduced by Cho et al. [30] to reduce the long training time of LSTM. Compared to LSTM, GRU has fewer controlling gates as it lacks an output gate. As shown in Figure 3, GRU is much simpler than LSTM since it includes only two gates, the reset gate and the update gate, that control the information flow inside the units. The transition functions between neurons of the GRU are expressed in terms of the following quantities: $r(t)$ is the reset gate, $z(t)$ is the update gate, and $w$ and $u$ represent the parameter matrices in GRU. Furthermore, $h(t)$, $\hat{h}(t)$, and $b$ are the output, candidate output, and bias, respectively. The activation function is represented as $\sigma$.
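The GRU transition functions are commonly written as below. This is a sketch of the standard formulation using the symbols defined in the text ($r$, $z$, $w$, $u$, $b$, $h$, $\hat{h}$); the subscripts are assumed naming, and the orientation of the update gate in the last line is a convention that varies between references (some swap the roles of $z(t)$ and $1 - z(t)$), so it may differ from the form used in the paper.

```latex
\begin{align*}
r(t)       &= \sigma\left(w_r\, x(t) + u_r\, h(t-1) + b_r\right) \\
z(t)       &= \sigma\left(w_z\, x(t) + u_z\, h(t-1) + b_z\right) \\
\hat{h}(t) &= \tanh\left(w_h\, x(t) + u_h\,\big(r(t) * h(t-1)\big) + b_h\right) \\
h(t)       &= \big(1 - z(t)\big) * h(t-1) + z(t) * \hat{h}(t)
\end{align*}
```

With only two gates and no separate cell state, each GRU unit has fewer parameters than an LSTM cell of the same size, which is consistent with the shorter training time noted above.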

Auto-LSTM.
The Auto-LSTM model proposed by [6] consists of two ML algorithms: AutoEncoder (AE) [31] and LSTM. An AE is an unsupervised neural network where the input and the output layers have the same size. An AE tries to learn the identity function so that the input $x$ is approximately similar to the output $\hat{x}$ with some constraints applied to the network, e.g., a limited number of neurons in the hidden layer compared to the input layer. Therefore, an AE acts as a compressor and a decompressor consisting of two parts separated by a bottleneck at the center: (i) encoding side, where the neurons are reduced from the input layer to the hidden layer, and (ii) decoding side, where the layers in the encoding side are reflected as shown in Figure 4.
Thus, an AE is able to learn and discover the correlations in the input features and the special structure of the data. Gensler et al. [6] have utilized the encoding side of the AE to realize the feature extraction and attached it to the LSTM model. Hence, the LSTM model is trained and fitted using the historical encoded weather data produced by the encoding side of the AE as well as the corresponding solar PV power. The end result of this Auto-LSTM model is an ML algorithm that is able to predict the solar PV power generated from a PV farm given the meteorological data.

Auto-GRU.
The proposed Auto-GRU model has similar characteristics to the Auto-LSTM model presented in [6]. As explained earlier, the AE is used to shrink the meteorological data by discovering the structure of the data, and the encoded data is attached to the GRU network. The encoded meteorological data as well as the corresponding historical solar PV power are fed to the GRU network to fit and train the neurons to predict the desired output. Specifically, the Auto-GRU model is trained on a set of historical encoded meteorological data and the corresponding PV power (encoded meteorological parameters, PV power). Figure 5 represents the block diagram of the proposed Auto-GRU model. The forecasting process using Auto-GRU can be summarized as follows:
1. The historical weather data is encoded using the encoding side of the AE.
2. The encoded weather data is split into a training set and a testing set. A small percentage of training data is preserved for validation.
3. All the training, validating and testing sets are reorganized into chunks or windows that represent the number of historical previous days (window size).
5. The proposed model is tested on the testing set that was rearranged into chunks of previous samples.
6. Finally, the proposed Auto-GRU model is a powerful ML model that is ready to predict the solar PV power given a chunk of previous meteorological data.
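The windowing in step 3 can be illustrated with a short sketch. It is not the authors' code: the function name `make_windows` is hypothetical, and it assumes that each window of `window_size` consecutive days of encoded weather data is paired with the PV power of the day immediately after the window (a day-ahead target).

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    power: Sequence[float],
    window_size: int = 2,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Group consecutive days into windows and pair each with the next day's power.

    ASSUMPTION: the target of a window covering days [t - window_size, t) is
    the power on day t; the paper does not spell out this alignment.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    if len(features) != len(power):
        raise ValueError("features and power must have the same length")

    windows: List[List[List[float]]] = []
    targets: List[float] = []
    for end in range(window_size, len(features)):
        # Copy the rows so later changes to the input do not alter the windows.
        windows.append([list(row) for row in features[end - window_size:end]])
        targets.append(power[end])
    return windows, targets


if __name__ == "__main__":
    encoded_weather = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
    pv_power = [10.0, 20.0, 30.0, 40.0]
    X, y = make_windows(encoded_weather, pv_power, window_size=2)
    print(len(X), y)
```

The same function would be applied separately to the training, validation and testing sets so that no window straddles two sets.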

Statistical forecasting algorithm
Based on the research by Makridakis et al. [15], we decided to use the Theta model to represent the statistical part of the study. This statistical model is considered to be the most accurate and the simplest model compared to the other statistical models examined in the M3 Competition [15]. Assimakopoulos et al. [19] proposed the Theta model as a decomposition approach for forecasting applications. This model is based on modifying the local curvature of the time series data using a coefficient called Theta ($\theta$) that is applied to the second derivative of the data, as shown in Eq. (12).
The new time series lines are called theta lines and maintain the mean and the slope of the original time series. Moreover, the deflation of the curvature of the new time series depends on the value of the Theta coefficient: to identify the long-term behavior of the time series, the Theta coefficient is set between 0 and 1 ($0 < \theta < 1$). However, when $\theta > 1$ the new theta line is more dilated, which emphasizes the short-term trends. The theta lines are then extrapolated separately and combined to generate the forecasted solar PV power. The authors in [19] decomposed the original time series into two theta lines by setting the Theta coefficient to $\theta = 0$ and $\theta = 2$. The first line ($L(\theta = 0)$) represents the linear regression line of the original time series, magnifying the long-term trends. The second line ($L(\theta = 2)$) doubles the original curvature, magnifying the short-term trends. In this setting, the forecasting process is accomplished by linearly extrapolating the first theta line while extrapolating the second line using simple exponential smoothing (SES). Afterward, the forecasted time series of the two theta lines are simply combined with equal weights, resulting in the final forecast of a specific time series dataset.
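The two-line decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not the `thetaf` implementation: it uses the closed form in which a theta line equals $\theta$ times the series plus $(1 - \theta)$ times its linear regression line, it omits any seasonal adjustment, and the smoothing parameter `alpha` is a fixed assumed value rather than an optimized one. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def linear_fit(series: Sequence[float]) -> Tuple[float, float]:
    """Least-squares intercept and slope of the series against t = 0, 1, ..., n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def theta_forecast(series: Sequence[float], horizon: int, alpha: float = 0.5) -> List[float]:
    """Combine the theta=0 and theta=2 lines with equal weights.

    ASSUMPTION: alpha is fixed here; statistical packages normally estimate it.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    n = len(series)
    intercept, slope = linear_fit(series)
    trend = [intercept + slope * t for t in range(n)]  # theta = 0 line

    # theta = 2 line: doubles the curvature, 2 * X - regression line.
    theta2 = [2 * y - line for y, line in zip(series, trend)]

    # Simple exponential smoothing of the theta = 2 line gives a flat forecast.
    level = theta2[0]
    for value in theta2[1:]:
        level = alpha * value + (1 - alpha) * level

    forecasts = []
    for h in range(1, horizon + 1):
        trend_forecast = intercept + slope * (n - 1 + h)  # linear extrapolation
        forecasts.append(0.5 * trend_forecast + 0.5 * level)
    return forecasts


if __name__ == "__main__":
    print(theta_forecast([3.0, 4.0, 3.5, 5.0, 4.5, 6.0], horizon=2))
```

For a constant series both lines coincide with the series itself, so the forecast stays at that constant, which is a convenient sanity check.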

Ensemble forecasting algorithms
Ensembling prediction models is a key enhancement to prediction accuracy that has attracted the attention of many researchers in recent years. In our study, we propose an ML and statistical hybrid model (MLSHM) that combines the solar PV power predicted by various ML models and a statistical model. Moreover, several ensemble methods were employed to combine the predictions of the different models and generate the final solar PV power prediction. To increase the benefits of aggregating ML models with a statistical method, enforcing diversity between the combined models is an essential procedure. In this research, the diversity in ensemble methods falls into two categories: (i) data diversity, and (ii) structural diversity.
Data diversity is achieved by generating multiple datasets from the original dataset to train the ML models [25,32]. Structural diversity is attained by having different architectures of the prediction models [32]. In our study, we introduced data diversity within the ML models and structural diversity by combining two differently structured algorithms, i.e., ML models and a statistical model. Data diversity was applied to the combined ML models as follows:
1. After the dataset was split into a training set and a testing set, the training set was further divided into $n$ training sub-sets, where in our study $n = 2$.
2. Each ML model is trained on one of the training sub-sets; hence, we have $n$ ML models trained on different sets of data (two ML models in our study, since $n = 2$).
3. All the ML models are tested on the same testing set for comparison purposes to ensure equality of results between the models.
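The data-diversity procedure in the steps above can be sketched as follows. The paper does not state how the training set is divided, so this example assumes contiguous, near-equal chunks in time order; the function name `split_into_subsets` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_subsets(training_set: Sequence[T], n_subsets: int = 2) -> List[List[T]]:
    """Divide the training set into n contiguous, near-equal sub-sets.

    ASSUMPTION: contiguous chunks in time order; random or interleaved
    partitions would be equally valid readings of the description.
    """
    if n_subsets < 1 or n_subsets > len(training_set):
        raise ValueError("n_subsets must be between 1 and the training set size")
    size, remainder = divmod(len(training_set), n_subsets)
    subsets: List[List[T]] = []
    start = 0
    for i in range(n_subsets):
        # The first `remainder` sub-sets receive one extra sample.
        end = start + size + (1 if i < remainder else 0)
        subsets.append(list(training_set[start:end]))
        start = end
    return subsets


if __name__ == "__main__":
    days = list(range(10))
    subset_a, subset_b = split_into_subsets(days, n_subsets=2)
    print(subset_a, subset_b)
```

Each sub-set would then train one ML model, while every model is evaluated on the same held-out testing set.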

In the first stage, the ML models and the statistical models predicted the solar PV power separately, and then we combined these results to get the final forecast. We explored four different ensemble methods to test their effectiveness on the proposed MLSHM. The ensemble methods are described as follows:
1. EN1: simple averaging approach, which is the simplest and the most natural method. It generates the final forecasted solar PV power by taking the mean value of the forecasts produced by the ML models and statistical models. The final solar PV power is generated as follows:
$$\hat{y} = \frac{1}{m}\sum_{j=1}^{m} \hat{y}_j \qquad (13)$$
Here $m$ denotes the number of ensemble members, $m = n + k$, $\hat{y}_j$ represents the predicted solar PV power of ensemble member $j$, and $\hat{y}$ is the final solar PV power forecast.
2. EN2: weighted averaging using a linear approach, where the final solar power is obtained by assigning different weights to the combined models based on their accuracy. Accordingly, a model with a higher error rate is given a lower weight so that it contributes minimally to the final prediction. Conversely, a model with a lower error rate (more accurate) is given a higher weight; therefore, it adds more value to the final forecast. The weights are calculated while ensuring that the weights of all the ensemble models sum to 1. Eq. 14 gives the model weighting equation used in EN2 and Eq. 15 represents the weighted averaging calculation. In these equations, $w_i$ represents the weight given to model $i$, and $\mathrm{nMAE}_i$ is the normalized Mean Absolute Error of model $i$.
3. EN3: weighted averaging using a non-linear approach, which uses the same concept as EN2 but calculates the models' weights as a softmax function of the negative of their error (nMAE):
$$w_i = \frac{\exp(-\mathrm{nMAE}_i)}{\sum_{j=1}^{m}\exp(-\mathrm{nMAE}_j)} \qquad (16)$$
Here exp denotes the exponential function. The final solar power forecast is calculated by Eq. (15).
4. EN4: combination through variance using an inverse approach. This is simply weighted averaging in which the weights come from a variance-based inverse weighting equation. The final prediction of solar PV power is calculated by the weighted averaging as in Eq. 15.
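The four combination schemes can be sketched in Python as below. Only EN1 (plain mean) and EN3 (softmax of the negative nMAE) follow directly from the description; the exact weighting equations for EN2 and EN4 are not reproduced in this text, so the forms used here, weights proportional to $1 - \mathrm{nMAE}_i$ for EN2 and to the inverse of each model's error for EN4, are assumptions chosen only to respect the stated properties (lower error gives higher weight, weights sum to 1). All function names are hypothetical.

```python
import math
from typing import List, Sequence


def en1_weights(errors: Sequence[float]) -> List[float]:
    """EN1: equal weights, i.e. a simple average."""
    return [1.0 / len(errors)] * len(errors)


def en2_weights(nmae: Sequence[float]) -> List[float]:
    """EN2 (ASSUMED form): linear weights proportional to 1 - nMAE."""
    scores = [1.0 - e for e in nmae]
    total = sum(scores)
    return [s / total for s in scores]


def en3_weights(nmae: Sequence[float]) -> List[float]:
    """EN3: softmax of the negative error."""
    exps = [math.exp(-e) for e in nmae]
    total = sum(exps)
    return [v / total for v in exps]


def en4_weights(errors: Sequence[float]) -> List[float]:
    """EN4 (ASSUMED form): weights proportional to the inverse of each error."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine(predictions: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of the member forecasts at every time step."""
    steps = len(predictions[0])
    return [sum(w * p[t] for w, p in zip(weights, predictions)) for t in range(steps)]


if __name__ == "__main__":
    member_nmae = [0.10, 0.20, 0.40]           # illustrative errors, not paper results
    member_forecasts = [[0.5, 0.6], [0.4, 0.7], [0.3, 0.9]]
    for scheme in (en1_weights, en2_weights, en3_weights, en4_weights):
        w = scheme(member_nmae)
        print(scheme.__name__, [round(x, 3) for x in w], combine(member_forecasts, w))
```

With these assumed forms, the inverse weighting separates accurate and inaccurate members more sharply than the linear or softmax weights when errors are small.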

Experimental results and discussions
In this section, we present a comprehensive experimental study to evaluate the proposed methods in a solar PV power forecasting application. We compared a new ML method (Auto-GRU) with other ML algorithms (GRU [30], LSTM [17], and Auto-LSTM [6]) and a statistical method (Theta model [19]) in terms of accuracy. Four different ensemble methods were used to build the MLSHM, and their accuracy is compared with that of the traditional methods.
The proposed algorithms were tested on different solar PV systems starting from a single PV panel to large-scale PV farms.

Dataset description
The i.e., five-minute, hourly, daily, and monthly. In this research we conducted the experiments on five-minute resolution data. Marion et al. [34] collected data for different technologies of PV modules at three climatically diverse locations in the USA (Cocoa, Florida; Eugene, Oregon; and Golden, Colorado). The time series data are available in five-minute resolution from January 21, 2011, to March 4, 2012. We conducted our experiments on one sample of the USA dataset and tested our proposed methods on a single Poly-SI module from Cocoa, Florida. The dataset consists of both the total power produced by the Poly-SI module (W) and the weather data gathered from the site, such as solar irradiance (W/m²), ambient temperature (°C), humidity (%RH), and precipitation (mm).
As data collected from outdoor data acquisition devices can suffer from incorrect and missing sensor readings, data preprocessing is an essential procedure that cleans the data before verifying the effectiveness of the proposed methods. In this research, all time-series datasets were cleaned of missing, undesired, duplicated, and incorrect readings. The data were then grouped into daily resolution and normalized between 0 and 1 using min-max normalization in order to decrease the training time and facilitate the comparison of PV plants of different sizes. Afterward, the dataset was divided into a training set and a testing set consisting of 90% and 10% of the data, respectively. Ten percent of the training data was preserved for validation. Table 1 summarizes the time series datasets used in this paper.
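The normalization and splitting steps can be sketched as follows. This is an illustration under assumptions: the split is taken in chronological order (the paper gives only the percentages), and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to the range [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def chronological_split(
    samples: Sequence[float],
    test_fraction: float = 0.10,
    val_fraction: float = 0.10,
) -> Tuple[List[float], List[float], List[float]]:
    """Split into train / validation / test, keeping time order.

    ASSUMPTION: the last 10% of the data is the testing set and the last 10%
    of the remaining training data is the validation set.
    """
    n_test = round(len(samples) * test_fraction)
    split_at = len(samples) - n_test
    train_full, test = samples[:split_at], samples[split_at:]
    n_val = round(len(train_full) * val_fraction)
    val_at = len(train_full) - n_val
    return list(train_full[:val_at]), list(train_full[val_at:]), list(test)


if __name__ == "__main__":
    daily_power = [float(v) for v in range(100)]  # illustrative values only
    scaled = min_max_normalize(daily_power)
    train, val, test = chronological_split(scaled)
    print(len(train), len(val), len(test))
```

In practice the minimum and maximum used for scaling are often computed on the training portion only, to avoid leaking information from the testing set; the paper does not say which variant was used.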

Design of experiments
To evaluate and verify the effectiveness of the proposed ML algorithm (Auto-GRU) and MLSHM, we conducted several experimental simulations to forecast day-ahead solar power generation. All ML algorithms and ensemble methods were implemented using the Keras API [35] for Python v3.7 along with the TensorFlow framework v1.10 [36]. The Theta model was constructed using the thetaf function of the forecast R statistical package. Furthermore, we implemented the GRU, LSTM, Auto-GRU, and Auto-LSTM models with a 3-layer topology, that is, one input layer, one hidden layer with 20 neurons, and one output layer. The selection of 20 neurons in the hidden layer is based on testing different configurations and selecting the one which provided the best accuracy. Accordingly, the input layer represents the meteorological data while the output layer represents the forecasted solar PV power. The hyperbolic tangent (tanh) was added as an activation function to introduce non-linearity in the network. The models were fitted using the RMSProp optimization algorithm and the mean squared error loss function. In addition, various hyper-parameters were used to fit the ML models in order to carry out several experimental scenarios. These experimental configurations were used to fine-tune the Auto-GRU model on the three datasets (Shagaya Poly-SI, Shagaya TFSC, and Cocoa single Poly-SI). The experiments were run with the following parameter settings: a window size of 2, weight updates after 20 batches, and 50 training epochs.
In this research, we used two popular evaluation measures to compare the accuracy of the predicted normalized power from the Auto-GRU model and MLSHM: the normalized Mean Absolute Error (nMAE), as shown in Eq. 18, and the normalized Mean Square Error (nMSE), as shown in Eq. 19. Here, $y$ is the normalized actual power, $\hat{y}$ is the normalized predicted power, and $N$ is the number of time series samples.
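The two evaluation measures can be sketched as plain Python functions. The sketch assumes that nMAE and nMSE take the usual mean absolute error and mean squared error forms, computed on the min-max normalized actual and predicted power; the function names are hypothetical.

```python
from typing import Sequence


def nmae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between normalized actual and predicted power."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def nmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between normalized actual and predicted power."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    y_true = [0.0, 1.0]   # illustrative normalized values only
    y_pred = [0.0, 0.5]
    print(nmae(y_true, y_pred), nmse(y_true, y_pred))
```

Because squaring penalizes large deviations more heavily, nMSE is more sensitive than nMAE to occasional large misses, such as days with abrupt cloud cover.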

Experimental results
As mentioned earlier, four ensemble methods (EN1, EN2, EN3, and EN4) were implemented to combine the solar PV power predicted by the different ML methods and the Theta model. Various combinations were applied to comprehensively evaluate the proposed MLSHM. Hence, the Theta model consolidated the performance and the accuracy of the forecasted solar PV power.

Analysis of results
By examining Tables 2 and 3, it can be seen that MLSHM is a powerful hybrid model that outperforms all other hybrid models and traditional single models tested. Moreover, MLSHM is more accurate than MLHM, which means that the Theta model has significantly improved the accuracy of the prediction. We have highlighted the lowest nMAE and nMSE in bold, as shown in Tables 2 and 3. The lowest nMSE for Shagaya Poly-SI is 0.0197 using EN4 for T-MLSHM, which is 18.93% better than the GRU model, as shown in Table 2. Furthermore, the power prediction for the Shagaya TFSC farm attained an nMSE of 0.00185 using EN4 for T-MLSHM, a 36.21% improvement over the GRU model. Table 4 highlights the lowest nMSE for the Cocoa single Poly-SI dataset, which reached 0.0168 for EN4 of T-MLSHM, a 4.55% reduction in error rate. Therefore, T-MLSHM using EN4 was the most accurate ensemble method, since it increased the accuracy of solar power prediction by between 4.55% and 36.21% over the single traditional models. On the other hand, Auto-MLSHM using EN4 increased the accuracy of the solar power prediction by between 8.15% and 28.90% compared to the Auto-GRU model. One reason for this is that EN4 distributes the weights efficiently: the most accurate model has the greatest weight, and that weight is not close in value to the other weights. Among the single prediction models, GRU achieved the best accuracy over the other traditional ML models and the Theta statistical model. Figure 8 shows the actual solar PV power generation compared to the solar PV power predicted by the different models tested in this study on the three datasets: Shagaya Poly-SI, Shagaya TFSC, and Cocoa single Poly-SI, respectively. We can see that the prediction models perform better on the Shagaya dataset than on the Cocoa dataset because it contains more relevant weather data that affects the performance of a PV model. The Cocoa dataset represents the power of a single Poly-SI module, which appeared to be unstable. In contrast, the Shagaya dataset represents the power of a large-scale solar PV farm, which is more stable. Hence, prediction for a single PV module is more difficult than prediction for a large-scale PV farm, since a single PV panel can easily be affected by a passing cloud. We can conclude that the proposed hybrid model (MLSHM) is a powerful, efficient, valuable, and accurate prediction model.
Table 4. Prediction performance on nMAE and nMSE of Cocoa single Poly-SI dataset.

We can summarize the results obtained during this research as follows:
Hybrid models outperform all tested traditional ML models and the Theta statistical model.
Hybrid models with the Theta model (MLSHM) obtained better accuracy than hybrid models without the Theta statistical model (MLHM).
All ensemble methods (EN1, EN2, EN3, and EN4) achieved better accuracy than any single ML algorithm and the Theta model.
Almost all ensemble methods achieved similar accuracies; however, EN4 was found to be the most accurate ensemble method used in this research.
The GRU model is the most accurate ML model, followed by Auto-GRU, LSTM, and Auto-LSTM.
ML models performed better than the Theta statistical model in predicting solar power generation.

Conclusions
Integrating large-scale PV plants into the power grid poses considerable problems and challenges to the electric operators, as it causes instability in the electric grid, requiring the operators to balance electrical consumption and power generation in order to avoid wasting energy. Therefore, an accurate solar power forecast is a fundamental requirement for the future of renewable energy plants. In this research, we proposed a hybrid model (MLSHM) that combines the prediction results of both ML models and a statistical method. For our study we developed a new ML model, Auto-GRU, that learns from historical time series data to predict the desired solar PV power. In order to boost the hybrid model, two diversity techniques were applied in this study, i.e., structural diversity between the ensemble members and data diversity between the training sets of the ML models. Four different combination methods were used to combine the predictions of the ML models and the statistical method. The proposed hybrid model and the Auto-GRU model were tested on two real time series datasets of solar PV power and weather data collected from Shagaya, located in Kuwait [33], and Cocoa, Florida, USA [34]. The experiments allow us to conclude that a hybrid model combining the predictions of ML models and a statistical method obtains higher accuracy than a hybrid model combining the predictions of ML models without a statistical method. Our future work is to test and validate the hybrid model with other ML models and statistical methods. Moreover, we will try other diversity techniques, such as dividing the training set by parameters, as well as testing both data and parameter diversity. In addition, we plan to develop other ensemble techniques to boost the accuracy of the prediction.