Forecasting the Finnish house price returns and volatility: a comparison of time series models

Purpose – The purpose of this paper is to compare different models’ performance in modelling and forecasting the Finnish house price returns and volatility. Design/methodology/approach – The competing models are the autoregressive moving average (ARMA) model and autoregressive fractional integrated moving average (ARFIMA) model for house price returns. For house price volatility, the exponential generalized autoregressive conditional heteroscedasticity (EGARCH) model is competing with the fractional integrated GARCH (FIGARCH) and component GARCH (CGARCH) models. Findings – Results reveal that, for modelling Finnish house price returns, the data set under study drives the performance of ARMA or ARFIMA model. The EGARCH model stands as the leading model for Finnish house price volatility modelling. The long memory models (ARFIMA, CGARCH and FIGARCH) provide superior out-of-sample forecasts for house price returns and volatility; they outperform their short memory counterparts in most regions. Additionally, the models’ in-sample fit performances vary from region to region, while in some areas, the models manifest a geographical pattern in their out-of-sample forecasting performances. Research limitations/implications – The research results have vital implications, namely, for portfolio allocation, investment risk assessment and decision-making. Originality/value – To the best of the author’s knowledge, for Finland, there has yet to be empirical forecasting of either house price returns or volatility. Therefore, this study aims to bridge that gap by comparing different models’ performance in modelling, as well as forecasting, the house price returns and volatility of the studied market.


Introduction
Forecasting house price returns and volatility is vital for numerous parties such as consumers, policymakers, investors and risk managers. There are several reasons. Firstly, housing assets play a dual role of investment and consumption; thus, accurate forecasting of house price dynamics plays a crucial role in asset allocation and investment decision-making. Secondly, housing is a substantial component of a country's economy. Notably, in Finland, over half of the households' total wealth (50.3%) is in the form of housing (Statistics Finland, 2016). In the USA, housing is the largest component of household wealth; it represented 28.3% of total household net worth and 24.6% of household assets (Financial Accounts Data, 2018). In the UK, Savills (2019) estimated the total value of the housing stock at £7.29tn, highlighting the essential part that housing and its market play in the sustainability of the economy. Thirdly, housing affects the country's economy by influencing many parties involved in housing and mortgage activities. Therefore, accurate house price forecasting would benefit consumers and mortgage parties (Segnon et al., 2020). Last, insights into house price dynamics provide recommendations to housing policymakers and are fundamental inputs in outlining housing plans and policies, as stressed by Zhou and Haurin (2010).
Given the importance of the housing market, house price analysis of individual markets has been the subject of an increasing number of studies. However, the focus has been on a restricted number of countries, namely, the USA, UK, Canada and Australia (Apergis and Payne, 2020). For Finland, even though over half of the households' total wealth is in the form of housing, as reported by Statistics Finland (2016), there has yet to be empirical forecasting of either house price returns or volatility. Therefore, this study aims to bridge that gap by comparing different models' performance in modelling as well as forecasting the house price returns and volatility. It thereby identifies the most accurate model for modelling and forecasting the Finnish housing market and extends the ongoing literature on the analysis of the housing markets of various countries.
The purpose of the study is to find the most suitable and accurate model for Finnish house price returns and volatility modelling and forecasting. The number of rooms is used to categorise the studied dwellings, that is, one-room, two-room and larger (over three rooms) apartments. The 15 studied regions are divided into 45 cities and sub-areas following their Zone Improvement Plan (ZIP)-code or postcode numbers. The competing models for house price returns are the autoregressive moving average (ARMA) model and the autoregressive fractional integrated moving average (ARFIMA) model. The competing models for house price volatility are the exponential GARCH (EGARCH) model, the fractionally integrated GARCH (FIGARCH) model and the component GARCH (CGARCH) model. The choice of models derives from the outcomes of Dufitinema and Pynnönen's (2020) and Dufitinema's (2020) studies. After testing for ARCH effects, the former article found evidence of long-range dependence in the house price returns and volatility for a large number of the Finnish cities and sub-areas. The latter article used the EGARCH model and found that an asymmetric impact of shocks on housing volatility was recorded in nearly all the Finnish cities and sub-markets. Therefore, to develop time-series models suitable for this housing market forecasting exercise, for cities and sub-areas with no ARCH effects, the forecasting performances of the short memory ARMA model and the long memory ARFIMA model are compared. For cities and sub-areas with substantial clustering effects, the forecasting performance of a short memory GARCH model, in this case the EGARCH model, is weighed against that of the GARCH models that accommodate long memory in the conditional variance, namely, the FIGARCH and CGARCH models. To assess the models' out-of-sample forecasting performances, the data is split into training and test sets. The former set is used to estimate the model and build predictions; the latter is used to evaluate the forecasts produced by the model.
Results reveal that, for the in-sample fit examination, the house price return series under study drives the models' performance, while the EGARCH model is the best-ranked model for house price volatility modelling. In the out-of-sample exercise, the long memory models outclass their short memory peers in forecasting the house price returns and volatility. Furthermore, previous studies used family-home property type data sets; the article at hand, however, uses apartment (also referred to as block of flats) type data. The number of rooms categorises the studied dwellings: one-room, two-room and larger (over three rooms) apartment types. The reasons for using flats property type data are their fast-growing popularity as a place to live in Finland and their increased attractiveness to both consumers and investors. At the end of 2018, Statistics Finland Overview reported that apartments accounted for nearly half of all occupied dwellings (46%). Detached and semi-detached houses were the second most favoured house type, with 39%, followed by terraced houses with 14%. Regarding the investment aspect, apartments continue to strengthen their position in the Finnish residential property market, with foreign, domestic and individual investors continuing to increase their portfolios across the country (KTI, 2019). In addition, from the same viewpoint of housing investment and portfolio allocation, this analysis uses metropolitan as well as ZIP-code level data for cross-examination and comparison of housing investment on the city and sub-market levels.

Data and methodology
The data used in this study are quarterly house price indices, retrieved from Statistics Finland's PxWeb databases (2020). The number of rooms categorises the studied types of dwellings: one-room, two-room and larger (over three rooms) apartment types. The considered period spans from the first quarter (Q1) of 1988 to the fourth quarter (Q4) of 2018 and the 15 considered regions are Helsinki, Oulu, Tampere, Lahti, Pori, Turku, Seinäjoki, Jyväskylä, Lappeenranta, Kuopio, Hämeenlinna, Vaasa, Kotka, Joensuu and Kouvola. The regions of Helsinki, Turku and Tampere form an important and growing area, called the growth triangle in Southern Finland. Currently, the area accounts for 49% of the Finnish population and 55.5% of total gross domestic product (GDP). The Oulu region, called the Northern Finland growth centre, is also amongst the well-performing regions, with substantial economic development and population growth. The other regions also show significant expansion and economic performance. These regions are then divided into 45 cities and sub-areas according to their ZIP-code or postcode numbers. Dufitinema (2020) details the regions' ranking and division: regions are ranked by number of inhabitants and divided by postcode numbers.
The methodology used in this study is an extension of Dufitinema's (2020). That is, house price indices are transformed into log-returns. The process is done for each city and sub-area in every apartment type. Next, first-order autocorrelations are filtered out from the returns. The task is done by determining the appropriate order of the ARMA model using the Akaike and Bayesian information criteria (respectively, AIC and BIC). Then, from the transformed returns, ARCH effects are tested. Thereafter, the current study extends this methodology by examining the ARMA and ARFIMA models' forecasting performances for cities and sub-areas with no substantial ARCH effects. The EGARCH model's forecasting abilities are compared to the FIGARCH and CGARCH models for cities and sub-areas with substantial clustering effects.
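The first and last steps of this pipeline can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the index values are hypothetical, the ARMA filtering step is skipped, and the Ljung-Box Q statistic on squared returns stands in for the clustering check (the p-value, which requires a chi-square distribution, is omitted).

```python
import math

def log_returns(index):
    """Log-returns r_t = ln(P_t) - ln(P_{t-1}) of a price index."""
    return [math.log(b) - math.log(a) for a, b in zip(index, index[1:])]

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_k rho_k^2 / (n - k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# Hypothetical quarterly index levels (illustration only, not the study's data).
index = [100.0, 102.0, 101.0, 105.0, 104.0, 110.0, 103.0, 108.0, 112.0, 109.0]
returns = log_returns(index)
squared = [r * r for r in returns]

# A large Q on squared returns points to volatility clustering (ARCH effects).
q_stat = ljung_box_q(squared, max_lag=4)
print(f"{len(returns)} returns, Ljung-Box Q(4) on squared returns = {q_stat:.3f}")
```

In the study itself the test is applied to the ARMA-filtered returns, so a faithful replication would fit the mean model first and pass its residuals to the test.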
Regarding testing for ARCH effects, details are given and results are described in Dufitinema (2020). In a nutshell, both tests used, the Lagrange Multiplier (LM) and Ljung-Box (LB) tests, found, in all three considered types of apartments, that clustering effects were significant in the majority of the cities/sub-areas. Specifically, evidence of clustering effects was found in 28 out of 38 cities/sub-areas in the one-room flats category, and in 27 out of 42 and 31 out of 39 in, respectively, the two-room and larger (over three rooms) flats categories. Moreover, because short memory and long memory time series models are compared in forecasting the house price dynamics of the considered types of dwellings, we make use of Dufitinema and Pynnönen's (2020) study outcomes. The results summary is as follows: in those cities/sub-areas with no significant clustering effects, in the one-room apartment type category, 8 out of 10 exhibited long memory behaviour, meaning that their Geweke and Porter-Hudak (1983) (GPH) estimates of the fractional differencing parameter $d$ varied from 0 to 0.5. The remaining two return series were anti-persistent [$d \in (-0.5, 0)$]. In both the two-room and larger (over three rooms) apartment categories, one sub-area displayed anti-persistent behaviour, while the remaining 14 and 7 return series exhibited long-range dependence behaviour in the respective groups. These results are used as hyperparameters of the ARFIMA models in the estimation procedure.
The same applies to Dufitinema and Pynnönen's (2020) findings on the long-range dependence in those cities/sub-areas with substantial ARCH effects. In squared as well as absolute house price returns, in all three apartment types, the fractional differencing parameter $d$ was estimated and the outcomes indicated a very persistent long memory behaviour in the house price volatility. The results for both metrics are used as hyperparameters of the FIGARCH models in the estimation procedure and the best model is assessed based on different model selection tools. This approach of tuning the parameter $d$, that is, estimating the long memory parameter first and obtaining the other parameter estimates conditional on these $d$ estimates, is at the core of most semiparametric estimation approaches (Lopes and Mendes, 2006; Härdle and Mungo, 2008). Furthermore, as pointed out by different researchers such as Tsay (2013), when GARCH-type models are used to assess asset returns, an assumption of a normal distribution is not tenable. An appropriate distribution must accommodate asset return characteristics, for instance, skewness and fat tails. Therefore, based on AIC and BIC, an appropriate distribution is selected, for each city and sub-area in every apartment type, amongst the following univariate distributions: Student t ("Std"), Generalised Error ("GED") and their skew variants ("sStd" and "sGED").
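As an illustration of how such an information-criterion comparison works, the sketch below ranks the four candidate distributions by AIC and BIC. The log-likelihoods, parameter counts and sample size are hypothetical placeholders, not estimates from the study; only the formulas AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L are standard.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fits: label -> (maximised log-likelihood, number of parameters).
fits = {
    "Std": (250.0, 5),
    "GED": (251.0, 5),
    "sStd": (251.5, 6),
    "sGED": (253.0, 6),
}
n_obs = 100  # hypothetical sample size

best_aic = min(fits, key=lambda name: aic(*fits[name]))
best_bic = min(fits, key=lambda name: bic(*fits[name], n_obs))
print("AIC picks", best_aic, "and BIC picks", best_bic)
```

With these made-up numbers the two criteria disagree, because BIC penalises the extra skewness parameter more heavily than AIC does; how such ties are resolved in the study is not stated in the text.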
3.1 Models for forecasting house price returns
House price returns are predicted for cities/sub-areas with no substantial clustering effects, meaning those regions with both constant mean and variance. The types of models tested relate to this constant mean/variance specification of the series. The ARMA models fulfil this property; however, they do not capture the long-memory behaviour that house price returns of these cities/sub-areas exhibit. Therefore, their forecasting performances are compared to the models that accommodate the high persistence present in the returns series; those are ARFIMA models.
3.1.1 Autoregressive moving average model. ARMA models have been a leading approach to modelling and forecasting in numerous areas of finance and economics. In the housing market, we refer to Jadevicius and Huston (2015) and the references therein. Jadevicius and Huston assess the ARMA's application for forecasting the Lithuanian housing market in particular and extend their findings to the global housing market. The ARMA model is a combination of AR and MA processes (Box et al., 1994). Its standard specification is as follows:

$$r_t = \phi_0 + \sum_{i=1}^{p} \phi_i r_{t-i} + a_t - \sum_{i=1}^{q} \theta_i a_{t-i},$$

where $\sum_{i=1}^{p} \phi_i r_{t-i}$ represents the AR portion of the model and $\sum_{i=1}^{q} \theta_i a_{t-i}$ represents the model's MA portion; $\phi_0$ is a constant and $a_t$ is a white noise series. By assumption, $r_t$ is stationary, for a correct specification of the ARMA model; otherwise, the series has a unit root and it is termed an AR Integrated MA (ARIMA) process. However, Dufitinema and Pynnönen (2020) have conducted unit root tests on the studied house price returns and concluded that the null hypothesis of a unit root was rejected at least at the 5% level in all return series, in all three apartment types. Hence, stationarity was ensured across all cities and sub-areas, in all apartment types.

3.1.2 Autoregressive fractional integrated moving average model. ARFIMA models are the extension of the ARIMA models to accommodate the long-memory behaviour of a time series. They were independently put forward by Granger and Joyeux (1980) and Hosking (1981). The standard specification of an ARFIMA model is as follows:

$$\Phi(L)(1 - L)^d Y_t = \Theta(L)\,\varepsilon_t,$$

where $Y_t$ denotes the discrete-valued studied time series, $d$ is the fractional differencing parameter and $\varepsilon_t$ is a white noise with $E(\varepsilon_t) = 0$ and variance $\sigma_\varepsilon^2$. $L$ is the lag operator or back-shift operator such that $L Y_t = Y_{t-1}$. $\Phi(L)$ and $\Theta(L)$ are the AR and MA polynomials in the lag operator, respectively. That is, $\Phi(L) = 1 - \sum_{i=1}^{p} \phi_i L^i$ and $\Theta(L) = 1 - \sum_{i=1}^{q} \theta_i L^i$. The value of $d$, the long memory parameter, dictates the properties and the interpretations of the ARFIMA model.
If $d = 0$, ARFIMA reduces to ARMA and the process is stated to exhibit short memory. If $d \in (-0.5, 0)$, it is characterised as anti-persistence or long-range negative dependence. The process is said to manifest long memory or long-range positive dependence if $d \in (0, 0.5)$ and it is non-stationary with mean reversion if $d \in [0.5, 1)$, whereas it becomes non-stationary without mean reversion if $d \geq 1$.
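To make the role of $d$ concrete, the following sketch expands the fractional differencing operator as $(1 - L)^d = \sum_k \pi_k L^k$ using the standard recursion $\pi_0 = 1$, $\pi_k = \pi_{k-1}(k - 1 - d)/k$, and classifies a given $d$ according to the ranges listed above. The function names are illustrative and do not come from the paper.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms coefficients pi_k of (1 - L)^d = sum_k pi_k L^k."""
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

def memory_regime(d):
    """Label the behaviour implied by the fractional differencing parameter d."""
    if d == 0:
        return "short memory"
    if -0.5 < d < 0:
        return "anti-persistent"
    if 0 < d < 0.5:
        return "long memory"
    if 0.5 <= d < 1:
        return "non-stationary, mean-reverting"
    if d >= 1:
        return "non-stationary, not mean-reverting"
    return "outside the ranges discussed"

for d in (0.0, 0.4, 1.0):
    print(d, memory_regime(d), frac_diff_weights(d, 4))
```

For $d = 1$ the weights collapse to those of the ordinary first difference $1 - L$, while for fractional $d$ they decay slowly rather than cutting off, which is what lets distant observations keep influencing the present value.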

Models for forecasting house price volatility
For regions with time-varying variance, meaning those cities and sub-areas with evidence of ARCH effects, GARCH-type models are used to forecast house price volatility. Motivated by the persistence or long memory behaviour found in these cities/sub-areas' house price volatility, short memory GARCH models are compared to the GARCH models that accommodate the long memory property. The EGARCH model is selected amongst the short memory GARCH models, over the standard GARCH. The grounds for the EGARCH selection are the evidence of asymmetric effects of shocks on housing volatility recorded in the studied types of dwellings and its effective performance over Glosten et al.'s (1993) GJR-GARCH model in modelling the studied house prices' asymmetric volatility (Dufitinema, 2020). Amongst the GARCH models that accommodate long memory in the assets' conditional variance, the selected ones are the FIGARCH and CGARCH models. The FIGARCH model allows a slower, hyperbolic rate of decay of shocks, making it a strong candidate for explaining and capturing the high degree of autocorrelation in financial market volatility. The CGARCH model investigates the long- and short-run movements of the conditional variance by decomposing it into permanent and transitory components. Both models have been applied more often of late compared with, for instance, the Integrated GARCH (IGARCH) model (Engle and Bollerslev, 1986). The reason is that Tayefi and Ramanathan (2012) have found the IGARCH model to be too restrictive, as it implies infinite persistence in the conditional variance and, consequently, shocks persist forever. There is an extensive collection of studies on the applicability of FIGARCH and CGARCH to modelling and/or forecasting the volatility of different assets. In the housing markets, Milles (2011) used the CGARCH model to investigate whether there is long-range dependence in the US home price volatility.
The author found that housing markets of over half of the US metropolitan areas exhibited persistent volatility. For those regions, the CGARCH model provided better forecasts than the standard GARCH model. Milles's choice of the CGARCH was based on Maheu's (2005) Monte Carlo study, which showed that the CGARCH captured long-range dependence better than FIGARCH in equity markets. On the other hand, Feng and Baohua (2015) discovered that the FIGARCH model could capture well the long memory of the Zhengzhou house price volatility. To that end, and as a cross-check of the models, this article uses both FIGARCH and CGARCH models to forecast house price volatility of the considered types of dwellings.
3.2.1 Exponential generalised autoregressive conditional heteroscedasticity model. Let $R_t$ denote the asset log-return at time $t$. The standard form of the conditional volatility model is as follows:

$$R_t = \mu_t + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t,$$

where $\mu_t$ is the conditional mean, $\sigma_t$ is the conditional standard deviation, $\varepsilon_t$ is the error term and $z_t$ is an i.i.d. innovation with zero mean and unit variance. Given that many financial assets exhibit volatility clustering, instead of modelling the variance of the innovation $\varepsilon_t$ as a constant, Bollerslev (1986) proposed a GARCH process where the conditional variance $\sigma_t^2$ is a function of past volatility and previous squared errors. That is,

$$\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2, \tag{1}$$

where $\omega > 0$ is the intercept, $\alpha_i \geq 0$ (coefficients of $\varepsilon_{t-i}^2$) and $\beta_j \geq 0$ (coefficients of $\sigma_{t-j}^2$) are referred to, respectively, as the ARCH and GARCH parameters. To investigate the potential asymmetric effects of shocks on conditional variance, Nelson (1991) proposed the EGARCH model. The model enables negative shocks to have a different impact on conditional variance than positive shocks, an observation which is termed the leverage effect. In its specification, $\alpha_i$ and $\alpha_i + \gamma_i$ indicate, respectively, the effects of good and bad news. $I_{t-i}$ is the indicator function, which equals one if $\varepsilon_{t-i} < 0$ and zero otherwise. This implies a more sizable influence $(\alpha_i + \gamma_i)\varepsilon_{t-i}^2$, with $\gamma_i > 0$, of a negative shock $\varepsilon_{t-i}$, while a positive shock $\varepsilon_{t-i}$ has the smaller influence $\alpha_i \varepsilon_{t-i}^2$ on $\sigma_t^2$.

3.2.2 Fractionally integrated generalised autoregressive conditional heteroscedasticity model. The evidence of slow decay in correlations of squared and absolute returns of financial assets gave rise to the FIGARCH model, first introduced by Baillie et al. (1996). The model adds fractional differencing to the standard GARCH process, thereby explaining and capturing the high degree of autocorrelation in financial market volatility.
The GARCH process in equation (1) can be written as:

$$\sigma_t^2 = \omega + \alpha(B)\varepsilon_t^2 + \beta(B)\sigma_t^2,$$

where $B$ is the back-shift operator and $\alpha(B)$ and $\beta(B)$ are the lag polynomials of the ARCH and GARCH parameters. Its equivalent ARMA type representation is given by:

$$[1 - \alpha(B) - \beta(B)]\,\varepsilon_t^2 = \omega + [1 - \beta(B)]\,u_t,$$

where $u_t = \varepsilon_t^2 - \sigma_t^2$. From this formulation, Engle and Bollerslev (1986) presented the IGARCH model by allowing the presence of a unit root in $1 - \alpha(B) - \beta(B)$ as follows:

$$\phi(B)(1 - B)\,\varepsilon_t^2 = \omega + [1 - \beta(B)]\,u_t, \tag{2}$$

where $\phi(B) = [1 - \alpha(B) - \beta(B)](1 - B)^{-1}$. However, as discussed above, the IGARCH model is too restrictive as shocks persist forever. Hence the introduction of the FIGARCH model, where the fractional differencing operator $(1 - B)^d$ replaces the first-difference operator $(1 - B)$ of equation (2). The general form of the FIGARCH model is as follows:

$$\phi(B)(1 - B)^d\,\varepsilon_t^2 = \omega + [1 - \beta(B)]\,u_t.$$

If $d = 0$, the FIGARCH model reduces to the standard GARCH, while if $d = 1$, it turns into an IGARCH model.

3.2.3 Component generalised autoregressive conditional heteroscedasticity model. Lee and Engle (1999) developed the CGARCH model by decomposing the conditional variance into permanent and transitory components, thereby investigating the long- and short-run volatility movements. Unlike in the GARCH process, where the conditional variance reverts to a long-run constant mean $\omega$ in equation (1), the CGARCH model allows a time-varying mean reversion of the conditional variance. Its specification is as follows:

$$\sigma_t^2 - q_t = \alpha(\varepsilon_{t-1}^2 - q_{t-1}) + \beta(\sigma_{t-1}^2 - q_{t-1}), \tag{3}$$

$$q_t = \omega + \rho\, q_{t-1} + \phi(\varepsilon_{t-1}^2 - \sigma_{t-1}^2). \tag{4}$$

Equation (4) represents the long-run (permanent) component of the volatility, that is, the time-varying mean reversion of the conditional variance. It describes how the GARCH model's intercept is now time-varying, following first-order autoregressive type dynamics, and thus captures the long memory portion of volatility. Equation (3) describes the short-term (transitory) component of the volatility, which is the difference between the conditional variance and its trend ($\sigma_t^2 - q_t$). To ensure stationarity, the sum of the ($\alpha$, $\beta$) coefficients must be less than 1 and $\rho < 1$ for the persistence of the transitory and permanent components, respectively. If $\rho = \phi = 0$, the CGARCH model reduces to the standard GARCH.
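The reduction of the CGARCH to the standard GARCH can be checked numerically. The sketch below assumes the usual first-order component form, $q_t = \omega + \rho q_{t-1} + \phi(\varepsilon_{t-1}^2 - \sigma_{t-1}^2)$ and $\sigma_t^2 = q_t + \alpha(\varepsilon_{t-1}^2 - q_{t-1}) + \beta(\sigma_{t-1}^2 - q_{t-1})$, with hypothetical shocks and parameter values; it is an illustration, not the estimation code used in the study.

```python
def garch11_path(eps, omega, alpha, beta, sigma2_0):
    """Conditional variances of a GARCH(1,1) given shocks eps and a start value."""
    sigma2 = [sigma2_0]
    for t in range(1, len(eps)):
        sigma2.append(omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2

def cgarch11_path(eps, omega, alpha, beta, rho, phi, sigma2_0, q_0):
    """Conditional variances and permanent component of a first-order CGARCH."""
    sigma2, q = [sigma2_0], [q_0]
    for t in range(1, len(eps)):
        e2 = eps[t - 1] ** 2
        # Permanent (long-run) component: a time-varying intercept.
        q_t = omega + rho * q[t - 1] + phi * (e2 - sigma2[t - 1])
        # Transitory component added on top of the permanent one.
        s_t = q_t + alpha * (e2 - q[t - 1]) + beta * (sigma2[t - 1] - q[t - 1])
        q.append(q_t)
        sigma2.append(s_t)
    return sigma2, q

# Hypothetical shocks and parameters (illustration only).
eps = [0.1, -0.3, 0.2, -0.1, 0.4]
omega, alpha, beta = 0.04, 0.1, 0.8

# With rho = phi = 0 the permanent component is constant at omega ...
cg_sigma2, cg_q = cgarch11_path(eps, omega, alpha, beta, 0.0, 0.0, 0.05, omega)
# ... and the variance path matches a GARCH(1,1) with intercept omega*(1-alpha-beta).
g_sigma2 = garch11_path(eps, omega * (1 - alpha - beta), alpha, beta, 0.05)
print(max(abs(a - b) for a, b in zip(cg_sigma2, g_sigma2)))
```

Setting $\rho$ close to one instead makes $q_t$ move slowly, which is how the component structure mimics long memory in volatility.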

Forecast evaluation
To test and compare the prediction abilities of the above-mentioned models, the data is divided into a training set and a test set. The training set, which consists of 25 years of sample data, is used to build the models (estimation sample: 1988:Q1-2013:Q4). The test set is used to evaluate the models' predictive accuracy; it consists of 5 years of sample data (forecasting sample: 2014:Q1-2018:Q4). The forecasting process starts by estimating each model on the training data set. Thereafter, the one-step-ahead (one-quarter-ahead) volatility forecasts are built using the estimated model. Finally, the predicted volatility ($\hat{\sigma}^2$) and the proxy of the true volatility ($\sigma^2$) are compared.
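The split and the one-step-ahead loop can be sketched as follows. The quarter labels follow the sample periods stated above; the placeholder values and the naive forecaster in the usage example are assumptions for illustration, standing in for the data and for an estimated ARMA/ARFIMA or GARCH-type model.

```python
def quarter_labels(start_year, end_year):
    """Quarter labels such as '1988:Q1' for every quarter in the year range."""
    return [f"{y}:Q{q}" for y in range(start_year, end_year + 1) for q in range(1, 5)]

def split_sample(labels, values, last_train_label):
    """Split labels and values after the last training quarter."""
    cut = labels.index(last_train_label) + 1
    return (labels[:cut], values[:cut]), (labels[cut:], values[cut:])

def one_step_ahead(history, test, forecast_fn):
    """Forecast each test point from the data available just before it."""
    past = list(history)
    forecasts = []
    for actual in test:
        forecasts.append(forecast_fn(past))
        past.append(actual)  # the realised value is known before the next forecast
    return forecasts

labels = quarter_labels(1988, 2018)
values = [0.0] * len(labels)  # placeholder series, not the study's data
(train_labels, train_values), (test_labels, test_values) = split_sample(
    labels, values, "2013:Q4"
)
print(len(train_labels), "training quarters,", len(test_labels), "test quarters")

# Usage with a naive "last value" forecaster as a stand-in for a fitted model.
print(one_step_ahead([1.0, 2.0], [3.0, 4.0, 5.0], lambda past: past[-1]))
```

Whether the model parameters are re-estimated as the window expands is a design choice; the text describes forecasts built from the model estimated on the training set, which corresponds to keeping `forecast_fn` fixed.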
When evaluating volatility forecasts, one has to deal with the problem that the true volatility $\sigma^2$ is unobserved. Various studies have proposed proxies for $\sigma^2$, such as the squared returns (Brooks and Persands, 2002; Sadorsky, 2006). Patton (2011) argued that squared returns are a rather noisy proxy for the true conditional variance and that a conditionally unbiased estimator of the conditional variance, the realised volatility (RV), is a more efficient estimator than the squared returns. Recently, Xingyi and Zakamulin (2018) pointed out that the usage of realised daily volatility and available intraday data provided better forecast accuracy in the stock market. In the housing market, Zhou and Kang (2011) also used realised volatility calculated from asset returns as a proxy for $\sigma^2$. Following that study, in this article, the true volatility is also proxied by realised volatility built as a rolling sample. Furthermore, in line with other studies on volatility forecasting, two popular metrics, namely, the root mean squared error (RMSE) and the mean absolute error (MAE), are used to evaluate the studied models' forecasting accuracy. The former metric penalises large errors, as it gives errors with larger absolute values more weight than errors with smaller absolute values, which makes it useful when large errors are particularly undesirable. The latter metric gives the same weight to all errors. Both are negatively-oriented scores, meaning that lower values are better. The two measures are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(\hat{\sigma}_t^2 - \sigma_t^2\right)^2}, \qquad \mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|\hat{\sigma}_t^2 - \sigma_t^2\right|,$$

where $N$ is the number of forecasts, $\hat{\sigma}^2$ is the forecast volatility and $\sigma^2$ is the true volatility.
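Both metrics are straightforward to compute. The sketch below uses hypothetical forecast and proxy values chosen so that a single large error dominates, which shows how the RMSE weights it more heavily than the MAE does.

```python
import math

def rmse(forecast, actual):
    """Root mean squared error between forecasts and the volatility proxy."""
    n = len(forecast)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)

def mae(forecast, actual):
    """Mean absolute error between forecasts and the volatility proxy."""
    n = len(forecast)
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / n

# Hypothetical values: two perfect forecasts and one error of size 3.
forecast = [1.0, 2.0, 3.0]
proxy = [1.0, 2.0, 6.0]
print(f"RMSE = {rmse(forecast, proxy):.4f}, MAE = {mae(forecast, proxy):.4f}")
```

Here the MAE averages the single error to 1, whereas the RMSE is the square root of 3, reflecting the extra penalty on the large miss.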

Results and discussions
4.1 Forecasting house price returns
The ARMA and ARFIMA models' performances are compared, in each apartment category, for cities and sub-areas with no substantial clustering effects, meaning those regions with both constant mean and variance. Recall that in the one-room apartment category, there are 10 cities/sub-areas and eight of them exhibited long memory behaviour. In the two-room and larger (over three rooms) apartment categories, there are 15 and 8 cities/sub-areas, respectively. Of these, 14 and 7 return series, respectively, exhibited long-range dependence behaviour. Table 1 reports the house price returns' best performing in-sample and out-of-sample models for each city and sub-area, in each apartment type. In Appendix, As to which feature (short or long memory) is crucial in the Finnish house price returns modelling, results are mixed; the two models' performances differ by apartment types and across cities and sub-areas. Firstly, in the one-room flat category, the ARMA model ranks as the leading in-sample performing model in six out of eight cities/sub-areas. Secondly, in the two-room flat category, it is the ARFIMA model which excels, in 11 out of 14 cities/sub-areas. Last, in the larger (over three rooms) flat type, the two models split the ranking, as the ARMA model fits the house price returns best in three cities/sub-areas, while ARFIMA performs best in four out of seven cities/sub-areas. These results are in line with Jadevicius and Huston's (2015) study outcomes and Hepsen and Vatansever's (2011) recommendations. Jadevicius and Huston highlighted that the ARIMA modelling approach strongly contributes to examining housing markets. Hepsen and Vatansever pointed out that house price modelling with ARIMA provides insights for a range of stakeholders.
Moreover, the ARFIMA model's ability to capture the long memory feature of the house price returns, notably in the two-room flat category, stresses the high persistence of house prices (Dufitinema and Pynnönen, 2020).

Notes: This table reports the house price returns best performing in-sample and out-of-sample models, for each city and sub-area, in each apartment type. The "anti-persistent" refers to the series with long-range negative dependence, meaning that their estimated fractional differencing parameter $d$ varied from $-0.5$ to 0.

The out-of-sample forecast performance of the two models is investigated by estimating the models on the training data set, generating 5-year returns forecasts and validating the constructed predictions using the test set. Generally, in all three apartment types, the ARFIMA model outperforms the ARMA in most regions. The ARFIMA model provides the best returns forecasts in 5 out of 8, 10 out of 14 and 5 out of 7 cities/sub-areas in the one-room, two-room and larger (over three rooms) flats categories, respectively. Given the strong evidence of long memory found in the Finnish house price returns by Dufitinema and Pynnönen (2020), these results confirm again the long memory models' ability to capture these long-range dependencies and their superiority in forecasting house price returns. In the two-room apartment category, an interesting observation emerges: the best in-sample performing model also produces the most accurate out-of-sample forecasts. This is noted in 11 out of 14 cities/sub-areas. On the one hand, it contradicts previous studies, which stated that a better in-sample fit does not automatically imply a superior forecasting performance (Newell et al., 2002; Stevenson and McGrath, 2003). On the other hand, the observation aligns with Jadevicius and Huston's (2015) findings that the same model [ARIMA(3,0,3)] provided superior in- and out-of-sample modelling results for the Lithuanian housing market.
In summary, regarding modelling the Finnish house price returns, the short or long memory model's performance is driven by the house price data set under study. Therefore, across cities and sub-areas, one must enable different house price dynamics instead of imposing one model on the full data set. With respect to forecasting house price returns, the long memory models outclass their short memory peers. This result highlights the advantage of long memory models in forecasting different asset prices.

Forecasting house price volatility
For regions with time-varying variance, meaning those cities and sub-areas with substantial ARCH effects, short and long memory GARCH models are compared. Those are the EGARCH, FIGARCH and CGARCH models. Table 2 reports the house price volatility's best-performing in-sample and out-of-sample models for each city and sub-area, in each apartment type. In the Appendix, the models' in-sample fits are detailed in Table A3 and their RMSE and MAE forecasting accuracies in Table A4.
In most cases, the best-ranked model for the Finnish house price volatility modelling, in all three apartment types, is the EGARCH model. It comes out on top in 17 out of 28 cities/sub-areas exhibiting clustering effects in the one-room flat category. It leads in 19 out of 27 and 23 out of 31 cities/sub-areas in, respectively, the two-room and larger (over three rooms) flat categories. These outcomes are in line with the findings of Dufitinema (2021), who underlined, using the Stochastic Volatility framework, that the stochastic volatility model with leverage effects was also the leading in-sample performing model for the studied types of dwellings. The results also highlight, once more, the importance of asymmetric volatility features in modelling house price volatility. In the rest of the regions, the FIGARCH model alternates with EGARCH and takes the lead. This pattern is noted in 11, 6 and 7 cities/sub-areas in the respective flat categories. The exceptions to this general pattern are the Turku and Vaasa cities in the two-room apartments and Jyväskylä-city in the category of larger (over three rooms) apartments, where the CGARCH model excels in comparison to the other two models.
The out-of-sample forecasting performance of the three models is examined. The forecasting exercise starts with an estimation of the models on the training set. Next, using the estimated models, 5-year volatility forecasts are generated as a sequence of one-step-ahead forecasts. Finally, the built predictions are validated on the test set. In most cases, the long memory GARCH models outperform their short memory counterparts in all three apartment types. The CGARCH model provides the superior forecasts in, respectively, 14 out of 28, 11 out of 27 and 13 out of 31 cities/sub-areas in the one-room, two-room and larger (over three rooms) flats categories. The FIGARCH model follows with superior performance in 10, 9 and 10 cities/sub-areas in the respective flat categories. These findings are consistent with Milles's (2011), who concluded that the CGARCH provided better forecasts than the standard GARCH for the US home price volatility. Moreover, Lee and Reed (2014), with regard to the Australian housing market, also acknowledged the CGARCH model's ability to decompose the price volatility into "permanent" and "transitory" components, which makes it a better candidate to capture the short- and long-run movements of volatility.
Note: This table reports the house price volatility best performing in-sample and out-of-sample models for each city and sub-area, in each apartment type.

A regional pattern is noted in a few regions where the same model produces better out-of-sample forecasts in all three apartment types. In Tampere-area1, the FIGARCH is the leading model throughout all apartment types, while the CGARCH model stands out in Lahti-city. These results suggest that the house price volatility of the former region is characterised by a significant degree of autocorrelation, while the conditional variance of the latter city includes two components (permanent and transitory). In summary, for a larger number of Finnish cities and sub-areas, the EGARCH model is the best model for modelling their house price volatilities. In the remaining regions, the EGARCH switches places with the FIGARCH model. However, no geographical pattern is noted; the performance of the model varies from region to region. Hence, again as above, when modelling house price volatility, one must allow for different house price dynamics across cities and sub-areas and types of apartment. Regarding the models' out-of-sample forecasting performances, the long memory models (CGARCH and FIGARCH) take the lead, dominating their short-memory counterparts. Apart from a few regions (one city and one sub-area), the models' forecasting performances vary across cities and sub-areas and by type of apartment; no geographical or regional pattern is noted.

Conclusions, implications and further research
Over recent years, housing market forecasting has been the theme of extensive research due to the vital role of house price forecasts in asset allocation, consumption, investment, policy decision-making and also in predicting mortgage defaults. This article determines which model is best able to forecast movements of both house price returns and volatility in the Finnish housing market. The two competing models for house price returns are the ARMA model and the ARFIMA model. For house price volatility, the EGARCH model competes with the FIGARCH and CGARCH models. The study uses quarterly house price indices for 15 main regions in Finland, spanning from the first quarter (Q1) of 1988 to the fourth quarter (Q4) of 2018.
There are several important findings. Firstly, as to whether the short or long memory feature better captures house price return movements, the models' performance in modelling is driven by the house price data set under investigation. In contrast, the ARFIMA model comes out on top in house price returns forecasting; it outperforms the ARMA model in most regions. This result indicates that the long-range dependencies that house prices exhibit are a crucial component in their forecasting. Secondly, the EGARCH model ranks as the leading model for Finnish house price volatility modelling, highlighting the importance of asymmetric volatility in house price volatility modelling. The long memory GARCH models (CGARCH and FIGARCH) outperform the EGARCH in forecasting house price volatility, indicating long-term dependence in house price volatility and the ability of long memory models to capture and predict this property. Last, in all three apartment types, no geographical or regional pattern is noted for the models' in-sample fit; each model's performance varies from region to region for both house price returns and volatility. For the out-of-sample analysis, however, some interesting observations emerge. For house price returns, especially in the two-room flat category, the same model provides the best in-sample and out-of-sample performance, while for house price volatility, in two regions, the same model comes out on top across all apartment types.
These outcomes have some vital housing investment and policy implications. For consumers, investors and policymakers, who monitor house price volatility and whose decisions are based on future house price movements, accurate forecasts support decision-making. Moreover, precise predictions are essential for housing investment risk assessment and provide significant insights for portfolio allocation across Finnish regions and apartment types. Additionally, as interlinkages have been found between housing markets and the economic cycle of various developed countries, a view of the house price outlook would be beneficial for economists and policy institutions. Also, as pointed out by Balcilara et al. (2015), forecasting housing market movements plays a significant role for monetary policy authorities and their willingness to "lean against the wind".
Furthermore, as housing has been found to play a crucial role in fluctuations of macroeconomic factors (Kishor and Marfatia, 2018), it would be of interest to investigate the interaction between house prices and variables such as unemployment rates and interest rates from region to region. The information from these macroeconomic predictors can be further used to improve forecast accuracy. In the same vein, the existence of structural breaks in the studied housing market merits an examination. In this respect, the data can be split into subsamples according to the break dates, thereby improving forecast accuracy.
Notes: This table records, for every city and sub-area, the estimated Akaike information criteria (AICs) for model comparison. The favourable model is the one with the minimum AIC value. The "anti-persistent" refers to the series with long-range negative dependence, meaning that their estimated fractional differencing parameter d varied from -0.5 to 0. The best model's values are marked in bold
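The two quantities in the table notes, the AIC used to rank models and the fractional differencing parameter d, can be illustrated with a short sketch. It relies on the standard definitions AIC = 2k - 2 ln L and the binomial expansion of the fractional difference operator (1 - L)^d. The log-likelihoods and parameter counts in the usage example are hypothetical placeholders, not estimates from this study, and the function names are illustrative only.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def best_by_aic(candidates):
    """Return the model name with the minimum AIC.

    candidates maps a model name to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda name: aic(*candidates[name]))


def frac_diff_weights(d, n):
    """First n coefficients of (1 - L)^d via w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def memory_type(d):
    """Classify a series by its estimated fractional differencing parameter d."""
    if -0.5 < d < 0.0:
        return "anti-persistent"   # long-range negative dependence
    if d == 0.0:
        return "short memory"
    if 0.0 < d < 0.5:
        return "long memory"       # long-range positive dependence
    return "outside the stationary and invertible range"


if __name__ == "__main__":
    # Hypothetical fits for one city: (log-likelihood, number of parameters).
    fits = {"ARMA": (-100.0, 3), "ARFIMA": (-98.0, 4)}
    print(best_by_aic(fits))
    print(memory_type(-0.2), frac_diff_weights(-0.2, 5))
```

For d between -0.5 and 0.5 the weights decay hyperbolically with the lag instead of geometrically, which is why an ARFIMA model can capture dependence over long horizons that an ARMA model treats as negligible.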