## Abstract

### Purpose

This study establishes the strength of a novel variable, mortgage debt as a fraction of US gross domestic product (GDP), in forecasting capitalisation rates in both the US office and multifamily sectors.

### Design/methodology/approach

The authors fit a vector error correction model (VECM) to the data. VECMs are used to address the nonstationarity issues of financial variables while maintaining the information embedded in the levels of the data, as opposed to their differences. The cap rate series used are from Green Street Advisors and represent transaction cap rates, which avoids the problem of artificial smoothness found in appraisal-based cap rates.

### Findings

A VECM specified with the novel variable, unemployment and past cap rates contains enough information to produce more robust forecasts than one specified with the traditional variables (return expectations and risk premiums). The method is robust both in and out of sample.

### Practical implications

This has direct implications for governmental policy, offering a path to real estate price stability and growth through mortgage access–functions largely influenced by the Fed and the quasi-federal agencies Fannie Mae and Freddie Mac. It also offers a timely alternative to interest rate-based forecasting models, which are likely to be less useful as interest rates are to be held low for the foreseeable future.

### Originality/value

This study offers a new and highly explanatory variable to the literature while being among the only studies to model cap rates (1) using transactional rather than appraisal-based series, (2) out-of-sample rather than only in-sample and (3) without the traditional variables thought to be integral to cap rate modelling (return expectations and risk premiums).

## Citation

Larriva, M. and Linneman, P. (2022), "The determinants of capitalisation rates: evidence from the US real estate markets", *Journal of Property Investment & Finance*, Vol. 40 No. 2, pp. 119-169. https://doi.org/10.1108/JPIF-12-2020-0140

## Publisher

Emerald Publishing Limited

Copyright © 2021, Matt Larriva and Peter Linneman

## License

Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

## 1. Introduction

At 16 tn dollars, the value of commercial real estate in the United States (NAREIT, 2019) represents half of a percent of the world's total wealth (Shorrocks *et al.*, 2018). And at 65% home-ownership in the United States (The St. Louis Federal Reserve Bank, 2020c), real estate represents a more invested-in asset class than equities–only 55% of Americans own stock (Saad, 2020). As such, the value of this asset class and the underlying determinants are of importance not only to owners and operators but also to the economy as a whole.

While ratios (price-to-earnings, debt-to-equity) are commonplace in most value discussions, there is perhaps no other industry that relies on such a singular metric as real estate does on capitalisation rates (cap rates).

Its definitions and derivations abound, but in its simplest form, a cap rate is the quotient of expected annual net operating income (NOI) and the current value of a property:

$$\text{cap rate} = \frac{\text{expected annual NOI}}{\text{current property value}}$$

This quotient is often used to calculate the value of an asset by dividing its next-12-months NOI by the cap rate itself. Despite its simple calculation, it contains myriad information about present and forward economic conditions. For example, a cap rate may compress because forward growth is expected to be high, while an expanding cap rate could indicate a more pessimistic view of future cash flows. By any calculation or formulation, understanding the current and future cap rates provides a great deal of insight into the nature of property valuations, so much so that the cap rate is frequently used instead of price when discussing market valuations.
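The arithmetic described here can be sketched in a few lines. This is a minimal illustration only: the function names and the dollar figures are hypothetical and are not drawn from the paper's data.

```python
def cap_rate(expected_noi: float, value: float) -> float:
    """Cap rate = expected annual NOI / current property value."""
    if value <= 0:
        raise ValueError("value must be positive")
    return expected_noi / value


def implied_value(expected_noi: float, rate: float) -> float:
    """Implied value = next-12-months NOI / cap rate."""
    if rate <= 0:
        raise ValueError("cap rate must be positive")
    return expected_noi / rate


# Hypothetical property: $5m of expected NOI, priced at $100m.
noi = 5_000_000.0
rate = cap_rate(noi, 100_000_000.0)     # a 5% cap rate
compressed = implied_value(noi, 0.045)  # same NOI valued at a 4.5% cap rate
```

With NOI unchanged, the lower (compressed) cap rate implies a higher value, which is the sense in which compression signals rising valuations.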

Because of this, cap rate forecasting and modelling remain active areas of research in the field of real estate finance (Lin, 2019; Henig *et al.*, 2019; Christopoulos *et al.*, 2019). Taken in context of the current cap rate environment and the economic slowdown, this question becomes especially pointed. Since 2009, cap rates have been on a near-monotonic downtrend (representing increasing real estate values). And while they have slowed their decline since 2015, the question of their direction is topical, with some investors reasoning that stymied rent-growth figures will push up cap rates while others suggest low interest rates will force cap rates lower still.

The research which attempts to explain cap rates can largely be divided into three camps: those which use lagged return (or price change) as a variable, those which use ratios such as rent-to-price or price-to-income and those which use more granular property or regional data. Underpinning many of these studies is the Gordon (1962) specification of cap rates, which asserts their calculation as the constant cost of equity capital minus the growth rate of the investment. We note that most of the studies use variables that focus on the risk (unemployment, volatility), return (price-to-income) or expectation (future premiums or rent-growth rates) components of the valuation equation. Few focus on the demand side of the valuation equation.
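For reference, the Gordon relationship mentioned above can be written compactly. The symbols $r$ (constant cost of equity capital), $g$ (constant growth rate of income) and $V_0$ (current value) are notation introduced here for illustration and are not necessarily the authors' own.

```latex
V_0 = \frac{\mathrm{NOI}_1}{r - g}
\quad\Longrightarrow\quad
\text{cap rate} = \frac{\mathrm{NOI}_1}{V_0} = r - g, \qquad r > g .
```

Read this way, risk and return variables act on $r$, and expectation variables act on $g$.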

Separately, as to the studies themselves, most focus on in-sample modelling of appraisal cap rates. In-sample modelling is very telling, but a model robust to both in-sample and out-of-sample forecasting may be more so. As to appraisal-based cap rates, practitioners widely note the inherent unrealistic smoothness and aptly named appraisal bias.

With this background in mind, we seek to investigate the determinants of cap rates from the demand side of the equation, and we aim to do so using transaction-based cap rates, both in-sample and out-of-sample. Specifically, we focus on the US Office and Multifamily markets. While we believe the results generalise to other sectors, a full exploration is beyond the scope of this analysis.

To address the demand side of the valuation equation, we analyse the total nominal mortgage debt outstanding as a percent of nominal US gross domestic product (GDP) (funds flows). Our theory is that this variable is a more direct synthesis of all other variables (risk, return and expectation) and is a more direct influencer of cap rate movements. A surge in demand for mortgage debt (as a percent of GDP) should accompany a period of cap rate compression, while conversely, a surge in GDP that dwarfs mortgage debt may represent a period of high opportunity cost of real estate and a devaluation of the asset class.

To address the transactional cap rate series, we use Green Street data, which starts in the 1990s, providing ample testing periods. Historically, practitioners would use the National Council of Real Estate Investment Fiduciaries' (NCREIF) appraisal-based cap rate series, as this was the only source of long-term data. Green Street is among the most respected names in the REIT and Real Estate research space and has collected transactional cap rates since inception, presenting another strong source of data.

With this data, we use a vector error correction model (VECM), and we assert that this method is a superior choice (compared to vector autoregression (VAR)) for modelling cap rates, as there is significant information contained in the levels of the input series, which is lost in the differences required for VAR. Consider the difference in implications of a change in cap rates from 10% to 9% versus a change of the same magnitude from 4% to 3%. We confirm the appropriateness of a VECM by testing Granger causality and direction (funds flows Granger-cause cap rates), auditing stationarity, establishing the number of cointegrating vectors, setting lag and rank order and confirming non-autocorrelation of residuals. Despite the numerous conditions that must be met to utilise a VECM without generating biased modelling, our input data meet these conditions, and our results benefit from the robustness.
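The 10%-to-9% versus 4%-to-3% comparison can be made concrete with a short calculation. The sketch assumes NOI is held constant across the move, which is a simplification introduced here; the function name is hypothetical.

```python
def value_change_from_cap_rate_move(old_rate: float, new_rate: float) -> float:
    """Fractional change in value when the cap rate moves, holding NOI fixed.

    With value = NOI / cap rate, NOI cancels and the ratio of new value to
    old value is old_rate / new_rate.
    """
    if old_rate <= 0 or new_rate <= 0:
        raise ValueError("cap rates must be positive")
    return old_rate / new_rate - 1.0


high_level = value_change_from_cap_rate_move(0.10, 0.09)  # roughly +11%
low_level = value_change_from_cap_rate_move(0.04, 0.03)   # roughly +33%
```

The same one-point move implies about three times the value change at the lower level, which is the information that differencing alone discards.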

We model cap rates in both the office and multifamily sectors, both in-sample and out-of-sample. To generate out-of-sample models, we use a VECM trained on historical data to forecast the next 1, 2, 3 and 4 quarters.
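One way to organise such an out-of-sample exercise is a rolling-origin loop, sketched below. This is an illustration under stated assumptions rather than the authors' procedure: a no-change forecaster stands in for the VECM, the function names are hypothetical, and the cap rate values are invented for the example.

```python
from typing import Callable, Dict, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]


def naive_forecaster(history: Sequence[float], steps: int) -> List[float]:
    """Placeholder model: repeat the last observed value (no-change forecast)."""
    return [history[-1]] * steps


def rolling_origin_errors(
    series: Sequence[float],
    forecaster: Forecaster,
    min_train: int,
    max_horizon: int = 4,
) -> Dict[int, List[float]]:
    """Collect out-of-sample errors (actual - forecast) for horizons 1..max_horizon.

    At each origin t the forecaster sees only series[:t], so every forecast
    targets quarters the model has not observed.
    """
    errors: Dict[int, List[float]] = {h: [] for h in range(1, max_horizon + 1)}
    for t in range(min_train, len(series) - max_horizon + 1):
        path = forecaster(series[:t], max_horizon)
        for h in range(1, max_horizon + 1):
            errors[h].append(series[t + h - 1] - path[h - 1])
    return errors


# Hypothetical quarterly cap rates in percent (illustrative values only).
cap_rates = [7.0, 6.9, 6.8, 6.8, 6.6, 6.5, 6.5, 6.3, 6.2, 6.2, 6.0, 5.9]
errs = rolling_origin_errors(cap_rates, naive_forecaster, min_train=6)
mae_by_horizon = {h: sum(abs(e) for e in es) / len(es) for h, es in errs.items()}
```

Any model exposing the same `(history, steps)` interface could replace `naive_forecaster`, so the evaluation logic stays separate from the model specification.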

We then compare the results of our out-of-sample modelling (which uses only historical samples to predict unseen future cap rates) to a baseline model composed of the traditionally used variables: risk-premium, return expectations and the cap rates themselves.

We have two main findings that offer insight to the nature of cap rate determinants. First, we confirm the robustness of using a VECM to explain transactional versus appraisal cap rates, both in-sample and out-of-sample, both with our new variables and with the traditional ones. The volatility of the realistic transactional series is captured by the flexibility of the VECM model. And our findings with both sets of variables are strong relative to those in the literature. Second, we submit that a funds flow variable is highly explanatory, accounting for the vast majority of in-sample and out-of-sample variance in cap rate series. The models using the traditional variables are outperformed by the model using the funds flow variable. Variables of this nature are mentioned rarely in the literature and perhaps provide necessary guidance in volatile and frothy valuation environments where traditional spreads, expectations and returns are out of historical norms.

To put our paper in context, we are the first to prove the power of the funds flow variable (discussed by Linneman, 2015) in explaining and forecasting cap rates in a statistically robust context. And we are the first to forecast out-of-sample transactional cap rates (versus NCREIF's appraisal-based cap rates). In variable selection, our research is in the vein of Chervachidze *et al.* (2009) who used total net debt as a percent of GDP, the spread of Moody's AAA bonds to the US ten-year treasury and lagged cap rates (among other factors) to forecast cap rates. And our work is somewhat related to Arsenault *et al.* (2013), which investigates a positive feedback loop between commercial mortgages and real estate appreciation. Their work uses quarterly outstanding commercial and multifamily mortgages reported by the Fed and examines the difference in these values, period over period. We confirm the efficacy of the Fed's mortgage data, but find it is more relevant not when differenced, but when analysed at levels, and as a share of GDP.

Our results have a timely implication for policy. Given cap rates can be accurately described, in- and out-of-sample, according to the flow of funds into the market, there exists a path for government stabilisation of cap rates via Fannie Mae and Freddie Mac or through Commercial Mortgage Backed Security (CMBS) purchases by the Fed. Through a combination of actions of the quasi-government agencies or the Fed itself, valuations of the US real estate market can be targeted with precision. This could eliminate much of the guess-work in real estate valuation stabilisation and provide policymakers with a specific, targeted percentage of mortgage debt as a percentage of GDP, which would ensure stable, increasing values, much as the Fed controls the money supply to implement control of market prices.

The remainder of this paper is organised as follows: we examine the state of the art in use of multivariate time series modelling of real estate values and cap rates; we then discuss the specific data series we found most powerful in forecasting and explaining cap rates; next, the analysis discusses the selection, specification, and appropriateness of a VECM in forecasting these cap rate series using our three variables; and we conclude with the results of our model, compare these to a baseline and provide a conclusion.

## 2. Literature review

Because of the relationship between the cap rates and real estate returns, we expand the discussion of existing work to include studies that explained and forecasted cap rates and studies that explained and forecasted derivative metrics including returns, prices and cap rate spreads. The literature in the space is extensive, and we direct those interested in a full account to Forecasting Real Estate Prices (Ghysels *et al.*, 2013) which divides forecasting methodologies into three camps: those which use lagged return (or price change) as a variable, those which use ratios such as rent-to-price or price-to-income and those which use more granular property or regional data. We divide our literature review roughly chronologically, devoting sections to early methods of forecasting, two popular time-series methods and finally more recent alternate methods.

### 2.1 Forecasting before multivariate time-series methods

Prior to the mid-1990s, multivariate time-series methods were not in wide use in real estate. Serial autocorrelation extant in real estate data made forecasting using lagged values a statistical challenge. This challenge was largely addressed by either single-variate autoregressive methods (Gau, 1984, 1985; Linneman, 1986) or variations on ordinary least squares (OLS) that used multi-stage estimation (Case and Shiller, 1990; Abraham and Hendershott, 1994). The former was aimed at establishing weak-form market efficiency rather than prediction, while the latter was generally focused on forecasting asset prices within metropolitan statistical areas (MSAs) over short periods of time.

The variables used in these methods of forecasting tended to fall into classes that line up with the components of a cap rate: risk-free rate, real interest rates, relative return expectations and risk. Many models' variables included a government long bond to capture the risk-free and real rates, an economic spread (credit or pricing-based) to capture the relative expense or return of real estate, and macroeconomic factors to capture the growth prospects and risk in the marketplace. Some models included a term to capture forward estimates including either sentiment or expected rent or NOI growth. This lines up with the Gordon growth model of pricing cap rates as the discount rate less the perpetual NOI growth rate.

### 2.2 Forecasting using vector autoregression

The first published instance of VAR in real estate forecasting seems to have come from Geltner and Mei (1995). Their work produced robust forecasts of REIT and NCREIF total returns by using lagged values of NOI, property market total returns, income returns of the NCREIF index, and the total returns of the NAREIT All-REIT index (see Tables 1–22). They found an *R*-squared of 0.843 in forecasting the NCREIF total return one year forward, using annual data (their Table 2).

Dua *et al.* (1999) used a Bayesian VAR framework to forecast home sales and found it significantly outperformed an unrestricted VAR model. They report accuracy in terms of the Theil-U statistic as opposed to mean absolute error, but also found significant predictive power.
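The two accuracy measures contrasted above can be sketched as follows. Theil's U has several variants in the literature; the form below (model RMSE divided by the RMSE of a no-change forecast) is one common choice and is an assumption here, not necessarily the form Dua *et al.* report. All numbers are hypothetical.

```python
import math
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Average absolute gap between actual and forecast values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def theil_u(
    actual: Sequence[float], forecast: Sequence[float], previous: Sequence[float]
) -> float:
    """Model RMSE relative to a no-change forecast.

    previous[i] is the last value observed before actual[i]. Values below 1
    mean the model beats the naive benchmark.
    """
    naive = rmse(actual, previous)
    if naive == 0:
        raise ValueError("naive forecast has zero error; ratio undefined")
    return rmse(actual, forecast) / naive


# Hypothetical series (illustrative values only).
actual = [5.0, 5.2, 5.1, 5.4]
previous = [4.9, 5.0, 5.2, 5.1]
forecast = [5.05, 5.15, 5.15, 5.3]
mae = mean_absolute_error(actual, forecast)
u_stat = theil_u(actual, forecast, previous)
```

Because U is relative to a benchmark while MAE is in the units of the series, results reported in one measure cannot be compared directly with results reported in the other.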

Fratantoni and Schuh (2000) created a method based on VAR to establish the heterogeneity of regional housing markets but did not attempt to forecast pricing or returns. Instead, the work modelled regional income, housing investment and house appreciation as a function of monetary policy tightening, housing starts and state-level income.

Brooks and Tsolacos (2001) used a VAR method to predict the returns of a public UK real estate property index using the term structure of interest rates and the gilt-equity yield ratio. They found that predictability was robust at one-period forward, but was weaker than forecasting the long-term average, reporting mean absolute errors of 0.18, or 32% of the long-term mean of the variable being predicted (their Table 5).

An and Deng (2009) provide one of the only examples of pure cap rate forecasting, as opposed to return, price or spread forecasting. They modelled cap rates through forward-looking expected returns and rental growth using both a VAR model and a structural model estimated with a Kalman filter. Their VAR model uses lagged values of cap rates, property returns, the risk-free rate, NOI growth, and occupancy and achieves an adjusted *R*-squared of 0.95 in forecasting in-sample NCREIF data for the All Property cap rates from 1990Q1 to 2008Q2 (their Table 6). The results of their Kalman filter model appear strong, but they do not report either *R*-squared or mean absolute error. The paper does not address the cointegration of the variable terms.

Contemporaneously, research emerged focusing on specific MSAs as opposed to aggregated cap rate metrics. Campbell *et al.* (2009) forecasted the change in the return premium to housing using a VAR method, including national rent-growth and premiums, local and national income growth, employment growth and population growth. Their model produced in-sample results over the quarters from 1975–2007 of 0.47 at the 25th percentile, and 0.59 at the 75th percentile (their Table 4). The estimates were for individual MSAs, hence the quantile distribution.

### 2.3 Forecasting using vector error correction method

VECM was used concurrently with VAR with somewhat dichotomous utilisation. Some researchers focused on establishing relationships between variables (to illustrate diversification potential between markets or mean reversion) while others focused on predictive power.
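For readers less familiar with the method, the textbook form of a VECM is given below. The notation is standard and introduced here for illustration; it is not the authors' specific specification.

```latex
\Delta y_t = \alpha \beta^{\top} y_{t-1}
           + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i}
           + \varepsilon_t
```

Here $y_t$ is the vector of variables in levels, $\beta^{\top} y_{t-1}$ measures last period's deviation from the long-run (cointegrating) relationship, $\alpha$ sets the speed of adjustment back towards it, and the $\Gamma_i$ capture short-run dynamics. A VAR in differences corresponds to dropping the $\alpha \beta^{\top} y_{t-1}$ term, which is where the level information enters.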

The earliest work to use error correction on VAR was Zhou (1997), who specified a two-variable VECM for forecasting single-family sale prices and sales volume. He found an out-of-sample *R*-squared of 0.86 for forecasting one-month-ahead sales prices (his Exhibit 10.2). This study is noteworthy for reporting one of the highest *R*-squared values and one of the only out-of-sample forecasts. It is also among the first studies to note the cointegration of real estate markets with other markets and variables.

Similarly, Tuluca *et al.* (2000) investigated the dynamics of the returns of public and private real estate markets using VAR and VEC models, and ultimately concluded that VECM provides a superior prediction model for the returns of real estate versus VAR. The model produced an in-sample *R*-squared as high as 0.17 for public real estate and 0.70 for private real estate, estimating eight-quarters forward (their Table 1). Accuracy decreased substantially for the out-of-sample data, and the mean absolute percentage error (MAPE) is reported as 53% (their Table 6).

As of 2005, the only study to use error correction in forecasting cap rates (versus prices, returns or spreads) was McGough and Tsolacos (2001), which found the error term insignificant. Hendershott and MacGregor (2005) updated this, finding that including an error correction term was significant in forecasting cap rates in the UK. The study showed high *R*-squared in office and retail cap rate forecasts, in both the near-term and long-term, although the statistics indicating serial autocorrelation in the residuals were not within significance (their Tables 3–8). Corroborating this study, Clayton *et al.* (2009) developed a VEC specification that produced an in-sample *R*-squared of 0.84 over the period Q2 1996 through Q2 2007 when forecasting cap rate changes one-period forward. Their study differentiated itself by including sentiment.

Since that time, work using VECM and cap rates has been focused less on forecasting and more on establishing cointegration relationships.

A seminal work from Hoesli and Oikarinen (2012) uses VECM to show the relationship between REITs and direct real estate investments and proposes that the diversification offered by real estate is available via REITs, contradicting Anoruo and Braha (2008) and Liow and Yang (2005).

Other more recent work has included investigations on foreign buyers' impacts on cap rates (McAllister and Nanda, 2016) and analysis of the cointegration between lumber futures and timber land value (Clements *et al.*, 2017).

### 2.4 Recent forecasting using alternative methods

Finally, non-VECM and non-VAR methods have also made recent contributions in the field of cap rate forecasting, especially internationally. In countries where volatility is less consistent or where historical data are more difficult to access, these alternate methods offer unique insight into property value forecasts. The advantage to using non-VAR or non-VECM methods is the ease of model specification and the comparatively little data needed. Of course, the challenge with processing time series data in a linear model usually manifests as autocorrelated residuals stemming from autocorrelated input series.
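Residual autocorrelation of the kind described above is often screened with the Durbin-Watson statistic. The source does not name this diagnostic, so it is offered here only as one common choice; the simulated residual series and function names are hypothetical.

```python
import random
from typing import List, Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic.

    Near 2 for uncorrelated residuals; approaches 0 as positive
    autocorrelation strengthens.
    """
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(r * r for r in residuals)
    return num / den


def simulate_ar1(rho: float, n: int, seed: int) -> List[float]:
    """Simulate an AR(1) series e_t = rho * e_{t-1} + noise (illustrative only)."""
    rng = random.Random(seed)
    out = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        out.append(rho * out[-1] + rng.gauss(0.0, 1.0))
    return out


white_noise = simulate_ar1(rho=0.0, n=500, seed=1)
persistent = simulate_ar1(rho=0.9, n=500, seed=1)
dw_white = durbin_watson(white_noise)
dw_persistent = durbin_watson(persistent)
```

A statistic well below 2, as for the persistent series, would suggest that the standard errors and *t*-statistics of a linear model fitted to such data should be treated with caution.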

Emerging markets’ valuations were well forecasted using multivariate regressions despite the lack of transparent pricing data and the need to incorporate sometimes large risk premiums (Dasgupta and Knapp, 2008). The authors do not discuss residual autocorrelation, and present instead their models' *R*-squared and their coefficients' *t*-statistics. Further concern arises from the small sample size (25 observations) and the very low degrees of freedom resulting from specifications with upwards of 7 terms.

Chervachidze and Wheaton (2013) address both the (autocorrelation) errors and degrees of freedom by using an MSA-specific panel regression with a fund-flow variable. This was used to generate in-sample *R*-squared values of 0.86 for Multifamily and 0.629 for Office. Their analysis focuses on periods from Q4 2000 to Q4 2009 and captures much of the cap rate compression and sharp increase during the global financial crisis (GFC). The research also nods to the oft-heard criticism regarding NCREIF cap rates, specifically that they are not based on actual sales transactions.

Lee *et al.* (2014) navigate the autoregressive tendency of the error variance by assuming it is an autoregressive–moving-average (ARMA) process and fitting a generalised autoregressive conditional heteroskedasticity (GARCH) model quite well. This addresses both the autocorrelation of the time series variables and the heteroskedasticity of the non-stationary data. They achieved high in-sample accuracy in forecasting cap rate spreads (defined as IRR less expected NOI growth rates plus some error) rather than cap rates or differences, by using expected rent-growth and trailing NOI growth. Over the 63 quarters of data from 1997:Q1 – 2012:Q3, their study achieved an in-sample adjusted *R*-squared of 0.92.

Another approach to avoiding the residual autocorrelation is to forecast single point-in-time metrics as done in Moreira *et al.* (2016). This micro-level study successfully identified over-investment in a neighbourhood of Portugal by creating a multivariate regression of a single period. By producing a model to examine the rental incomes and cap rates, the work concludes that non-differenced values can be useful metrics in explaining cap rates.

Finally, new research using machine learning techniques offers another alternative to the statistical challenges of time series. Specifically, random forest models can be implemented to forecast future values without concern for the autocorrelation of the input because the model does not assume uncorrelated residuals or independent input. This framework is beginning to find application in the research, in appraisal settings as demonstrated in Kok *et al.* (2017).

Recently, relatively few well-cited analyses have been published on cap rate forecasting. As this is the primary method of quoting real estate yields and indeed values of real estate assets, this dearth of visibility warrants additional insight.

### 2.5 A benchmark for accuracy

One of the goals in this review was to establish a benchmark for cap rate accuracy. We consider the method used to produce the forecasts, if the cap rates were forecasted in-sample or out-of-sample, if the underlying series was appraisal or transactional cap rates, and if the forecast period includes the volatility of the GFC.

In summary of the available research, we find the highest accuracy (in terms of *R*-squared) comes from An and Deng (2009) who use a VAR to forecast in-sample, appraisal cap rates, ending in Q1 2008. They achieve an adjusted *R*-squared of 0.95.

We take this model as our baseline model for comparison, and we use the variable types established therein to contrast the explanatory power achieved in comparison to our novel approach. We choose this work for three reasons: it has proven reliable over a long duration–18 years; it achieves a very robust measure of fit and it is a dynamic model, like our own.
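Since the benchmark is stated in terms of *R*-squared and adjusted *R*-squared, a short sketch of both measures may help when comparing models with different numbers of regressors. The functions follow the standard textbook definitions; the sample values are hypothetical and are not taken from any of the cited studies.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], fitted: Sequence[float]) -> float:
    """R-squared = 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual series has zero variance")
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2: float, n_obs: int, n_regressors: int) -> float:
    """Adjusted R-squared = 1 - (1 - R^2)(n - 1) / (n - k - 1).

    n_regressors (k) excludes the intercept.
    """
    dof = n_obs - n_regressors - 1
    if dof <= 0:
        raise ValueError("not enough observations for this many regressors")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical fit (illustrative values only).
r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
adj = adjusted_r_squared(0.90, n_obs=40, n_regressors=5)
```

The adjustment matters most when observations are few relative to regressors, which is why small-sample studies with many terms are harder to compare on raw *R*-squared.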

Still, in review of the research we find a lack of out-of-sample, long-duration, transactional cap rate models. Thus, there remain open questions as to the ability to forecast out-of-sample (versus in-sample) cap rates, transactional (versus appraisal) cap rates and modern values (since and including the GFC). Our work seeks to address these questions. This challenge is well-motivated but also more difficult. Out-of-sample forecasting is generally less accurate than in-sample. Appraisal cap rates are smoother (have less variance to explain) than transactional cap rates. And finding a single model dynamic enough to explain the cap rate volatility during the GFC and the recent tranquillity is indeed a challenge. That said, we offer a promising method that does just that, building on some characteristics substantiated in the prior literature.

### 2.6 Model similarities

Our model uses a VECM model with a novel fund flow variable (described in detail in the following Data Section). This variable compares the volume of outstanding mortgage debt as a percent of GDP, and builds on the work of others who have noted the importance of debt in forecasting cap rates.

Chervachidze *et al.* (2009) used total net debt as a percent of GDP, the spread of Moody's AAA bonds to the US ten-year treasury and lagged cap rates (among other factors) to forecast cap rates. This work was among the original to substantiate the role of a broad debt metric, and it introduced the scaling of it by GDP. We build on their work of successfully identifying the relevance of debt variables, and we further this and find higher explanatory power by specifying the novel fund flow variable.

In 2013, Chervachidze and Wheaton (2013) revisited the same debt variable in the context of its relevance in explaining cap rates throughout the GFC and found that it was a significant variable still.

Arsenault *et al.* (2013) investigated a positive feedback loop between commercial mortgages and real estate appreciation. Their work uses quarterly outstanding commercial and multifamily mortgages reported by the Fed and examines the difference in these values, period over period. We too find value in the Fed's mortgage data, but find it is more relevant not when differenced, but when analysed as its share of GDP–drawing from Chervachidze *et al.* (2009) and Chervachidze and Wheaton (2013). While Arsenault *et al.* used commercial and multifamily mortgages, we opted to use all mortgage debt outstanding, in the goal of creating a singular model that would generalise to multiple sectors, without the need for different input data, other than that series' cap rate itself. Arguably, myriad additional factors could be added to differentiate the office model from the residential model, including their respective portions of debt, but the research is robust with models using five-and-more variables, so we instead selected a more parsimonious method.

Our work contributes to the literature in four ways. First, it statistically substantiates the importance of fund flows (mortgage debt as a percent of GDP), as discussed in proprietary work by Linneman, as a determining factor of cap rates. Second, we offer a new standard for out-of-sample *R*-squared accuracy in cap rate forecast for one-quarter forward, and we offer a benchmark for 4-quarter forward forecasting, in contrast to the one-quarter forward forecasting of most existing models. Third, we produce these stronger forecasts without using any of the data thought integral to cap rate forecasts (return expectations and risk premiums), challenging the relative strength of these variables. Finally, we execute this analysis on a novel-to-the-research cap rate set, from Green Street, which more closely mirrors the volatility and true movement of real estate pricing, as opposed to smoother and slower-moving appraisal indices.

## 3. Data

The following section discusses the data used to construct our VECM model. Briefly, the categories of variables are funds flow, historical cap rates and unemployment. But to gauge the strength of the proposed funds flow models, and to assess the contribution of the variables versus the contribution of the VECM model, we create a baseline model as well. The baseline model is structured after the VAR developed by An and Deng (2009). It uses three categories of variables mentioned often in the literature, proven to explain a significant part of cap rate variance: historical cap rates, return expectations and risk premiums. The historical cap rates in the baseline and the novel model are identical.

### 3.1 Novel method

Our choice of data is well-founded in economic theory. At their essence, goods are priced according to their former prices, current supply and demand dynamics and future expectations about the price of the good. We use one variable to capture each component. In brief, we use the cap rate series to model former values, unemployment to model risk and mortgage debt as a portion of nominal GDP to model supply and demand dynamics. Descriptive metrics for the series are given in Tables 21 and 22.

The novel variable–“funds flows”–is modelled as nominal mortgage debt divided by nominal GDP. Although the variable itself is a point-in-time metric, we use the word “flows” because our use of the variable in the VECM model implicitly considers both its level and its recent change. This is necessary, as the implications of an increasing ratio (rising debt or decreasing GDP) are very different when the ratio is already high as opposed to when it is rebounding from a cyclical low. Mortgage debt in this case is mortgage debt outstanding from all holders, as reported by the St. Louis Fed (The St. Louis Federal Reserve Bank, 2020b). GDP is the nominal, seasonally adjusted, annualised figure, provided from the same source (The St. Louis Federal Reserve Bank, 2020a).
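The construction of the variable, and the distinction between its level and its recent change, can be sketched as follows. The figures are hypothetical placeholders rather than actual Fed data, and the function names are introduced here for illustration.

```python
from typing import List, Sequence


def funds_flow_ratio(mortgage_debt: Sequence[float], gdp: Sequence[float]) -> List[float]:
    """Nominal mortgage debt outstanding divided by nominal GDP, per period."""
    if len(mortgage_debt) != len(gdp):
        raise ValueError("series must have the same length")
    return [d / y for d, y in zip(mortgage_debt, gdp)]


def period_change(series: Sequence[float]) -> List[float]:
    """First difference: series[t] - series[t-1]."""
    return [series[i] - series[i - 1] for i in range(1, len(series))]


# Hypothetical quarterly figures in $ trillions (not actual Fed data).
debt = [14.0, 14.2, 14.5, 14.6]
gdp = [20.0, 20.2, 20.3, 20.5]
ratio = funds_flow_ratio(debt, gdp)  # the level of the variable
change = period_change(ratio)        # its recent change
```

In the final period of this example the ratio is still high in level terms but its change has turned negative, the kind of combination a levels-plus-changes model can distinguish.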

This variable makes intuitive sense and is substantiated by a thought experiment: if you were seeking to forecast real estate values one-year forward, would you rather know what the prevailing interest rate would be or would you rather know the amount of capital seeking real estate? We posit the latter would provide the better signal of values' direction. The funds flow variable provides just that: a view of the amount of capital invested in real estate, and its recent change.

Now, common objections could be that this variable does not account for the change in supply; or the variable could send false signals, as during the GFC (when the variable was high immediately prior to values sharply declining) or when GDP compresses and sends the ratio higher. These are worth addressing.

First, the argument regarding supply is valid in theory: a surge in supply would depress values if it exceeded the growth in capital seeking to invest. But surges in supply, when considering the whole of US real estate, are non-extant. Certainly there are changes in supply, but in the office sector, the largest year-over-year increase in supply was 1.5% (The CoStar Group, 2020). Similarly, the largest year-over-year change in supply in the multifamily sector was 1.8%. By contrast, the largest change in the funds flow variable was 7%.

Second, as to false signals, we find few. The beginning of the GFC saw very high portions of real estate debt as a percent of GDP, but values were about to collapse. While the high reading of the funds flow metric would not have captured this, the slowing of growth of the variable (the derivative) did portend the collapse. Funds flows were growing by 3% per year in 2007, but only 1% per year in 2008. By using a VECM, we capture both the value of the funds flows and deviations from equilibrium, which helps to preclude false signals. Another potential for false signals comes from a decline in the denominator (lower GDP), which would send the ratio up both in value and in derivative. In times of GDP compression, however, the value of debt seeking real estate actually declines faster than GDP, causing the variable to remain accurate.

Finally, the Fed does offer more granular slices of mortgage debt outstanding, including commercial and individual mortgage debt outstanding, residential and non-residential. We elected to use the overall debt outstanding instead of the more granular series for three reasons. First, we were seeking a model which could forecast multiple cap rate series without respecification, as VECM models are arduous to specify correctly. By using both unemployment rates and mortgage debt, the only series that needs to be substituted to forecast other series is the cap rate series itself. Second, we would like to leave the model adaptable to forecast series for which there is no specific debt series (Industrial, Retail). Finally, the numerator of the debt variable (all mortgage debt outstanding) only changes mildly in its composition from year to year. That is, it does not go from 10% residential to 50% residential year-over-year. Thus, an increase of 1% in the overall value of debt outstanding likely represents a 1% increase in the constituent series values.

While the funds flow variable is certainly not the first attempt in the literature to model the demand side of the equation, we posit that it has an advantage over other choices, namely expectation-based variables (e.g. future rent-growth forecasts or supply-growth forecasts). The simple reason is that mortgage debt as a percent of GDP is a point-in-time spot variable, while expectation-based variables are forecasted variables. Forecasting cap rates based on, say, forecasted rent growth compounds the already difficult issue of cap rate forecasting by requiring accurate rent-growth forecasts as an input. Instead of forecasting using a forecast, we opt to forecast using an observable metric which we find has a strong leading relationship to the target variable: cap rates.

Our cap rate series are sourced to Green Street and are derived from the current market values of REIT-owned properties. The cap rate represented is:

The unemployment values used are sourced to the US Fed. We transform the series from its reported version (in whole numbers labelled as percentages) to percentages. We then square the series which accomplishes two objectives: first, it reduces the *p*-value of the test of stationarity at first-differencing; second, it ensures the data generating process is near-normal (discussed in detail in the following section).
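The transformation described above can be sketched as follows. This is an illustrative reading, assuming that "to percentages" means dividing the reported whole-number figure by 100; the function name and sample values are hypothetical.

```python
def transform_unemployment(reported_percent):
    """Convert whole-number percentages (e.g. 5.0) to fractions (0.05), then square."""
    fractions = [u / 100.0 for u in reported_percent]
    return [f ** 2 for f in fractions]


# Hypothetical reported unemployment rates, in whole-number percent.
reported = [4.0, 5.0, 10.0]
squared = transform_unemployment(reported)
```

Squaring compresses low readings more than high ones, so a move at a high unemployment level is weighted more heavily than the same move at a low level.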

We find unemployment to be a proxy of the overall economic situation and thereby a representation of risk. The unemployment variable, unlike the long bond, has pronounced and responsive cycles and seems to be less influenced by the Fed (despite the dual mandate) than long-term interest rates. Using the long bond and the ten-year note as measures of risk seems to have been popular in models before, but since the 1980s, interest rates have been on a near-monotonic downtrend. This renders them less useful in modelling scenarios than the more responsive unemployment series.

Our data begins in 1993 and runs quarterly for 109 periods, through 3Q 2020.

### 3.2 Baseline method

The baseline model is structured after the VAR developed by An and Deng (2009). Their VAR model uses lagged values of cap rates, property returns, the risk-free rate, NOI growth and occupancy and achieves an adjusted *R*-squared of 0.95 in forecasting in-sample NCREIF data for the All Property cap rates from 1990Q1 to 2008Q2 (their Table 6).

Specifically, they use a mix of *ex ante* and *ex post* data covering cap rates, returns, risk-free rate, historical growth and vacancy.

Our return expectation metric is sourced once more to Green Street, and it is understood as a forecasted IRR for an investment in REIT-quality properties. The series, constructed separately for Office and Multifamily investments, are calculated using the current cap rate, the four-year forward NOI growth estimate, a long-term growth rate, and near- and long-term risk adjustments.

The risk premium series is calculated as the difference between the cap rate–understood in this context as the yield on the property–and the ten-year treasury. This is calculated for each series respectively.

Summary metrics are provided in Table 22.

## 4. Selection, specification and analysis of VECM

The motivation for selecting a VECM stems from three characteristics of this financial data. First, we believe the values themselves—the values at level—contain information that we do not wish to discard by differencing to induce stationarity. Further, we observe a common trend in the series. Finally, we have domain-based reason to suspect an equilibrium between the series.

Financial variables are often nonstationary (Huang *et al.*, 2003; Abu-Mostafa and Atiya, 1996). They contain time-variant first and second-order moments owing to either trends or variance that change over time. Because many time series analysis methods ideally require stationarity, a common response is to induce stationarity by first or second-order differencing. After this transformation, it is possible that much of the relevant information is stripped from a series (De Prado, 2018).

That is, if the value of a series contains significantly more information than its first or second-order difference, then retaining the non-differenced series can lead to more robust estimators. One can see this being the case with financial variables like interest rates, whose dynamics behave very differently after a move from 10% to 9% than after a move from 1% to 0%, even though both differences are only 100 basis points. So too with cap rates, which have a zero lower bound, making their behaviour at 10% far different from their behaviour at 2%. Perhaps even more illustrative, consider a change in unemployment from 20% to 19% and a change in unemployment from 3% to 2%. The former may reflect little more than measurement error in a deep recession, while the latter might indicate a surging economy on the brink of overheating. A first-differenced model would consider both the same.
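The unemployment example can be made concrete with a short sketch (the variable names are illustrative): after first differencing, the two regimes are indistinguishable even though their levels differ sharply.

```python
def first_difference(series):
    """Return the first difference of a series."""
    return [b - a for a, b in zip(series, series[1:])]


high_regime = [0.20, 0.19]  # unemployment falling from 20% to 19%
low_regime = [0.03, 0.02]   # unemployment falling from 3% to 2%

diff_high = first_difference(high_regime)[0]
diff_low = first_difference(low_regime)[0]

# Both moves are -1 percentage point, so a differenced model sees them as equal.
same_after_differencing = abs(diff_high - diff_low) < 1e-12
```

A model that retains the levels can still tell the two situations apart, which is the information the VECM is intended to preserve.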

Furthermore, many related financial series that are *I*(1), that is, stationary at first differencing, have common stochastic trends (Lütkepohl, 2005). These stochastic trends may be observed as interest rates increasing as employment increases. These common trends can lead to an equilibrium where temporary dislocations can be modelled and forecasted. Although the variables may move stochastically, if there is a linear combination of the variables that is stationary, they are cointegrated variables, and this cointegration can reveal additional information useful for forecasting. Because differencing for stationarity can distort inputs and distort relationships between variables, we opt for a VECM, which allows for these unique features of our financial data: its nonstationarity but value at levels, its common trends and its equilibrium dynamics (Lütkepohl, 2005).

The drawback of using a VECM is that it has myriad parameters and tests which need to be passed in order to confirm its usage will not produce biased results. The steps and outcomes are outlined below.

### 4.1 Stationarity

Stationarity at first differencing is a primary requirement of the VECM model. To test stationarity, we use the augmented Dickey–Fuller test, which examines the characteristic equation for solutions on the unit circle. The series in our model are all stationary at first differencing with the exception of the office cap rate series. The office cap rate series is stationary both at levels and second differencing but oddly becomes nonstationary at first differencing. However, because this series is stationary at first differencing when a minimal number of initial periods are burned off, and because it is stationary at levels and second differencing, we proceed with the assumption of its validity for a VECM model. And we take pains to re-examine residuals and normality in the final model to ensure this assumption is valid. The two variables Mortgage Debt per GDP and Unemployment are the same for both and are repeated for completeness.
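The mechanics of a unit-root test can be illustrated with a simplified sketch. Note the assumptions: this is the plain (non-augmented) Dickey–Fuller regression with a constant, written from scratch on simulated data, not the augmented test applied to the series in this study. The helper names are hypothetical, and −2.86 is the commonly tabulated asymptotic 5% critical value for the constant-only case.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic on rho in: diff(y)_t = c + rho * y_{t-1} + e_t (no lag augmentation)."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    rho = sxy / sxx
    intercept = mean_dy - rho * mean_x
    residuals = [di - intercept - rho * xi for xi, di in zip(x, dy)]
    sigma2 = sum(e ** 2 for e in residuals) / (n - 2)
    return rho / math.sqrt(sigma2 / sxx)


def simulate_ar1(phi, n, seed):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


# A mean-reverting series (phi = 0.5) versus a random walk (phi = 1.0).
stationary_stat = dickey_fuller_stat(simulate_ar1(0.5, 400, seed=1))
random_walk_stat = dickey_fuller_stat(simulate_ar1(1.0, 400, seed=1))

# Reject a unit root when the statistic falls below the critical value.
rejects_unit_root = stationary_stat < -2.86
```

In practice one would apply the same idea to the series at levels and then to its first difference; an *I*(1) series fails to reject at levels and rejects after differencing.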

### 4.2 Granger causality

We confirm that a VECM is appropriate by evaluating Granger causality and cointegration. We define cointegration as done in *New Introduction to Multiple Time Series Analysis* (Lütkepohl, 2005), which references the work of Granger (1981) and Engle and Granger (1987). That is, “Generally, the variables in a *K*-dimensional process *y*_{t} are called cointegrated of order (*d*, *b*), briefly, *y*_{t} ∼ *CI*(*d*, *b*), if all components of *y*_{t} are *I*(*d*) and there exists a linear combination *z*_{t} ≔ *β*′*y*_{t} with *β* = (*β*_{1}, …, *β*_{K})′ ≠ 0 such that *z*_{t} is *I*(*d* − *b*).”

To test for cointegration, we use the Johansen test which evaluates the presence of unit roots in an estimated cointegrating relationship. Table 3 shows the results of the test and indicates that there are two cointegrating relationships for the multifamily model. Table 4 shows three cointegrating relationships in the office model.

We confirm this cointegration through a Granger test of causality, which also specifies the direction of causality. We reject the null hypothesis that Mortgage debt as a percent of GDP, combined with Unemployment, does not Granger-cause cap rates (Tables 5 and 6). That is, forecasting of the two cap rate series is enhanced by viewing the history of the unemployment and debt variable series.

Thus our series represent a cointegrated process of order CI(1,1); our variables Granger cause our target variable, and a VECM model is well-suited.
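The idea of a CI(1,1) process can be illustrated on simulated data. This is a toy sketch, unrelated to the series used in this study: two hypothetical series share one random-walk trend, and a known vector *β* = (2, −1) removes it, leaving a bounded combination.

```python
import random

rng = random.Random(0)
n = 300

# A shared stochastic trend (random walk) drives both series.
trend = [0.0]
for _ in range(n - 1):
    trend.append(trend[-1] + rng.gauss(0.0, 1.0))

# Each series loads on the trend and adds bounded, stationary noise.
y1 = [t + rng.uniform(-1.0, 1.0) for t in trend]
y2 = [2.0 * t + rng.uniform(-1.0, 1.0) for t in trend]

# beta = (2, -1) cancels the trend: z_t = 2 * y1_t - y2_t.
beta = (2.0, -1.0)
z = [beta[0] * a + beta[1] * b for a, b in zip(y1, y2)]

# z is a combination of the noise terms only, so it never wanders far from zero.
max_deviation = max(abs(v) for v in z)
```

Each series on its own is *I*(1), yet the combination *z*_{t} stays within a fixed band; deviations of *z*_{t} from zero are the temporary dislocations that an error correction model exploits.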

### 4.3 Lag and rank

A VECM is parametrised by its lag order, its cointegration rank and the inclusion or exclusion of deterministic terms.

The lag order is set by evaluating the various measures of fit on a VAR specification and subtracting one, because *p* − 1 lagged differences in a VECM correspond to a VAR of order *p*. We elect a lag of one quarter for both series, opting for the most parsimonious model supported by the measures of fit (AIC, BIC, FPE and HQIC) as seen in Tables 7 and 8.

We maintain the lag of one throughout the in-sample and out-of-sample models we create, having tested each and determined the appropriateness in the same manner as outlined above.

The cointegration rank is defined as *rank*Π where *y*_{t} is *K*-dimensional, Π is a (*K* × *K*) matrix of rank *r*, 0 < *r* < *K*, *α* and *β* are (*K* × *r*) with rank *r*. Rank is usually specified based on the number of cointegrating relationships found. As we found one significant cointegrating relationship in each model, we opt for *rank*Π = 1.
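As a small numerical check of this definition, the sketch below builds Π = *αβ*′ for *K* = 3 and *r* = 1 and verifies that the resulting (3 × 3) matrix has rank one. The coefficient values are hypothetical and chosen only for illustration; the rank check uses the fact that a nonzero matrix has rank one exactly when all of its 2 × 2 minors vanish.

```python
from itertools import combinations


def outer(alpha, beta):
    """Return the K x K matrix alpha * beta' for K-vectors alpha and beta."""
    return [[a * b for b in beta] for a in alpha]


def has_rank_one(m, tol=1e-12):
    """True if m is nonzero and every 2 x 2 minor is (numerically) zero."""
    rows = range(len(m))
    cols = range(len(m[0]))
    nonzero = any(abs(v) > tol for row in m for v in row)
    minors_vanish = all(
        abs(m[i][k] * m[j][l] - m[i][l] * m[j][k]) <= tol
        for i, j in combinations(rows, 2)
        for k, l in combinations(cols, 2)
    )
    return nonzero and minors_vanish


# Hypothetical loading (alpha) and cointegration (beta) vectors, K = 3, r = 1.
alpha = [-0.10, 0.05, 0.02]
beta = [1.0, -0.5, 0.3]

pi = outer(alpha, beta)
pi_is_rank_one = has_rank_one(pi)
```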

Tables 9 and 10 show the details of the first column of the respective *α* matrices (loading coefficients). These are the terms that deal with the equation for the cap rate series. Tables 11 and 12 show the details of the first column of the respective *β* matrices (cointegration relations) for both the office and multifamily cap rates. There are three columns of coefficients (one dealing with each variable with Mortgage debt and Unemployment not shown) which can be taken as columns in the full *α* and *β* matrices, such that *rankαβ* = 3 for both the multifamily and office specifications. This is confirmed by the number of cointegrating relationships tested in the Johansen test above.

### 4.4 Equations

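A VECM with both a deterministic linear trend and intercept can be compactly generalised as

$$\Delta y_t = \nu + \delta t + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + u_t$$

where *ν* is the intercept, *δt* the linear trend, Γ_{i} the short-run coefficient matrices and *u*_{t} the error term. *z*_{t−1} can be calculated as an error correction term such that:

$$z_{t-1} = \beta' y_{t-1}$$

with Π decomposed into the *α* and *β* matrices such that:

$$\Pi = \alpha \beta'$$

A lag of one, a *rank*Π of 1, and no external deterministic terms allow us to rewrite the general form to

$$\Delta y_t = \alpha \beta' y_{t-1} + \Gamma_1 \Delta y_{t-1} + u_t$$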

Thus, we construct a VECM for both series (Multifamily and Office) of rank 1 (to represent the cointegrated relationships between the three input series) and using one lag term (using one-quarter of trailing data to build the model). The coefficients and their standard errors are detailed below (see Tables 13 and 14).
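To show how such a model produces a forecast, the sketch below computes a one-step-ahead point forecast, assuming the standard rank-one, one-lag form Δ*y*_{t} = *αβ*′*y*_{t−1} + Γ_{1}Δ*y*_{t−1} + *u*_{t} with the error term set to zero. The function name and all coefficient values are hypothetical and are not the estimates reported in Tables 13 and 14.

```python
def vecm_one_step(y_prev, dy_prev, alpha, beta, gamma1):
    """Forecast y_t from y_{t-1} and diff(y)_{t-1} for a rank-1, one-lag VECM."""
    # Error correction term: z_{t-1} = beta' y_{t-1}.
    ect = sum(b * y for b, y in zip(beta, y_prev))
    # Short-run dynamics: Gamma_1 * diff(y)_{t-1}.
    short_run = [sum(g * d for g, d in zip(row, dy_prev)) for row in gamma1]
    # Forecast change: alpha * z_{t-1} + short-run term.
    dy = [a * ect + s for a, s in zip(alpha, short_run)]
    return [y + d for y, d in zip(y_prev, dy)]


# Hypothetical inputs chosen so the arithmetic is easy to follow.
y_prev = [2.0, 1.0, 0.0]       # last observed levels of the three series
dy_prev = [0.0, 0.0, 0.0]      # last observed changes
alpha = [-0.5, 0.0, 0.0]       # only the first series adjusts to disequilibrium
beta = [1.0, -1.0, 0.0]        # equilibrium: first series equals second series
gamma1 = [[0.0] * 3 for _ in range(3)]

forecast = vecm_one_step(y_prev, dy_prev, alpha, beta, gamma1)
```

Here the first series sits one unit above its equilibrium with the second, and a loading of −0.5 closes half of that gap in one period.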

Note that the standard errors of the cointegration coefficients *β* are 1 for the variable beta.1 in both the office and multifamily models. This is the standard case when the cointegration relations are modelling the target variable. In this case, beta.1 refers to the office or multifamily cap rate variable, and the cointegration relation is based on its relationship to the other variables (beta.2, beta.3). The 1.0000 becomes a dummy variable of sorts and thus technically has no standard error. Oftentimes the standard error is omitted, though we include it for completeness.

### 4.5 Residuals

To ensure the model is well-fit, we examine its coefficients, and test for autocorrelation in the residuals. Specifically, we test the whiteness of the residuals, using a Portmanteau test, to reject the presence of autocorrelation (see Tables 15 and 16).
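The logic of a portmanteau test can be sketched with the univariate Ljung–Box statistic. This is a simplification under stated assumptions: the test applied to a VECM is multivariate, whereas the hypothetical helper below handles a single residual series, and the alternating input is artificial data chosen to show a clear rejection.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic: n(n+2) * sum_k r_k^2 / (n - k) for k = 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centred = [e - mean for e in residuals]
    denom = sum(c * c for c in centred)
    q = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        r_k = sum(centred[t] * centred[t - k] for t in range(k, n)) / denom
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q


# Strongly autocorrelated artificial residuals: +1, -1, +1, -1, ...
alternating = [1.0, -1.0] * 50
q_stat = ljung_box_q(alternating, max_lag=1)
```

Under the null of no autocorrelation, Q is approximately chi-squared with `max_lag` degrees of freedom (5% critical value of about 3.84 for one lag), so a value near 101 rejects whiteness decisively. Well-behaved model residuals should instead produce a small Q.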

## 5. Results

We use the above-specified model to forecast one through four-quarters ahead to analyse the in-sample explanatory performance of the model. We then construct a new model each period to forecast one through four-quarters ahead to analyse the out-of-sample forecast performance of the model. We compare the results of these out-of-sample forecasts to a baseline model to examine the value of the variable selection versus model selection.

### 5.1 In-sample

The goal of in-sample estimation is to gauge the variance that is explained by the model. Specifically, this is done by forecasting data that the model has already seen. The model is trained on all time periods and then asked to forecast those same time periods to test the strength of the VECM model at explaining the variance in cap rates. This is standard practice as illustrated by the literature review.

To accomplish this, we used the model built on all 109 quarters of data, 1993 through 2020, as specified above. Using this model, we forecasted one period ahead for each of the quarters in the data.

The results produced are quite strong for both office and multifamily series.

The *R*-squared for multifamily cap rate forecasts is 0.96 with a mean absolute error of 35 basis points. The *R*-squared for office cap rate forecasts is 0.92 with a mean absolute error of 52 basis points. Both models exceed the *R*-squared of previous work (Lee *et al.*, 2014) estimating cap rates through the volatility surrounding the GFC (see Figures 1–5).

Other measures of accuracy–mean absolute error, bias, variance and percentage directional accuracy–support the strength of the models as shown in Tables 17 and 18.
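For reference, the accuracy measures can be computed as in the sketch below. The definitions are common textbook ones and are assumptions here, since the exact formulas are not restated in this section; in particular, directional accuracy is taken as the share of periods in which the forecast moves in the same direction from the prior actual value as the realised series. The sample data are hypothetical.

```python
def r_squared(actual, forecast):
    """1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def bias(actual, forecast):
    """Mean signed error (forecast minus actual)."""
    return sum(f - a for a, f in zip(actual, forecast)) / len(actual)


def directional_accuracy(actual, forecast):
    """Share of periods where forecast and actual move the same way from the prior actual."""
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        forecast_move = forecast[t] - actual[t - 1]
        if actual_move * forecast_move > 0:
            hits += 1
    return hits / (len(actual) - 1)


# Hypothetical cap rates in percent.
actual = [5.0, 5.2, 5.1, 5.4]
forecast = [5.1, 5.1, 5.2, 5.3]

r2 = r_squared(actual, forecast)
mae = mean_absolute_error(actual, forecast)
mean_bias = bias(actual, forecast)
hit_rate = directional_accuracy(actual, forecast)
```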

We note the tendency of the model to overreact during periods of volatility, namely the GFC. While the accuracy is still robust overall, we propose that the model accurately represents what cap rates would have been had wide price discovery actually prevailed during those periods of volatility. That is, had the same number of properties transacted during the GFC as before and after, we propose they would have done so at the cap rates forecasted by the model. The periods before and after the GFC saw a large random sample of properties transacting across the quality spectrum, while the GFC itself saw a very non-random subset of properties transact, greatly reducing price discovery. This leads to a seeming overreaction by the model's forecast, but in reality, we posit these forecasted cap rates represent where broad price discovery would have occurred. This proposed phenomenon would have been especially true in the institutional office space, which the series represents (see Tables 19 and 20).

When compared to the measures of fit of the baseline models, the novel methods do underperform slightly. In *R*-squared, the novel methods are 0.005 worse in multifamily and 0.011 worse in office. Thus, if one could know in advance what the dynamics of risk premiums and return expectations (the variables used in the baseline model) would be, one could construct a model of higher fit. However, practically, to take advantage of that higher fit would require predicting those variables, which introduces a further source of potential variance.

Returning to the novel models, we further note the 0.04-point weakness in the *R*-squared of the office model compared to the multifamily model. One possible reason for this is that the multifamily cap rate series is less volatile and less skewed than the office cap rate series. The fact that the office variance during the GFC was more pronounced (the office cap rates retraced to within 150 basis points of their all-time highs while the multifamily series came nowhere close to this) likely contributes to this. However, it is also possible that these periods of intense volatility, which are otherwise surrounded by periods of bounded, milder moves, highlight one of the shortcomings of a VECM model, which has to be constructed *ex post*. The somewhat inconsistent variance in the different periods of the series has led some to pursue models which account for this, namely GARCH.

We find that a change in the unemployment rate from 5% to 4% lowers cap rates by a negligible one to three basis points. Thus, even the 600-bp increase in the unemployment rate during the financial crisis only raised cap rates by 6–18 bps, and the inverse as unemployment fell. This is not really economically significant though it is statistically precise.

More importantly, we find that when mortgage debt grows 100 bps faster (slower) than GDP, cap rates fall (rise) by 22 and 65 basis points for multifamily and office properties respectively. If debt grows by 10%, relative to GDP, cap rates stand to compress by 220–650 bps. This is a dramatic impact. So we clearly find that an increase in mortgage debt as a percent of GDP drives down cap rates, and an increase in unemployment slightly drives up cap rates. And this stands to reason, as these two variables provide insight to the risk side and the demand side of pricing, through unemployment and mortgage debt, respectively.
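The arithmetic behind these figures is a linear scaling of the reported sensitivities, which can be sketched as follows. The function and constant names are hypothetical, and extrapolating linearly to a 10% move is an assumption carried over from the text rather than a property guaranteed by the model.

```python
# Reported sensitivities: basis points of cap rate compression for each
# 100 bps by which mortgage debt growth outpaces GDP growth.
SENSITIVITY_BPS_PER_100 = {"multifamily": 22.0, "office": 65.0}


def cap_rate_change_bps(sector, debt_growth_gap_bps):
    """Linear estimate of the cap rate change (bps); a positive gap compresses cap rates."""
    return -SENSITIVITY_BPS_PER_100[sector] * debt_growth_gap_bps / 100.0


# Debt growing 10% (1,000 bps) faster than GDP.
multifamily_change = cap_rate_change_bps("multifamily", 1000.0)
office_change = cap_rate_change_bps("office", 1000.0)
```

The two results, −220 and −650 bps, reproduce the range quoted above.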

Our model finds that a spike in unemployment is very weakly negative for real estate valuation in the short term, but in the longer-term, the view on rates has not changed, as the flow of funds itself has been stable the past five years, with all real estate mortgage debt at 75% of GDP.

Over the next year, we expect multifamily and office cap rates to decrease slightly. There are, of course, ways that this dynamic can be muted. Two that come to mind are surprise inflation and cloudy price discovery via forbearance. The former may cause an exodus from real estate into higher-yielding asset types, while the latter may unhinge pricing from supply and demand dynamics.

### 5.2 Out-of-sample

The goal of out-of-sample forecasting is to test the relevancy of the model and the inputs on forecasting data that it has not yet seen. We do this by splitting the data into so-called training and testing sets, and we split it at three different periods (45 quarters in, 75 quarters in and 105 quarters in).

We respecify models at each split, following the steps outlined in the Specification section above, altering the lag and cointegration rank, and retesting the Granger (non-)causality.

We construct the models without exposure to the future data. That is, the model created to forecast Q1 2010 has only seen data from Q4 2009 and before. The model created to forecast Q2 2010 has only seen data from Q1 2010 and prior, and so on.
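The no-look-ahead scheme can be sketched as an expanding-window generator. This is an illustrative reading of the procedure, with hypothetical names and zero-based quarter indices; the model fitting itself is omitted.

```python
def expanding_window_splits(n_periods, first_split, horizon):
    """Yield (train_indices, target_index) pairs with no exposure to future data."""
    for end in range(first_split, n_periods - horizon + 1):
        train = list(range(end))     # quarters 0 .. end-1 are visible to the model
        target = end + horizon - 1   # the quarter being forecast
        yield train, target


# One-quarter-ahead forecasts on 109 quarters, starting 45 quarters in.
splits = list(expanding_window_splits(n_periods=109, first_split=45, horizon=1))
first_train, first_target = splits[0]
```

Each target index lies strictly after the last training index, so a model fitted on `train` has never seen the quarter it is asked to forecast.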

Out-of-sample results are presented for forecasting 1-, 2-, 3- and 4-quarters ahead for each of the three-split periods, yielding 12 models for each of the multifamily cap rate series in Figures 6–8.

The same is shown for the office series in Figures 9–11.

And summary metrics for accuracy are presented compactly for the two sectors in Figures 12 and 13.

Over these periods, the multifamily cap rate forecasts range from an *R*-squared of 0.31 (split at 105 quarters, forecasting four quarters ahead) to 0.96 (split at 45 quarters, forecasting one quarter ahead). Office cap rate forecasts range from an *R*-squared of 0.17 (split at 75 quarters, forecasting four quarters ahead) to 0.90 (split at 45 quarters, forecasting one quarter ahead). Mean absolute errors of the two models are consistently low, ranging from 2 to 17 basis points.

### 5.3 Comparison to baseline model

To indicate that the strength of our specification comes from the novel variable and not solely the use of the VECM method, we construct a baseline model for both the multifamily and office series.

The baseline model uses the same categories of variables found significant in An and Deng–variables common to the literature which are largely accepted to be primary determinants of cap rates. Specifically, we use a metric representing return expectations and another representing risk premium. We also use the same cap rate series themselves, as detailed in the Data section.

We input these series in our VECM model, and we respecify the model, in the same method detailed in the Specification section. We then use the model to forecast cap rates 1 through 4-periods ahead, using the same train and test data as we did for the out-of-sample series above.

Out-of-sample results are presented for forecasting 1-, 2-, 3- and 4-quarters ahead for each of the three-split periods, for 12 models for each of the multifamily cap rate series in Figures 14–16. The same is shown for the office series in Figures 17–19.

And summary metrics for accuracy are presented compactly for the two sectors in Figures 12 and 13.

The results achieved for this baseline model specification are in line with those achieved by An and Deng, suggesting that the method and variables established in their work remain relevant some ten years later.

The multifamily forecasts constructed using the funds flow data (funds flow, unemployment, cap rates) outperform the baseline model (risk premium, return expectations, cap rates) in *R*-squared in 11 of 12 cases. But the baseline model seems to achieve stronger directional accuracy and slightly lower MAE in most cases.

The novel method achieves its largest outperformance in estimating three- and four-quarters ahead when trained on more data (split at 75 and 105 quarters in). In forecasting four-quarters forward, split at 105 quarters in, the novel method has an *R*-squared of 0.31 compared to the baseline model's 0.21. It underperforms slightly when forecasting four-quarters when trained on less data (split at 45 quarters in).

Overall, it offers a higher *R*-squared in almost all cases, especially recently, and especially in long-duration forecasts.

Conversely, the office forecasts constructed using the novel variable underperform the baseline model in *R*-squared in 11 out of 12 cases. But the novel method meets or exceeds the baseline model in directional accuracy consistently. Overall, for office, the novel method does not appear to be a better estimator of cap rates than the baseline model.

Interestingly, however, it improves consistently with more data. Split at 45 quarters, the novel model outperforms the baseline in only one metric. Split at 75 quarters, the novel model outperforms the baseline in 3 of 12 metrics. Split at 105 quarters, the novel model outperforms the baseline in 6 of 12 metrics.

We posit that the improvement in the novel method with more data is due to the recent lack of reliability of the traditional metrics (used in the baseline model). Relying on risk premium and expectations in an environment of compressed real yields may not be as telling as looking to the novel variable to gauge demand and opportunity cost.

We also note the novel variable is not an office variable–nor is it explicitly a multifamily variable. It is merely a debt variable, and while we made the assumption that the portion of the debt attributable to office was relatively stable over time, we realise that the components of the numerator could be shifting such that the variable is becoming more useful in forecasting office cap rates than it once was.

So we note the impressive ability to forecast cap rates when two out of three of the input variables are not explicitly concerned with the target series–funds flows and unemployment are not office variables. That the novel model at all outperforms the baseline model when the baseline model uses three office-specific variables (office risk premiums, office return expectations and office cap rates) is telling.

In summary, the multifamily model outperformed categorically and the office model did not underperform categorically. These results again were obtained without any reference to interest rates, return expectations or risk premiums, variables traditionally thought to be integral to accurate cap rate description and forecasting.

### 5.4 Comparison to appraisal based cap rates

Finally, we use our out-of-sample forecast method on the traditional NCREIF appraisal index for all properties, weighted by market value.

Note, the NCREIF index is the one favoured by most studies of cap rates, as it is documented, historical and national. However, appraisal-based indices differ substantially from transaction-based indices. Basing a value index on appraised prices leads to a series that is smoother and staler than a transaction-based one (Haurin, 2005). The lag error component refers to the delay in assessing price changes (i.e. an appraiser will only know to adjust prices down after prices have begun to fall). The smoothness error comes from the same source: values are somewhat chained to their temporal peers. That is, an appraisal of a seemingly similar property will not deviate from its apparent peers. But in reality, its price might and does. Because the appraisal-based series is slower-moving and smoother than a transaction-based one, the former lends itself more to forecasting. The Green Street index, by contrast, is based on where properties are transacting in the marketplace at that point in time. This leads it to be more volatile and more timely. We compare the two cap rate series in Figure 20.

We have demonstrated the efficacy of forecasting transaction-based indices using our VECM and funds flow data, but for the sake of completeness and comparability to the literature, we show the strength of our specification in forecasting the appraisal-based series.

Predictably, the results are quite strong. The *R*-squared is 0.95 (see Figure 21). This would establish a new high benchmark for accuracy in out-of-sample cap rate forecasting, both because of the *R*-squared and the length of forecast.

### 5.5 Analysis of results

The results, while strong, are volatile. This volatility is to be expected, as the transactional cap rate series is far more volatile than the appraisal-based series used in most research, i.e. NCREIF's appraisal index. We have little context to gauge the goodness of fit of these models in a comparative setting, as little has been done in forecasting out-of-sample cap rates, and even less has been done to attempt forecasting transactional cap rates.

Our findings establish the importance of this new-to-the-literature specification of funds flows (mortgage debt as a percent of GDP) in explaining cap rates, especially multifamily cap rates. The variable, in combination with unemployment and the cap rates themselves, is powerful enough to explain cap rates in volatile environments and responsive enough to correct for disequilibria in the relationships. The relationships hold both in-sample and out-of-sample, and the multifamily series outperformed when compared to a model using risk premium and return expectations. The office series was not as consistent in its outperformance of the baseline series, but offered interesting strength in meeting and exceeding the baseline series in directional accuracy and had explanatory power that improved significantly over time.

In relation to other studies, we offer explanatory power from the demand side of the pricing equation. We offer additional validity to a sparsely studied area of cap rate explanation by supplying a variable to address the debt which finds placement in real estate purchases. While two other studies have examined versions of the debt variable, we introduce a specification which proves to be even more robust in explaining the variance in cap rates.

Another timely strength of this study is its demonstration that cap rate series can be forecasted with a high degree of accuracy without using the traditional variables of risk-premium and return-expectations. This is especially timely given the new low-rate environment. As US treasuries approach their lower bound, they will offer little in terms of forecasting value to calculate a risk premium or a discount rate for a return expectation. Given this, finding other relationships to cap rates is pressing. Our VECM specification and funds flows variable offer this alternative.

However, the study was unable to explain cap rates beyond four quarters into the future. Both cap rates and mortgage funds flows seem slow moving, and the inability to generate strong forecasts beyond four quarters proved a limitation of the specification.

It is reasonable to ask, what is the efficacy of a model whose strength is mostly in the one-to-four-quarter ahead forecast? We see two clear cases. One-year cap rate forecasts can aid fund managers who have optionality as to when to exit investments. This result also helps the individual buyers and sellers looking to purchase or sell. Such sellers have optionality as to time and would be served greatly by knowing which direction pricing would head in the next year.

Forecasting more than four quarters ahead would require a view of unemployment levels and mortgage debt levels as a percentage of GDP. Interestingly, both seem to be forecastable. Research done even in the midst of the GFC (Fiorilla *et al.*, 2009) was able to forecast accurately when mortgage debt as a percent of GDP would return to historical levels. Given this may be a more stable variable and a chief input to our model, it may be possible to forecast cap rates further into the future than our specification currently allows.

Another valid question asks: what is the value of a model with only slightly better multifamily performance than that of the baseline model? We posit that the small-but-significant outperformance is impactful when considering the implications for policy and the impact of small moves in cap rates.

For example, multifamily cap rates move on average 11 basis points per quarter. Mortgage debt as a percent of GDP moves on average 8 basis points per quarter. The deltas are small enough that a model with consistently higher accuracy could allow policymakers enough time and confidence to adjust rates. That is, all that would be required to curb an overly fast compression in cap rates of, say, 25 basis points per quarter would be an increase in mortgage rates such that demand went down by 68 basis points. This is feasible with a small rate adjustment, based on DeFusco and Paciorek (2017). According to their work, a 1 percentage point increase in the rate on a 30-year fixed-rate mortgage reduces first mortgage demand by between 2 and 3%. This implies that only a 27-basis-point increase in rates would be needed to effect a reduction in demand sufficient to curb the overly fast compression in rates outlined above. Because these series have such large implications from such small moves, marginal increases in accuracy are integral to implementing effective policy.
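The last step of this calculation can be checked with a short sketch. The function name is hypothetical, and using the 2.5% midpoint of the DeFusco and Paciorek range is an assumption; it appears to be consistent with the 27-basis-point figure quoted above.

```python
def required_rate_increase_bps(demand_reduction_pct, demand_response_pct_per_point):
    """Mortgage rate increase (bps) needed for a target percentage reduction in demand."""
    return demand_reduction_pct / demand_response_pct_per_point * 100.0


target_reduction = 0.68  # a 68-basis-point reduction in mortgage demand, in percent

low = required_rate_increase_bps(target_reduction, 3.0)   # strongest demand response
mid = required_rate_increase_bps(target_reduction, 2.5)   # midpoint (assumption)
high = required_rate_increase_bps(target_reduction, 2.0)  # weakest demand response
```

Across the 2 to 3% range the required increase runs from roughly 23 to 34 basis points, with about 27 at the midpoint.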

It could be argued that the increase in out-of-sample accuracy that our model achieves is not significant enough to overcome the variance created by the sensitivity of mortgage demand to mortgage rates. However, in this application, it is not the accuracy of the model in forecasting cap rates that is important but the model's ability to tie the demand for mortgages to cap rates themselves. While previously there was a known connection between mortgage rates and demand, our work adds the necessary connection between demand and cap rates, giving policymakers the full set of relations needed to guide cap rates, especially in periods of extreme moves.

As monetary infusions spike, rates dive and equity valuations move upwards, there is value in having a model which suggests a single variable of focus. As of 4Q 2020, mortgage debt as a percent of GDP has increased. Granted, this is in large part due to the compression of GDP rather than the expansion of mortgage debt, but the model as stated accounts for this. And while unemployment is certainly wide of normal, the net impact of these factors suggests stable-to-decreasing cap rates for the near term.

## 6. Conclusion

In this paper, we have introduced a more accurate method of explaining and forecasting out-of-sample multifamily cap rates through the use of a novel variable, defined as Mortgage Debt Outstanding divided by GDP, and without the use of the variables thought integral to cap rate forecasting (return expectations and risk premiums). We use the novel variable, in conjunction with unemployment and cap rates themselves, to specify a VECM. Using this model, we are able to robustly forecast transaction (as opposed to appraisal) cap rates, out-of-sample.

We address the challenge of nonstationary time series data without losing the explanatory power available in the levels of the data themselves. To do this, we specify a VECM and establish the cointegration and Granger causality between the variables. We also establish that the variables, while not stationary at levels, are mostly *I*(1) and thus are better suited to an error correction method.
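The integration-order reading can be written as a simple decision rule. The sketch below is not the authors' code: it applies an assumed 5% significance threshold to the augmented Dickey–Fuller p-values reported in the figures, and it treats a series as *I*(d) when d is the smallest number of differences at which the unit-root null is rejected. The function name and the threshold are illustrative assumptions.

```python
from typing import Optional, Sequence


def integration_order(p_values: Sequence[float], alpha: float = 0.05) -> Optional[int]:
    """Smallest differencing order d at which the ADF unit-root null is rejected.

    p_values[d] is the ADF p-value after differencing the series d times
    (d = 0 is the series in levels). Returns None if no order rejects.
    """
    for d, p in enumerate(p_values):
        if p < alpha:
            return d
    return None


# ADF p-values (level, first difference, second difference) from the figures.
adf_p_values = {
    "Mortgage debt per GDP": (1.0000, 0.0000, 0.9718),
    "Multifamily cap rates": (0.3835, 0.0108, 0.0000),
    "Office cap rates": (0.0025, 0.2853, 0.0000),
    "Unemployment": (0.1677, 0.0092, 0.0000),
}

orders = {name: integration_order(p) for name, p in adf_p_values.items()}
for name, d in orders.items():
    print(f"{name}: I({d})")
```

Under this rule three of the four series come out as *I*(1). The office cap rate series, as printed, rejects the unit-root null in levels, which fits the description of the variables as mostly, rather than uniformly, *I*(1).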

Our work contributes to the literature in four ways. First, it statistically substantiates the importance of fund flows (mortgage debt as a percent of GDP), as discussed in proprietary work by Linneman, as a determining factor of cap rates. Second, we offer a new standard for out-of-sample *R*-squared accuracy in one-quarter-forward cap rate forecasts, and we offer a benchmark for four-quarter-forward forecasting, in contrast to the one-quarter-forward forecasting of most existing models. Third, we produce these stronger forecasts without using any of the data thought integral to cap rate forecasts (return expectations and risk premiums), challenging the relative strength of these variables. Finally, we execute this analysis on a novel-to-the-research cap rate set, from Green Street, which more closely mirrors the volatility and true movement of real estate pricing, as opposed to smoother and slower-moving appraisal indices.

We note that new methods of cap rate valuation are necessary at present, given that most existing methods focus on some interest rate component. Real return expectations and risk premiums often use risk-free rates or government bond rates, but with interest rates holding near zero for the foreseeable future, these may become less telling variables. That our model, using the novel variable and no interest rate variables, is able to equal or surpass the robustness of existing estimations makes it a prime alternative.

We suggest that governmental policy could consider the relationship between variables it controls (access to mortgage debt) in an effort to stabilise pricing and create target metrics in the space, as the Fed does with the supply of money and interest rates. All that would be required to curb an overly fast compression (expansion) in cap rates of, say, 25 basis points per quarter would be an increase (decrease) in mortgage rates such that demand went down (up) by 68 basis points. In February of 2021, an increase in 30-year mortgage rates from 2.92 to 2.96% was accompanied by a 5% decrease in purchase demand, showing that the scale of accuracy needed to effectively manage demand is quite high. Thus our model, which offers a small-but-consistent improvement in accuracy in the multifamily space, could prove necessary.

Future work could examine the efficacy of these variables in forecasting cap rate series of other asset types. We suspect single-family rental and manufactured homes would be well fit by the same variables. Even other commercial real estate sectors may be similarly estimated, given the efficacy in forecasting office cap rates. Additional work could be done to predict cap rates further into the future. The challenge therein comes from the lack of data and from the models' limited predictive power at horizons that exceed the lags (for example, forecasting 8 periods forward with lags of 4). Researchers would likely want to aggregate the quarterly or monthly series into annual ones to reduce noise, but these cap rate series have only become robustly priced and actively traded in the relatively recent past.

## Figures

Results of augmented Dickey–Fuller test on multifamily cap rate model data, indicating *I*(1) stationarity

Variable | Differencing | p-value |
---|---|---|
Mortgage debt per GDP | At level | 1.0000 |
Mortgage debt per GDP | First difference | 0.0000 |
Mortgage debt per GDP | Second difference | 0.9718 |
Multifamily cap rates | At level | 0.3835 |
Multifamily cap rates | First difference | 0.0108 |
Multifamily cap rates | Second difference | 0.0000 |
Unemployment | At level | 0.1677 |
Unemployment | First difference | 0.0092 |
Unemployment | Second difference | 0.0000 |

Results of augmented Dickey–Fuller test on office cap rate model data, indicating *I*(1) stationarity

Variable | Differencing | p-value |
---|---|---|
Mortgage debt per GDP | At level | 1.0000 |
Mortgage debt per GDP | First difference | 0.0000 |
Mortgage debt per GDP | Second difference | 0.9718 |
Office cap rates | At level | 0.0025 |
Office cap rates | First difference | 0.2853 |
Office cap rates | Second difference | 0.0000 |
Unemployment | At level | 0.1677 |
Unemployment | First difference | 0.0092 |
Unemployment | Second difference | 0.0000 |

Results of Johansen cointegration test on multifamily data, indicating two cointegrating relationships; test performed with one lagged difference

Trace statistic | Critical value |
---|---|
33.8607 | 29.7961 |
9.5132 | 15.4943 |
1.7102 | 3.8415 |

Results of Johansen cointegration test on office data, indicating three cointegrating relationships; test performed with one lagged difference

Trace statistic | Critical value |
---|---|
42.5665 | 29.7961 |
14.7861 | 15.4943 |
1.4900 | 3.8415 |
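A trace-statistic table of this kind is commonly read sequentially: each row tests the null that the cointegrating rank is at most r, starting from r = 0, and the estimated rank is the number of nulls rejected before the first one that is not. The sketch below applies that rule to the values printed in the two tables above. It assumes the rows are ordered r = 0, r ≤ 1, r ≤ 2 (the tables do not label them) and that a statistic above its critical value means rejection. The function name is hypothetical and this is not the authors' code.

```python
from typing import Sequence


def cointegration_rank(trace_stats: Sequence[float], critical_values: Sequence[float]) -> int:
    """Sequential trace test: count nulls (r = 0, r <= 1, ...) rejected in order.

    Stops at the first null that is not rejected; the number of rejections
    before that point is the estimated cointegrating rank.
    """
    rank = 0
    for stat, crit in zip(trace_stats, critical_values):
        if stat <= crit:       # cannot reject "rank <= current rank": stop here
            break
        rank += 1
    return rank


critical = (29.7961, 15.4943, 3.8415)    # critical values as printed in the tables
multifamily_rank = cointegration_rank((33.8607, 9.5132, 1.7102), critical)
office_rank = cointegration_rank((42.5665, 14.7861, 1.4900), critical)
print(multifamily_rank, office_rank)
```

On the printed values this reading gives a rank of one for both property types. That agrees with the single error-correction term (EC1) in the loading-coefficient tables but not with the counts stated in the captions, so the captions may rest on a different convention or test output.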

Results of Granger (non) causality test on Multifamily data, indicating Mortgage Debt per GDP and unemployment together Granger Cause multifamily cap rates, as does each variable individually

Caused | Causing | Test statistic | Critical value | p-value |
---|---|---|---|---|
Multifamily cap rates | Funds flows, unemployment | 2.3367 | 2.4003 | 0.0554 |
Multifamily cap rates | Funds flows | 3.1263 | 3.0244 | 0.0452 |
Multifamily cap rates | Unemployment | 2.9097 | 3.0244 | 0.056 |

Results of Granger (non) causality test on office data, indicating Mortgage Debt per GDP Granger causes office cap rates. To a lesser significance, Mortgage Debt per GDP combined with Unemployment also Granger causes office cap rates

Caused | Causing | Test statistic | Critical value | p-value |
---|---|---|---|---|
Office cap rates | Funds flows, unemployment | 2.2084 | 2.4003 | 0.0679 |
Office cap rates | Funds flows | 3.9307 | 3.0244 | 0.0206 |
Office cap rates | Unemployment | 1.3121 | 3.0244 | 0.2707 |

VAR order selection of the office model

Lag | AIC | BIC | FPE | HQIC |
---|---|---|---|---|
0 | −38.49 | −38.36 | 1.92e-17 | −38.52 |
1 | −41.65 | −41.13 | 8.66e-19 | −41.76 |
2 | −42.17^{*} | −41.25^{*} | 7.133e-19^{*} | −42.35^{*} |

**Note(s):** ^{*}Highlights the minimums

VAR order selection of the multifamily model

Lag | AIC | BIC | FPE | HQIC |
---|---|---|---|---|
0 | −41.75 | −41.62 | 7.36e-19 | −41.78 |
1 | −45.17 | −44.64^{*} | 2.580e-20^{*} | −45.27 |
2 | −45.18^{*} | −44.27 | 3.50e-20 | −45.37^{*} |

**Note(s):** ^{*}Highlights the minimums
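The starred entries in the two order-selection tables follow from taking, for each criterion, the lag with the smallest value. A minimal sketch of that rule, using the values copied from the tables above, is given here; the function and variable names are hypothetical and the code is illustrative, not the authors' procedure.

```python
from typing import Dict


def best_lag(scores_by_lag: Dict[int, float]) -> int:
    """Return the lag whose information-criterion value is smallest."""
    return min(scores_by_lag, key=lambda lag: scores_by_lag[lag])


# Criterion values by lag (0, 1, 2), copied from the two tables above.
office = {
    "AIC": {0: -38.49, 1: -41.65, 2: -42.17},
    "BIC": {0: -38.36, 1: -41.13, 2: -41.25},
    "FPE": {0: 1.92e-17, 1: 8.66e-19, 2: 7.133e-19},
    "HQIC": {0: -38.52, 1: -41.76, 2: -42.35},
}
multifamily = {
    "AIC": {0: -41.75, 1: -45.17, 2: -45.18},
    "BIC": {0: -41.62, 1: -44.64, 2: -44.27},
    "FPE": {0: 7.36e-19, 1: 2.580e-20, 2: 3.50e-20},
    "HQIC": {0: -41.78, 1: -45.27, 2: -45.37},
}

office_choice = {name: best_lag(scores) for name, scores in office.items()}
multifamily_choice = {name: best_lag(scores) for name, scores in multifamily.items()}
print("office:", office_choice)
print("multifamily:", multifamily_choice)
```

For the office model all four criteria select two lags. For the multifamily model AIC and HQIC select two lags while BIC and FPE select one, so the choice of lag there depends on which criterion is preferred.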

Loading coefficients *α* for the equation multifamily cap rates

Variable | Coefficient | Standard err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
EC1 | 0.0149 | 0.005 | 2.873 | 0.004 | 0.005 | 0.025 |

Loading coefficients *α* for the equation office cap rates

Variable | Coefficient | Standard err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
EC1 | 0.0227 | 0.008 | 2.709 | 0.007 | 0.006 | 0.039 |

Cointegration relations for loading coefficients *β* for the equation multifamily cap rates

Variable | Coefficient | Standard err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
beta.1 | 1.0000 | 0 | 0 | 0.000 | 1.000 | 1.000 |
beta.2 | 4.47e-09 | 1.77e-09 | 2.529 | 0.011 | 1.01e-09 | 7.93e-09 |
beta.3 | −23.9010 | 3.901 | −6.127 | 0.000 | −31.547 | −16.255 |

Cointegration relations for loading coefficients *β* for the equation office cap rates

Variable | Coefficient | Standard err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
beta.1 | 1.0000 | 0 | 0 | 0.000 | 1.000 | 1.000 |
beta.2 | 3.941e-09 | 1.47e-09 | 2.681 | 0.007 | 1.06e-09 | 6.82e-09 |
beta.3 | −21.9545 | 3.219 | −6.820 | 0.000 | −28.264 | −15.645 |

Model coefficient Γ summary for the multifamily cap rate equation

Variable | Coefficient | Standard err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
L1.multifam cap rates | 0.2767 | 0.095 | 2.914 | 0.004 | 0.091 | 0.463 |
L1.ASTMA | −5.377e-09 | 1.77e-09 | −3.039 | 0.002 | −8.84e-09 | −1.91e-09 |
L1.Unemp | −0.3160 | 0.402 | −0.785 | 0.432 | −1.105 | 0.473 |

Model coefficient Γ summary for the office cap rate equation

Variable | Coefficient | Standard err | z | P > \|z\| | [0.025 | 0.975] |
---|---|---|---|---|---|---|
L1.office cap rates | 0.3201 | 0.096 | 3.349 | 0.001 | 0.133 | 0.507 |
L1.ASTMA | −7.673e-09 | 2.78e-09 | −2.764 | 0.006 | −1.31e-08 | −2.23e-09 |
L1.Unemp | −0.3104 | 0.572 | −0.543 | 0.587 | −1.431 | 0.810 |

Results of the Portmanteau test for residual autocorrelation in the multifamily cap rate model

Lag | Test statistic | Critical value | p-value |
---|---|---|---|
1 | 14.98 | | |
2 | 27.8437 | 12.5916 | 0.0001 |
3 | 42.7056 | 24.9958 | 0.0002 |
4 | 57.7534 | 36.415 | 0.0001 |
5 | 65.8739 | 47.3999 | 0.0006 |
6 | 74.3201 | 58.124 | 0.0015 |
7 | 87.8618 | 68.6693 | 0.001 |
8 | 120.1432 | 79.0819 | 0.0 |
9 | 131.7953 | 89.3912 | 0.0 |

Results of the Portmanteau test for residual autocorrelation in the office cap rate model

Lag | Test statistic | Critical value | p-value |
---|---|---|---|
1 | 14.0482 | | |
2 | 19.9893 | 12.5916 | 0.0028 |
3 | 40.67 | 24.9958 | 0.0004 |
4 | 50.923 | 36.415 | 0.0011 |
5 | 57.9986 | 47.3999 | 0.0046 |
6 | 72.0905 | 58.124 | 0.0026 |
7 | 87.6906 | 68.6693 | 0.0011 |
8 | 112.4419 | 79.0819 | 0.0 |
9 | 128.427 | 89.3912 | 0.0 |

Accuracy metrics for in-sample multifamily cap rate forecasts using the novel method

R-squared | MAE | MSE | Bias | Variance | Directional acc. (share) |
---|---|---|---|---|---|
0.9635 | 0.0035 | 0.0 | −0.0172 | 0.0003 | 0.5619 |

Accuracy metrics for in-sample office cap rate forecasts using the novel method

R-squared | MAE | MSE | Bias | Variance | Directional acc. (share) |
---|---|---|---|---|---|
0.9222 | 0.0052 | 0.0 | −0.0175 | 0.0003 | 0.5238 |

Accuracy metrics for in-sample multifamily cap rate forecasts using a baseline method

R-squared | MAE | MSE | Bias | Variance | Directional acc. (share) |
---|---|---|---|---|---|
0.9683 | 0.0032 | 0.0 | −0.0175 | 0.0003 | 0.5714 |

Accuracy metrics for in-sample office cap rate forecasts using a baseline method

R-squared | MAE | MSE | Bias | Variance | Directional acc. (share) |
---|---|---|---|---|---|
0.933 | 0.0047 | 0.0 | −0.0174 | 0.0003 | 0.581 |
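The four accuracy tables above report *R*-squared, MAE, MSE and directional accuracy, among other measures. The sketch below shows one way such metrics can be computed for one-step-ahead forecasts. It is not the authors' code. The directional-accuracy definition (comparing the sign of the forecast move with the sign of the realised move, both measured from the previous actual value) is an assumption, and bias and variance are omitted because their definitions are not given in this section. The input series are hypothetical and are not data from the paper.

```python
from typing import Dict, Sequence


def forecast_accuracy(actual: Sequence[float], forecast: Sequence[float]) -> Dict[str, float]:
    """R-squared, MAE, MSE and directional accuracy for one-step-ahead forecasts.

    forecast[t] is the forecast of actual[t]. Directional accuracy compares the
    sign of the forecast change with the sign of the realised change, both
    measured from the previous actual value (an assumed definition).
    """
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    mean_actual = sum(actual) / n
    sse = sum(e * e for e in errors)                      # sum of squared errors
    sst = sum((a - mean_actual) ** 2 for a in actual)     # total sum of squares

    hits = 0
    for t in range(1, n):
        actual_move = actual[t] - actual[t - 1]
        forecast_move = forecast[t] - actual[t - 1]
        if actual_move * forecast_move > 0:               # same non-zero direction
            hits += 1

    return {
        "r_squared": 1.0 - sse / sst,
        "mae": sum(abs(e) for e in errors) / n,
        "mse": sse / n,
        "directional_accuracy": hits / (n - 1),
    }


# Hypothetical cap rates (as decimals), for illustration only; not data from the paper.
actual = [0.060, 0.058, 0.057, 0.059, 0.056]
forecast = [0.061, 0.059, 0.058, 0.058, 0.057]
metrics = forecast_accuracy(actual, forecast)
print({name: round(value, 6) for name, value in metrics.items()})
```

The same function can be applied to in-sample fitted values or to out-of-sample forecasts; only the inputs change.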

Descriptive statistics for the variables used in the novel model

Stat | Mortgage/GDP | Multifamily cap rate | Office cap rate | Unemployment |
---|---|---|---|---|
Kurtosis | −0.826413 | −1.58258 | −1.507317 | 1.297362 |
Maximum | 1.019769 | 0.0954 | 0.096681 | 0.009801 |
Mean | 0.727435 | 0.072833 | 0.075495 | 0.003694 |
Minimum | 0.536059 | 0.046033 | 0.048301 | 0.001225 |
Skewness | 0.605767 | −0.288187 | −0.408771 | 1.35384 |
Variance | 0.019373 | 0.000291 | 0.000293 | 4e-06 |

Descriptive statistics for the variables used in the baseline model

Stat | Multifamily return exp. | Multifamily risk premium | Multifamily cap rate | Office cap rate | Office return exp. | Office risk premium |
---|---|---|---|---|---|---|
Kurtosis | −1.687408 | −0.338522 | −1.58258 | −1.507317 | −1.389747 | −0.218723 |
Maximum | 0.113 | 0.0487 | 0.0954 | 0.096681 | 0.111 | 0.058381 |
Mean | 0.087766 | 0.023963 | 0.072833 | 0.075495 | 0.08419 | 0.026622 |
Minimum | 0.059 | −0.0113 | 0.046033 | 0.048301 | 0.048 | −0.011754 |
Skewness | −0.008336 | −0.617147 | −0.288187 | −0.408771 | −0.344822 | −0.445488 |
Variance | 0.000345 | 0.00019 | 0.000291 | 0.000293 | 0.000395 | 0.000242 |

## References

Abraham, J. and Hendershott, P. (1994), Bubbles in Metropolitan Housing Markets, National Bureau of Economic Research.

Abu-Mostafa, Y.S. and Atiya, A.F. (1996), “Introduction to financial forecasting”, Applied Intelligence, Vol. 6 No. 3, pp. 205-213.

An, X. and Deng, Y. (2009), A Structural Model for Capitalization Rate, RERI.

Anoruo, E. and Braha, H. (2008), “Housing and stock market returns: an application of GARCH enhanced VECM”, The IUP Journal of Financial Economics, Vol. 32 No. 2, pp. 30-40.

Arsenault, M., Clayton, J. and Peng, L. (2013), “Mortgage fund flows, capital appreciation, and real estate cycles”, The Journal of Real Estate Finance and Economics, Vol. 47 No. 2, pp. 243-265.

Brooks, C. and Tsolacos, S. (2001), “Forecasting real estate returns using financial spreads”, Journal of Property Research, Vol. 18 No. 3, pp. 235-248.

Campbell, S.D., Davis, M.A., Gallin, J. and Martin, R.F. (2009), “What moves housing markets: a variance decomposition of the rent–price ratio”, Journal of Urban Economics, Vol. 66 No. 2, pp. 90-102.

Case, K.E. and Shiller, R.J. (1990), “Forecasting prices and excess returns in the housing market”, Real Estate Economics, Vol. 18 No. 3, pp. 253-273.

Chervachidze, S. and Wheaton, W. (2013), “What determined the great cap rate compression of 2000–2007, and the dramatic reversal during the 2008–2009 financial crisis?”, The Journal of Real Estate Finance and Economics, Vol. 46 No. 2, pp. 208-231.

Chervachidze, S., Costello, J. and Wheaton, W.C. (2009), “The secular and cyclic determinants of capitalization rates: the role of property fundamentals, macroeconomic factors, and ‘structural changes’”, The Journal of Portfolio Management, Vol. 35 No. 5, pp. 50-69.

Christopoulos, A.D., Barratt, J.G. and Ilut, D.C. (2019), Introducing Synthetic Cap Rate Indices for US Commercial Real Estate.

Clayton, J., Ling, D.C. and Naranjo, A. (2009), “Commercial real estate valuation: fundamentals versus investor sentiment”, The Journal of Real Estate Finance and Economics, Vol. 38 No. 1, pp. 5-37.

Clements, S., Tidwell, A. and Jin, C. (2017), “Futures markets and real estate public equity: connectivity of lumber futures and timber REITs”, Journal of Forest Economics, Vol. 28, pp. 70-79.

Dasgupta, V. and Knapp, A.W.N. (2008), “Forecasting office capitalization rates and risk premia in emerging markets”, PhD thesis, Massachusetts Institute of Technology.

De Prado, M.L. (2018), “The 10 reasons most machine learning funds fail”, The Journal of Portfolio Management, Vol. 44 No. 6, pp. 120-133.

DeFusco, A.A. and Paciorek, A. (2017), “The interest rate elasticity of mortgage demand: evidence from bunching at the conforming loan limit”, American Economic Journal: Economic Policy, Vol. 9 No. 1, pp. 210-240.

Dua, P., Miller, S.M. and Smyth, D.J. (1999), “Using leading indicators to forecast US home sales in a Bayesian VAR framework”, The Journal of Real Estate Finance and Economics, Vol. 18 No. 2, pp. 191-205.

Engle, R.F. and Granger, C.W. (1987), “Co-integration and error correction: representation, estimation, and testing”, Econometrica: Journal of the Econometric Society, pp. 251-276.

Fiorilla, P., Hess, R. and Liang, Y. (2009), “Point of view: deleveraging the commercial real estate market”, Journal of Real Estate Portfolio Management, Vol. 15 No. 3, pp. 299-306.

Fratantoni, M. and Schuh, S. (2003), “Monetary policy, housing, and heterogeneous regional markets”, Journal of Money, Credit and Banking, pp. 557-589.

Gau, G.W. (1984), “Weak form tests of the efficiency of real estate investment markets”, The Financial Review, Vol. 19 No. 4, pp. 301-320.

Gau, G.W. (1985), “Public information and abnormal returns in real estate investment”, Real Estate Economics, Vol. 13 No. 1, pp. 15-31.

Geltner, D. and Mei, J. (1995), “The present value model with time-varying discount rates: implications for commercial property valuation and investment decisions”, The Journal of Real Estate Finance and Economics, Vol. 11 No. 2, pp. 119-135.

Ghysels, E., Plazzi, A., Valkanov, R. and Torous, W. (2013), “Forecasting real estate prices”, in Handbook of Economic Forecasting, pp. 509-580.

Gordon, M.J. (1962), The Investment, Financing, and Valuation of the Corporation, RD Irwin Homewood.

Granger, C.W. (1981), “Some properties of time series data and their use in econometric model specification”, Journal of Econometrics, Vol. 16 No. 1, pp. 121-130.

Haurin, D.R. (2005), US Commercial Real Estate Indices: Transaction-Based and Constant-Liquidity Indices.

Hendershott, P.H. and MacGregor, B.D. (2005), “Investor rationality: evidence from UK property capitalization rates”, Real Estate Economics, Vol. 33 No. 2, pp. 299-322.

Henig, S., Tsolacos, S. and Nanda, A. (2019), “Which sentiment indicators matter? An analysis of the European commercial real estate market”, Journal of Real Estate Research.

Hoesli, M. and Oikarinen, E. (2012), “Are REITs real estate? Evidence from international sector level data”, Journal of International Money and Finance, Vol. 31 No. 7, pp. 1823-1850.

Huang, N.E., Wu, M.-L., Qu, W., Long, S.R. and Shen, S.S. (2003), “Applications of Hilbert–Huang transform to non-stationary financial time series analysis”, Applied Stochastic Models in Business and Industry, Vol. 19 No. 3, pp. 245-268.

Kok, N., Koponen, E.-L. and Martínez-Barbosa, C.A. (2017), “Big data in real estate? From manual appraisal to automated valuation”, The Journal of Portfolio Management, Vol. 43 No. 6, pp. 202-211.

Lee, H.S., Corgel, J. and Shin, S. (2014), “Estimating net operating income growth for modeling US apartment property capitalization rates”, Journal of Real Estate Portfolio Management, Vol. 20 No. 1, pp. 67-78.

Lin, T.-Y. (2019), “Predicting house prices with real-estate-related stocks”, Jing Ji Lun Wen Cong Kan, Vol. 47 No. 2, pp. 159-182.

Linneman, P. (1986), “An empirical test of the efficiency of the housing market”, Journal of Urban Economics, Vol. 20 No. 2, pp. 140-154.

Linneman, P. (2015), “What really drives cap rates?”, The Linneman Letter, Vol. 15 No. 2, pp. 35-40.

Liow, K.H. and Yang, H. (2005), “Long-term co-memories and short-run adjustment: securitized real estate and stock markets”, The Journal of Real Estate Finance and Economics, Vol. 31 No. 3, pp. 283-300.

Lütkepohl, H. (2005), New Introduction to Multiple Time Series Analysis, Springer Science and Business Media.

McAllister, P. and Nanda, A. (2016), “Do foreign buyers compress office real estate cap rates?”, Journal of Real Estate Research, Vol. 38 No. 4, pp. 569-594.

McGough, T. and Tsolacos, S. (2001), “Do yields reflect property market fundamentals”, in Real Estate and Finance Investment Research Paper.

Moreira, A.C., Tavares, F.O. and Pereira, E.T. (2016), Rental Income and Cap Rates a Comparison of the Lisbon and Porto Housing Markets.

NAREIT (2019), Estimating the Size of the Commercial Real Estate Market, Nareit Research.

Saad, L. (2020), What Percentage of Americans Owns Stock?

Shorrocks, A., Davies, J. and Lluberas, R. (2018), Global Wealth Report 2018.

The CoStar Group (2020), “Office stock - United States - all quality”, Analytic Export, available at: https://product.costar.com/analyticexport/.

The St. Louis Federal Reserve Bank (2020a), “Gross domestic product”, StLouisFed.org, available at: https://fred.stlouisfed.org/series/GDP.

The St. Louis Federal Reserve Bank (2020b), “Mortgage debt outstanding, all holders”, StLouisFed.org, available at: https://fred.stlouisfed.org/series/MDOAH.

The St. Louis Federal Reserve Bank (2020c), Real Gross Domestic Product.

Tuluca, S.A., Myer, F.N. and Webb, J.R. (2000), “Dynamics of private and public real estate markets”, The Journal of Real Estate Finance and Economics, Vol. 21 No. 3, pp. 279-296.

Zhou, Z.G. (1997), “Forecasting sales and price for existing single-family homes: a VAR model with error correction”, Journal of Real Estate Research, Vol. 14 No. 2, pp. 155-167.

## Acknowledgements

*Funding:* This research is part of Matt Larriva’s role as Director of Research and Data Analytics at the Real Estate Private Equity firm FCP.

*Conflict of interest:* One author works for a Real Estate Private Equity firm which has ownership interest in many office and multifamily assets throughout the US.

*Availability of data and material:* Data available upon request.

*Code availability:* Code available upon request.

*Authors' contributions:* Each of the authors confirms that this manuscript has not been previously published and is not currently under consideration by any other journal.