Improved iterative methods for solving risk parity portfolio

Risk parity, also known as equal risk contribution, has recently gained increasing attention as a portfolio allocation method. However, solving portfolio weights must resort to numerical methods as the analytic solution is not available. This study improves two existing iterative methods: the cyclical coordinate descent (CCD) and Newton methods. We enhance the CCD method by simplifying the formulation using a correlation matrix and imposing an additional rescaling step. We also suggest an improved initial guess inspired by the CCD method for the Newton method. Numerical experiments show that the improved CCD method performs the best and is approximately three times faster than the original CCD method, saving more than 40% of the iterations.


Introduction
Optimal portfolio selection is an important question in academia and the financial industry. While there are several traditional allocation methods, such as mean-variance, minimum variance, and equally weighted (1/N ) portfolios, the risk parity (or equal risk contribution) model has recently gained popularity in the asset management industry. Under the risk parity model, the portfolio weights are selected in such a way as to equalize the contribution of each asset to the portfolio volatility. Like the 1/N portfolio, risk parity aims to diversify and overcome the concentration or sensitivity issues found in the Please cite as: Choi, J., & Chen, R. (2022). Improved iterative methods for solving risk parity portfolio. Journal of Derivatives and Quantitative Studies, 30(2). https://doi.org/10.1108/JDQS-12-2021-0031 mean-variance or minimum variance portfolios. However, risk parity is the 1/N portfolio in terms of risk allocation, rather than capital allocation.
While it is unclear who started using the risk parity strategy in the financial industry, Maillard et al. (2010) and Qian (2011) are widely cited as references that have introduced this strategy to academia.
There is a growing body of literature on various aspects of the risk parity allocation method. For example, see Chaves et al. (2011) and Clarke et al. (2013) for a comparison of risk parity with other asset allocation methods. Kim et al. (2020) study the risk parity model where the covariance is estimated from the XGBoost algorithm. Kim and Kim (2021) studies the risk parity under the covariance estimation error. The actual performance of funds implementing this strategy is controversial. See Corkery et al. (2013) for more details.
Among the academic research topics related to the risk parity model, this study is primarily concerned with numerical methods to solve risk parity portfolio weights. Solving the portfolio weight for a given return covariance matrix is challenging because an analytical solution is not available. One must resort to numerical methods, and several methods are currently available. Maillard et al. (2010) formulate the risk parity problem using a sequential quadratic programming (SQP) method. Chaves et al. (2012) uses the Newton method to solve the multidimensional root, taking advantage of the analytical Jacobian matrix.
Although the original Newton method cannot guarantee the weights to be positive, Spinu (2013) resolve the issue by introducing damped iteration steps. A competing method is the cyclical coordinate descent (CCD) algorithm (Griveau-Billion et al., 2013). Because the CCD method uses quadratic iteration steps, it does not rely on a Jacobian matrix. Bai et al. (2016) propose the alternating linearization methods (ALMs) for solving the risk parity weight in a generalized setting.
According to the literature, the performances of the CCD and Newton methods are comparable. Griveau-Billion et al. (2013) claim that the CCD method outperforms the Newton method when the number of assets is larger than 250. Bouzida (2014) reports that, while the CCD method is faster than Spinu (2013)'s Newton method, it lacks the robustness for a pool of assets.
This study reviews and improves two algorithms for solving the risk parity portfolio allocation: the CCD and Newton methods. We improve the CCD method by simplifying it to a numerically efficient form.
Furthermore, we improve the Newton method by suggesting a new initial guess. Numerical experiments with randomly generated covariance matrices show that our improved CCD method outperforms the other methods in terms of speed and stability for a wide range of portfolio sizes.
The remainder of this paper is organized as follows. Section 2 introduces the risk parity portfolio and its properties. Section 3 presents the improved root-finding method, and Section 4 demonstrates the computational gain of the new methods using numerical experiments. Finally, Section 5 concludes this paper.

Notations and conventions
We define several notations and operations regarding vectors and matrices to be used in the remainder of this study.
• For vector x (in boldface), the i-th element is denoted by x i or (x) j .
• For matrix A (in boldface), the (i, j) element is denoted by A ij or (A) ij .
• Vectors are assumed to be column vectors unless otherwise specified.
• 1 N is the N × 1 column vector filled with 1's. I N is an N × N identity matrix.
• The operations, * and /, between vectors x and y are defined to be the element-wise multiplication and division, respectively: x * y = (x 1 y 1 , · · · , x N y N ) T and y/x = (y 1 /x 1 , · · · , y N /x N ) T • The operation * between a (column) vector x and matrix A is defined as

Condition for risk parity portfolio
Let σ and C be the standard deviation vector and covariance matrix, respectively, of the return on N assets in a unit time period. The covariance C is symmetric and positive semi-definite, and its diagonal elements are related to σ by C ii = σ 2 i . The portfolio of N assets invested with weight w has return volatility: From Euler's homogeneous function theorem, volatility V (w) can be decomposed into the sum of the contributions of each asset as Let b i , such that i b i = 1 and b i > 0, be the relative contributions of the i-th asset to portfolio volatility. Then, we aim to determine the weight w that satisfies The risk parity portfolio is a special case of the problem above, where the risk contributions are equally divided among the assets: Although we deal exclusively with the risk parity case, we use b i to avoid losing generality. Therefore, the risk parity portfolio weight must satisfy the condition Here, we impose w i ≥ 0 because we are concerned with a long-only portfolio. We refer to Bai et al. (2016) for the risk parity with an unconstrained portfolio.
The solution is not unique because of the homogeneous property of the condition. If w is a solution, then µw for µ > 0 is also a solution. The degrees of freedom can be fixed by setting k w k = 1. The normalized weightw is obtained byw

Risk parity condition with correlation matrix
The condition for the risk parity portfolio can be equivalently stated in terms of the correlation matrix (Spinu, 2013). Let R be the correlation matrix whose element is computed from the covariance matrix C: Then, the risk parity condition with the correlation matrix is given by If w is the solution to the correlation condition in Eq. (3), then w/σ is the solution to the covariance condition in Eq. (1).
Moreover, Spinu (2013) removes the degree of freedom in w by taking advantage of the fact that w T Rw is scalar, although the value is unknown. Without loss of generality, by setting we arrive at the simple risk parity condition: Once we find the unique root w in Eq. (5), we can obtain the normalized portfolio weightw bȳ Note that Eq. (4) is consistent with Eq. (5) because it can be obtained by summing Eq. (5) for all i, We will use both Eqs. (4) and (5) in Section 3.1 to improve the original CCD method.

A special case solution and initial guess for iterative methods
A general solution to the risk parity portfolio is not analytically available. However, an analytical solution exists for a special case in the correlation matrix. When the correlation matrix has the same row sum, j R ij = r for all i and some constant r, Note that a special case is typically achieved when the assets have a uniform correlation, that is, R ij = ρ for i = j and some ρ. This is also the case for N = 2; thus, w = 1 N is the general solution for allocating the two assets. In terms of the simple condition, Eq. (5), w i = 1/ √ N r is the corresponding solution for the special case. The actual portfolio weights satisfying the original condition in Eq.
(1) are given by weights that are inversely proportional to the standard deviation:w Even in general cases where the row sums of R are different, the special case solution serves as a first-order approximation. Chaves et al. (2012) and Griveau-Billion et al. (2013) use Eq. (7) as the initial guess for the iterative methods. Spinu (2013, Theorem 3.4) further improves the initial guess for the simple condition, w i = 1/ √ N r, to

Methods for solving risk parity
This section reviews and improves the CCD and Newton methods. As stated in Section 1, there exist other methods, such as the SQP algorithm (Maillard et al., 2010) and ALM (Bai et al., 2016). Although such methods may handle risk parity in a more generalized setting, they have been reported to be slower than the CCD and Newton methods in handling the standard long-only risk parity model. Therefore, we focus on these two methods.

The improved CCD method
The original CCD method (Griveau-Billion et al., 2013) aims to solve Note that this condition is different from the covariance condition in Eq.
(1) as V 2 (w) is replaced with V (w). Although the intention is not explicitly stated in the reference, the purpose of the replacement seems to break the homogeneous property of w; if w is a root, µw for µ > 0 is no longer a solution except for µ = 1. Therefore, CCD iterations eventually lead to a unique root.
The equation for the i-th component can be written in the quadratic form of w i : and we use the root formula as an iteration step for w i (Griveau-Billion et al., 2013, Eq. (4)) We select the positive one among the two roots of the quadratic equation to ensure that w i ≥ 0. In the CCD method, one iteration involves cyclically updating w i for i = 1, · · · , N . Updating w i , therefore, uses w j for 1 ≤ j < i, which was previously updated in the same iteration step. Therefore, the CCD method is known to be more effective than batch coordinate descent. Griveau-Billion et al. (2013) starts the iterations with the initial guess in Eq. (7).
We improve the original CCD method in two ways. First, we formulate the CCD with the simple condition in Eq. (5). The equation for the i-th component in Eq. (5) can be written in the simpler quadratic form of w i : The corresponding iteration step is simplified to This new iteration has two advantages because V (w) disappears in the new CCD method. One obvious advantage is the reduced computation time for V (w). The calculation can be time consuming in the original CCD method because V (w) requires O(N 2 ) operations and must be updated when w i is updated.
The other advantage is not obvious but is more important. In the original CCD, the new w i on the left-hand side depends on the old w i through V (w) on the right-hand side of Eq. (10). The updated w i is not the true root of Eq. (10). However, in the new CCD method, the new w i is the exact root of Eq. (11) because the old w i does not appear on the right-hand side. Therefore, we expect the convergence to be faster in the improved CCD iteration.
Second, we rescale w by at the end of each iteration to ensure Eq. (4). This rescaling step is expected to make the convergence faster by adjusting w on average. The new CCD method uses a generalized initial guess, Eq. (8), from Spinu (2013). In fact, this can be understood as the result of rescaling from an equal weight w = 1 N .
Finally, we summarize the improved CCD algorithm in Algorithm 1.

Algorithm 1 The improved CCD algorithm
Input: covariance matrix C and error tolerance ε Calculate R and σ Chaves et al. (2012) and Spinu (2013) use the multidimensional Newton method to find the risk parity portfolio. From Eq. (5), the objective function is set as

Newton method with an improved initial guess
Then, the root of F (w) = 0 is the risk parity weight. The Jacobian of F (w) is readily available as follows: where δ ij is the Kronecker delta. Therefore, the iteration under the Newton method is given by However, unlike the CCD method, the Newton method iteration cannot guarantee that the converged weights are positive. Spinu (2013) overcomes the problem with the damped Newton method, where η ≤ 1 is a function of w and ∆w. Although we do not discuss the exact procedure, the basic idea is to use η < 1 in the early stage when w is away from the solution to ensure w i > 0 and η = 1 later when w is close enough to the solution for faster convergence. Spinu (2013) use Eq. (8) for the initial estimation of the Newton method.
Our enhancement of the Newton method is based on the initial estimation. Using an initial guess closer to the solution, we aim to use η = 1 throughout the iteration, without converging to a negative weight. If this can be achieved, one can use generic Newton method routines available in many numerical analysis packages that are highly optimized for the system.
We improve the original initial guess, Eq. (8), by updating it through the one-step CCD iteration, Eq. (11). However, instead of a slow cyclical update, we use the batch update, where the old w i values are used on the right-hand side: This new initial guess is more efficiently computed in the vectorized form, The numerical experiments in the next section demonstrate that our new initial guess is effective such that the method converges to a positive weight for almost all randomly generated test cases.

Numerical experiment
We tested the numerical performance of the improved algorithms. 2 For comparison, we implemented the following three algorithms: • The original CCD method, Eq (10) • The improved CCD method, Algorithm 1 • The Newton method, Eq (13), with the improved initial guess, Eq (14) 2 The tests are performed in Python on a computer running the Windows 10 operating system with an Intel Core i5-6500 (3.2 GHz) CPU.
All methods were implemented in Python. For the Newton method, we used the generic root solver scipy.optimize.root function 3 in the Python SciPy package. For all methods, we consistently use the error tolerance ε = 10 −6 .
In the test, we solved the risk parity portfolio for the correlation matrices randomly generated with the scipy.stats.random correlation class 4 using the Python SciPy package. The routine takes nonnegative eigenvalues as inputs. Subsequently, it uses the algorithm of Davies and Higham (2000) to generate a random correlation matrix. We used two methods to generate eigenvalues to test both positive definite and positive semi-definite correlation matrices: • Test 1: all eigenvalues sampled from independent uniform random variables between 0 and 1.
• Test 2: 80% of eigenvalues sampled from independent uniform random variables between 0 and 1, and 20% set to zero.
While prior studies (Griveau-Billion et al., 2013;Bai et al., 2016) typically test only positive definite case (i.e., Test 1), we think that the positive semi-definite case (i.e., Test 2) is also important because it is often encountered in practice when the covariance is estimated from a time series. When the covariance between N assets is estimated from M time periods with M < N , the estimated covariance matrix has at most M strictly positive eigenvalues. Although the Newton method is marginally faster than the improved CCD method for N ≤ 150, the required computation time is very short anyway for those small N .
Second, our improved CCD method is three to four times faster than the original CCD, saving approximately 6.5 iterations on average. Although not reported in the figure, we also ran the improved CCD method without the rescaling step in Eq. (12) to measure the gain from rescaling. The rescaling step saves about 1.5 iterations on average. Third, the Newton method successfully converges to positive weights in all cases. This confirms that the damped Newton method is unnecessary with an improved initial guess, Eq. (14). Nevertheless, the Newton method is inferior to the improved CCD method. The log-log plot clearly shows that the computation time scales as O(N 3 ) for the Newton method, but scales as O(N ) for both CCD methods.
The O(N 3 ) scale of the Newton method appears to be related to the computation of the Jacobian inversion, [∇F (w)] −1 . In the optimized multidimensional Newton method, inversion is not computed in every iteration. In our test, inversion is typically computed only once, that is, at the initial guess.
However, Jacobian inversion still slows the Newton method for a large N . Number of iterations Improved CCD Orignal CCD Newton Figure 2 shows the results for Test 2. While the relative performance between the three methods is similar, the overall computation becomes slower than Test 1, requiring more iterations. This indicates the difficulty of solving the risk parity portfolio against the positive semi-definite covariance matrices.
Moreover, the Newton method shows instability. It fails in convergence for two cases and converges to negative weights for 13 cases. Conversely, the CCD methods stably converge to positive weights for all cases.
We believe from the numerical tests that our improved CCD method is a fast and stable method for solving risk parity weights.

Conclusion
With the growing popularity of the risk parity model, several numerical methods have been proposed to solve portfolio allocation. We present improvements to two existing methods based on iterations: the CCD and Newton methods. Numerical experiments show that the improved CCD method performs the best in terms of speed and stability.