Intertemporal Hedging Demand under Epstein-Zin Preferences in a Multi-Asset Long-Run Risk Model: Evidence from Projected Pontryagin-Guided Deep Policy Optimization

Wonchan Cho

TL;DR

The paper tackles intertemporal hedging in a high-dimensional, continuous-time portfolio problem with Epstein--Zin (EZ) recursive preferences and a persistent long-run risk (LRR) factor. It develops a projected Pontryagin-guided deep policy optimization (P-PGDPO) method that represents the value and costate processes with neural networks and updates the policy along the Hamiltonian gradient while enforcing feasibility via explicit projections. In numerical experiments, the learned EZ policy exhibits strong state-dependent hedging and concentrates exposure in assets tied to the LRR factor, while the wealth-floor constraint substantially dampens hedging near the boundary; CRRA serves as a stable diagnostic benchmark. Overall, the results suggest that a transparent deep learning framework based on the Pontryagin maximum principle (PMP) can yield economically interpretable hedging patterns in multi-asset long-run risk settings, with wealth constraints shaping the feasible hedging space.

Abstract

I study intertemporal hedging demand in a continuous-time multi-asset long-run risk (LRR) model under Epstein--Zin (EZ) recursive preferences. The investor trades a risk-free asset and several risky assets whose drifts and volatilities depend on an Ornstein--Uhlenbeck type LRR factor. Preferences are described by EZ utility with risk aversion $R$, elasticity of intertemporal substitution $\psi$, and discount rate $\delta$, so that the standard time-additive CRRA case appears as a limiting benchmark. To handle the high-dimensional consumption--investment problem, I use a projected Pontryagin-guided deep policy optimization (P-PGDPO) scheme adapted to EZ preferences. The method starts from the continuous-time Hamiltonian implied by the Pontryagin maximum principle, represents the value and costate processes with neural networks, and updates the policy along the Hamiltonian gradient. Portfolio constraints and a lower bound on wealth are enforced by explicit projection operators rather than by adding ad hoc penalties. Three main findings emerge from numerical experiments in a five-asset LRR economy: (1) the P-PGDPO algorithm achieves stable convergence across multiple random seeds, validating its reliability for solving high-dimensional EZ problems; (2) wealth floors materially reduce hedging demand by limiting the investor's ability to exploit intertemporal risk-return tradeoffs; and (3) the learned hedging portfolios concentrate exposure in assets with high correlation to the LRR factor, confirming that EZ agents actively hedge long-run uncertainty rather than merely following myopic rules. Because EZ preferences nest time-additive CRRA in the limit $\psi \to 1/R$, I use CRRA as an explicit diagnostic benchmark and, when needed, a warm start to stabilize training in high dimensions.
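
The projection idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the constraint sets, so the box constraint on portfolio weights, the scalar wealth floor, the step size, and all function names below are hypothetical choices made only for illustration; this is not the paper's implementation.

```python
import numpy as np

def project_box(pi, lo, hi):
    """Euclidean projection of portfolio weights onto the box [lo, hi]^n.

    Assumption: a simple box constraint stands in for the paper's
    (unspecified) portfolio constraint set.
    """
    return np.clip(pi, lo, hi)

def project_wealth(w, floor):
    """Projection of wealth onto the half-line [floor, inf)."""
    return np.maximum(w, floor)

def projected_ascent_step(pi, grad_h, lr, lo, hi):
    """One gradient-ascent step on the Hamiltonian in pi, then projection.

    grad_h is the gradient of the Hamiltonian with respect to pi; here it
    is supplied by the caller rather than computed from a model.
    """
    return project_box(pi + lr * grad_h, lo, hi)

# Toy numbers (hypothetical): three assets, weights restricted to [0, 1.5].
pi = np.array([0.8, -0.3, 1.6])
grad = np.array([1.0, -2.0, 0.5])
new_pi = projected_ascent_step(pi, grad, lr=0.1, lo=0.0, hi=1.5)
floored = project_wealth(np.array([0.2, 1.0]), floor=0.5)
```

Because the update is projected rather than penalized, every iterate is feasible by construction, which is the property the abstract contrasts with ad hoc penalty terms.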


Paper Structure

This paper contains 64 sections, 1 theorem, 76 equations, 8 figures, 6 tables, 1 algorithm.

Key Result

Proposition A.1

Let $R>0$, $R\neq 1$ be fixed and let $\psi>0$, $\psi\neq 1$. Set $S = 1/\psi$ and $\theta = (1-R)/(1-S)$, and consider the Epstein--Zin aggregator. Suppose $v$ has the same sign as $1-R$ and define the CRRA utility $u(c) = c^{1-R}/(1-R)$. Then, for fixed $(c,v)$ in the admissible domain,
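
For orientation, the displayed formulas of the proposition are not reproduced on this page. One common normalization of the continuous-time (Duffie--Epstein) aggregator, written here with the assumed symbol $f(c,v)$ and possibly differing from the paper's own form by scaling, is

$$
f(c,v) = \delta\,\theta\,v\left[\left(\frac{c}{\bigl((1-R)\,v\bigr)^{1/(1-R)}}\right)^{1-S} - 1\right].
$$

Under this assumed form, the sign condition on $v$ guarantees $(1-R)\,v > 0$, so the fractional power is well defined. Taking $\psi \to 1/R$ gives $S \to R$ and $\theta \to 1$, and the bracket simplifies because the exponent $1-S$ becomes $1-R$:

$$
\lim_{\psi \to 1/R} f(c,v) = \delta\,v\left[\frac{c^{1-R}}{(1-R)\,v} - 1\right] = \delta\,\bigl(u(c) - v\bigr),
$$

which is the time-additive CRRA aggregator with discount rate $\delta$, consistent with the abstract's statement that EZ preferences nest CRRA in this limit.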

Figures (8)

  • Figure 1: Mean wealth over time under the PG--DPO full EZ policy and the analytic myopic benchmark. Both policies start from the same initial wealth. The EZ policy tracks the myopic benchmark closely while respecting the wealth constraint and embedding intertemporal hedging motives.
  • Figure 2: Portfolio and intertemporal hedging surfaces for Asset 1. The left panel shows the full PG--DPO EZ portfolio weight as a function of wealth $W$ and the long--run risk state $Y$. The middle panel shows the analytic myopic portfolio. The right panel plots the implied hedging component $\pi_{\text{hedge}} = \pi_{\text{EZ}} - \pi_{\text{myo}}$, which varies nonlinearly with both $W$ and $Y$.
  • Figure 3: Average absolute intertemporal hedging demand by asset, sorted by magnitude. Assets with stronger exposure to the long--run risk factor (higher $\beta_{\mathrm{LRR}}$ and correlation with $Y$) tend to carry a larger share of the hedging portfolio.
  • Figure 4: Cross-sectional relationship between average absolute hedging demand and asset characteristics. The four panels plot hedging demand against the Sharpe ratio, correlation with the long--run risk factor $\rho(R,Y)$, long--run risk beta $\beta_{\mathrm{LRR}}$, and volatility $\sigma$. Hedging demand is stronger for assets with higher long--run risk exposure and weaker for high--Sharpe, high--volatility assets with limited LRR exposure.
  • Figure 5: Portfolio, myopic benchmark, and hedging surfaces for Asset 2, analogous to Figure 2.
  • ...and 3 more figures
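
The decomposition $\pi_{\text{hedge}} = \pi_{\text{EZ}} - \pi_{\text{myo}}$ from Figure 2 and the per-asset average used in Figure 3 can be sketched as follows. The captions do not give the myopic formula, so the Merton-style rule $\pi_{\text{myo}} = \frac{1}{R}\Sigma^{-1}(\mu - r)$, the constant toy parameters, and the function names are assumptions for illustration only; in the paper the drifts depend on the state $Y$, so the benchmark would be evaluated state by state.

```python
import numpy as np

def myopic_weights(mu, r, cov, risk_aversion):
    """Merton-style myopic weights (1/R) * cov^{-1} (mu - r).

    Assumption: this textbook rule stands in for the paper's
    'analytic myopic benchmark', whose formula is not shown here.
    """
    excess = np.asarray(mu, dtype=float) - r
    return np.linalg.solve(np.asarray(cov, dtype=float), excess) / risk_aversion

def hedging_demand(pi_full, pi_myopic):
    """Hedging component: full policy minus myopic benchmark."""
    return np.asarray(pi_full, dtype=float) - np.asarray(pi_myopic, dtype=float)

def mean_abs_hedging(pi_hedge_grid):
    """Average |hedging demand| per asset over all leading (state-grid) axes."""
    grid = np.asarray(pi_hedge_grid, dtype=float)
    return np.abs(grid).reshape(-1, grid.shape[-1]).mean(axis=0)

# Toy two-asset example with hypothetical numbers.
mu = np.array([0.06, 0.08])
r = 0.02
cov = np.array([[0.04, 0.0], [0.0, 0.09]])
pi_myo = myopic_weights(mu, r, cov, risk_aversion=4.0)

# Hypothetical full-policy weights at two grid points (rows) for two assets.
pi_full = np.array([[0.30, 0.10], [0.20, 0.15]])
pi_hedge = hedging_demand(pi_full, pi_myo)   # broadcasts pi_myo over rows
avg_abs = mean_abs_hedging(pi_hedge)
```

Averaging absolute values rather than signed values keeps positive and negative hedging positions at different states from cancelling, which matches the "average absolute" wording of the Figure 3 caption.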

Theorems & Definitions (2)

  • Proposition A.1
  • proof : Proof (sketch)