Hedging Beyond the Mean: A Distributional Reinforcement Learning Perspective for Hedging Portfolios with Structured Products

Anil Sharma; Freeman Chen; Jaesun Noh; Julio DeJesus; Mario Schlener

TL;DR

This work tackles hedging of structured products, specifically Autocallable notes, with a distributional reinforcement learning framework. By modeling the full distribution of hedging returns using Distributed Distributional DDPG (D4PG) with Quantile Regression, the authors learn a hedging policy that adapts to long horizons, barriers, and coupon payments. They compare against traditional Delta-neutral and Delta-Gamma hedges, showing that the distributional approach reduces tail risk metrics ($5\%$ $VaR$, $95\%$ $VaR$, and $CVaR$) and yields a more symmetric, positively skewed $PnL$ distribution, especially when the objective is tailored to capture tail risk using a mix of the $5\%$ and $95\%$ $VaR$. The study demonstrates the potential of distributional RL to improve risk management for structured financial products under transaction costs and complex payoff structures.
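To make the Quantile Regression idea concrete, the following is a minimal sketch of the standard quantile (pinball) loss, whose minimizer is the quantile at level `tau`. It is not the authors' D4PG implementation: the function name `pinball_loss`, the simulated normal $PnL$ samples, and the grid search are all assumptions chosen for illustration.

```python
import numpy as np

def pinball_loss(q, samples, tau):
    """Mean quantile-regression (pinball) loss of the estimate q at level tau."""
    u = np.asarray(samples, dtype=float) - q
    # Shortfalls (u < 0) are weighted by (1 - tau), excesses by tau.
    return float(np.mean(u * (tau - (u < 0.0).astype(float))))

# Hypothetical PnL samples; a real study would use simulated hedging PnL.
rng = np.random.default_rng(0)
pnl = rng.normal(loc=0.0, scale=1.0, size=10_000)

tau = 0.05
grid = np.linspace(-4.0, 4.0, 801)  # candidate quantile estimates
losses = [pinball_loss(q, pnl, tau) for q in grid]
best_q = float(grid[int(np.argmin(losses))])

# The loss minimizer should sit close to the empirical 5% quantile.
empirical_q = float(np.quantile(pnl, tau))
print(best_q, empirical_q)
```

In a quantile-regression critic, one such loss term would typically be used per quantile level, so the network output approximates the whole return distribution rather than only its mean.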

Abstract

Research in quantitative finance has shown that reinforcement learning (RL) methods deliver promising results in hedging financial portfolios. For example, hedging a portfolio of European options using RL achieves a better $PnL$ distribution than traditional hedging strategies such as Delta-neutral and Delta-Gamma neutral hedging [Cao et al. 2020]. While the hedging of vanilla options has received considerable attention, very little has been said about hedging a portfolio of structured products such as Autocallable notes. Hedging structured products is much more complex, and traditional RL approaches tend to fail in this context because of the underlying complexity of these products: they involve several barriers and coupon payments, and they have longer maturities (from $7$ years to a decade). To address this, we propose a distributional RL based method to hedge a portfolio containing an Autocallable structured note. We demonstrate our RL hedging strategy using American and Digital options as hedging instruments. Through several empirical analyses, we show that distributional RL provides a better $PnL$ distribution than traditional approaches and learns a better policy with lower value-at-risk ($VaR$) and conditional value-at-risk ($CVaR$), showcasing the potential for enhanced risk management.
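As an illustration of the risk metrics named in the abstract, the sketch below estimates $VaR$ and $CVaR$ from a sample of $PnL$ outcomes. It assumes one common convention, which may differ from the paper's: $VaR$ at level `alpha` is the `alpha`-quantile of the $PnL$ distribution, and $CVaR$ is the mean $PnL$ at or below that quantile. The function name `var_cvar` and the simulated data are hypothetical.

```python
import numpy as np

def var_cvar(pnl, alpha=0.05):
    """Return (VaR, CVaR) of a PnL sample at level alpha.

    Assumed convention: VaR is the alpha-quantile of PnL, and CVaR is the
    average PnL over outcomes at or below that quantile.
    """
    pnl = np.asarray(pnl, dtype=float)
    var = float(np.quantile(pnl, alpha))
    tail = pnl[pnl <= var]  # non-empty, since the quantile is >= min(pnl)
    cvar = float(tail.mean())
    return var, cvar

# Hypothetical PnL sample standing in for simulated hedging outcomes.
rng = np.random.default_rng(0)
pnl = rng.normal(loc=0.0, scale=1.0, size=100_000)

var_5, cvar_5 = var_cvar(pnl, alpha=0.05)   # lower tail of the distribution
var_95, _ = var_cvar(pnl, alpha=0.95)       # upper quantile
print(var_5, cvar_5, var_95)
```

Under this convention, a less negative $5\%$ $VaR$ and $CVaR$ indicate a thinner loss tail, which is the sense in which one hedging policy can be compared with another.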

Paper Structure (13 sections, 3 equations, 8 figures, 3 tables)

Introduction
Related Works
Problem Formulation
The Asset and Option Pricing
Problem Formulation as an MDP
Proposed Method
Classical Reinforcement Learning
Distributional Reinforcement Learning
Distributed Distributional DDPG (D4PG)
Experiments and Results
Experimental Setup
RL Agent for Hedging
Conclusion

Figures (8)

Figure 1: The $PnL$ distribution when no hedging is performed.
Figure 2: A simulated stock path and the corresponding Autocallable note price, with the note called early after $2.5$ years.
Figure 3: The value, $Delta$, and $Gamma$ of the Autocallable note at different underlying index prices 60 days, 5 days, and 1 day before the first call date.
Figure 4: Model architecture of Distributed Distributional DDPG (D4PG) with Quantile Regression (QR).
Figure 5: Approaches to estimate distributions with DQN.
...and 3 more figures
