Learning to Hedge Swaptions
Zaniar Ahmadi, Frédéric Godin
TL;DR
This work develops a deep hedging framework for swaptions using reinforcement learning under a three-factor arbitrage-free Nelson–Siegel yield-curve model. It compares RL policies trained with mean-squared error, downside risk, and CVaR objectives against rho-hedging benchmarks, finding that two swaps used as hedging instruments are typically sufficient for near-optimal hedging and that RL approaches remain resilient to model misspecification. The study introduces a practical, constrained hedging architecture, leverages precomputed swaption pricing via a Kolmogorov–Arnold network, and analyzes factor exposures and feature importance with SHAP. Collectively, the results demonstrate that RL-based dynamic hedging can deliver more efficient, tail-sensitive, and cost-aware hedging strategies for swaptions in realistic market settings.
Abstract
This paper investigates the deep hedging framework, based on reinforcement learning (RL), for the dynamic hedging of swaptions, contrasting its performance with traditional sensitivity-based rho-hedging. We design agents under three distinct objective functions (mean squared error, downside risk, and Conditional Value-at-Risk) to capture alternative risk preferences and evaluate how these objectives shape hedging styles. Relying on a three-factor arbitrage-free dynamic Nelson-Siegel model for our simulation experiments, our findings show that near-optimal hedging effectiveness is achieved when using two swaps as hedging instruments. Deep hedging strategies dynamically adapt the hedging portfolio's exposure to risk factors across states of the market. In our experiments, their outperformance over rho-hedging strategies persists even in the presence of some model misspecification. These results highlight RL's potential to deliver more efficient and resilient swaption hedging strategies.
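To make the three objectives concrete, the following minimal sketch computes empirical versions of each on a sample of terminal hedging errors. It is an illustration under stated assumptions, not the authors' implementation: the sign convention (a positive error is a loss for the hedger), the squared-loss form of downside risk, the 95% CVaR level, and the function names are all assumptions introduced here.

```python
import numpy as np


def mse(errors):
    """Mean squared hedging error: penalizes gains and losses symmetrically."""
    errors = np.asarray(errors, dtype=float)
    return float(np.mean(errors ** 2))


def downside_risk(errors):
    """Mean squared loss: only positive errors (assumed to be losses) count."""
    errors = np.asarray(errors, dtype=float)
    losses = np.maximum(errors, 0.0)
    return float(np.mean(losses ** 2))


def cvar(errors, alpha=0.95):
    """Empirical CVaR: average error at or beyond the alpha-quantile (VaR)."""
    errors = np.asarray(errors, dtype=float)
    var = np.quantile(errors, alpha)
    tail = errors[errors >= var]
    return float(tail.mean())


if __name__ == "__main__":
    # Hypothetical hedging errors drawn from a standard normal, for illustration only.
    rng = np.random.default_rng(0)
    sample = rng.normal(loc=0.0, scale=1.0, size=10_000)
    print("MSE:          ", mse(sample))
    print("Downside risk:", downside_risk(sample))
    print("CVaR (95%):   ", cvar(sample, alpha=0.95))
```

The mean squared error treats gains and losses alike, whereas the downside-risk and CVaR measures respond only to losses, with CVaR concentrating on the worst outcomes. This difference is one way to read the abstract's remark that the choice of objective shapes the hedging style.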
