Learning to Hedge Swaptions

Zaniar Ahmadi, Frédéric Godin

TL;DR

This work develops a deep hedging framework for swaption hedging using reinforcement learning under a three-factor arbitrage-free Nelson–Siegel yield-curve model. It compares RL policies trained with mean-squared error, downside risk, and CVaR objectives against rho-hedging benchmarks, finding that two swaps used as hedging instruments are typically sufficient for near-optimal hedging and that RL approaches provide resilience to model misspecification. The study introduces a practical, constrained hedging architecture, leverages precomputed swaption pricing via a Kolmogorov–Arnold network, and analyzes factor exposures and feature importance with SHAP. Collectively, the results demonstrate that RL-based dynamic hedging can deliver more efficient, tail-sensitive, and cost-aware hedging strategies for swaptions in realistic market settings.

Abstract

This paper investigates the deep hedging framework, based on reinforcement learning (RL), for the dynamic hedging of swaptions, contrasting its performance with traditional sensitivity-based rho-hedging. We design agents under three distinct objective functions (mean squared error, downside risk, and Conditional Value-at-Risk) to capture alternative risk preferences and evaluate how these objectives shape hedging styles. In simulation experiments based on a three-factor arbitrage-free dynamic Nelson-Siegel model, we find that near-optimal hedging effectiveness is achieved when using two swaps as hedging instruments. Deep hedging strategies dynamically adapt the hedging portfolio's exposure to risk factors across states of the market. In our experiments, their outperformance over rho-hedging strategies persists even in the presence of some model misspecification. These results highlight RL's potential to deliver more efficient and resilient swaption hedging strategies.
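
To make the three objective functions concrete, the sketch below evaluates each of them on a small sample of terminal hedging errors. It is an illustration under stated assumptions, not the paper's implementation: the sign convention (positive error means a loss for the hedger), the semi-quadratic form of the downside risk, the empirical tail-average estimator for CVaR, the confidence level `alpha`, and all function names are hypothetical choices made here.

```python
import numpy as np


def mse(errors):
    """Mean squared error: penalizes gains and losses symmetrically."""
    errors = np.asarray(errors, dtype=float)
    return float(np.mean(errors ** 2))


def downside_risk(errors):
    """Semi-quadratic penalty: only positive errors (assumed to be losses) count."""
    errors = np.asarray(errors, dtype=float)
    losses = np.maximum(errors, 0.0)
    return float(np.mean(losses ** 2))


def cvar(errors, alpha=0.95):
    """Empirical CVaR: average of the errors at or above the alpha-quantile (VaR)."""
    errors = np.asarray(errors, dtype=float)
    var = np.quantile(errors, alpha)
    tail = errors[errors >= var]
    return float(tail.mean())


# Toy sample of hedging errors (assumed convention: positive = loss).
errors = np.array([-2.0, -1.0, 0.0, 1.0, 4.0])

print("MSE:", mse(errors))
print("Downside risk:", downside_risk(errors))
print("CVaR (alpha=0.8):", cvar(errors, alpha=0.8))
```

Under these assumed definitions, the MSE treats over- and under-hedging alike, the downside risk ignores gains, and the CVaR looks only at the worst outcomes, which is one way to see why agents trained under each objective could adopt different hedging styles.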

Paper Structure

This paper contains 34 sections, 60 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Comparison of kernel density estimates of in-sample (IS) and out-of-sample (OOS) hedging error distributions for RL-based strategies, and out-of-sample distributions for dynamic rho-hedge strategies, across different hedging portfolio compositions. The vertical lines indicate the mean of each distribution, while the black line indicates a null hedging error. The first row corresponds to hedging portfolios with one swap, the second row to two swaps, and the third row to three swaps.
  • Figure 2: Residuals for 2-Swap Portfolio – Multiple Objectives. Shaded areas represent scaled standard deviation bands ($0.5 \times$ std) for visual clarity.
  • Figure 3: Shapley feature importance of hedge swap weights for two hedging strategies (MSE, DR, CVaR).
  • Figure 4: In-sample (left) and out-of-sample (right) performance of KAN and FCNN across 1000 training epochs.