Deep Hedging with Market Impact

Andrei Neagu, Frédéric Godin, Clarence Simard, Leila Kosseim

TL;DR

This work tackles dynamic hedging under market illiquidity by integrating market-impact dynamics into a deep reinforcement learning framework. It models convex, time-persistent market impacts via a limit-order-book-inspired environment and learns a hedging policy $X^*_{t+1}=\tilde{X}(Z_t)$ using a recurrent FFNN trained with policy gradients to minimize the semi-quadratic risk $\rho^{semi}$. The DRL approach accounts for the hedging portfolio value $V_t$ and the drift $\boldsymbol{\mu}$, enabling dampened or delayed rebalancing and drift-aware decisions that outperform traditional delta hedging (BSM) and Leland strategies, especially in low-liquidity regimes. The paper further analyzes how impact persistence, liquidity parameters, and pin-risk scenarios shape the learned policy, and argues that a data-driven LOB-informed DRL framework yields more realistic and effective hedging in illiquid markets with practical implications for risk management.
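
To illustrate what a semi-quadratic objective penalizes, here is a minimal sketch. It rests on assumptions not stated in this summary: the hedging error of a path is taken as liability minus portfolio value at maturity (so a positive value is a shortfall), and the risk is estimated as a sample mean over simulated paths. The function names are hypothetical and this is not the authors' implementation.

```python
from statistics import fmean


def semi_quadratic_risk(hedging_errors):
    """Sample mean of squared shortfalls.

    Assumption: a positive error means the portfolio fell short of the
    liability; gains (negative errors) contribute zero to the risk.
    """
    return fmean([max(e, 0.0) ** 2 for e in hedging_errors])


def quadratic_risk(hedging_errors):
    """Symmetric counterpart, shown for comparison: penalizes gains too."""
    return fmean([e ** 2 for e in hedging_errors])


# Four hypothetical terminal hedging errors (one gain, one exact hedge, two shortfalls).
errors = [-3.0, 0.0, 2.0, 4.0]
print(semi_quadratic_risk(errors))  # 5.0  -> (0 + 0 + 4 + 16) / 4
print(quadratic_risk(errors))       # 7.25 -> (9 + 0 + 4 + 16) / 4
```

Under these assumptions the one-sided penalty ignores the path that ended with a gain, which is why it is lower than the symmetric quadratic penalty on the same errors.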

Abstract

Dynamic hedging is the practice of periodically transacting financial instruments to offset the risk caused by an investment or a liability. Dynamic hedging optimization can be framed as a sequential decision problem; thus, Reinforcement Learning (RL) models were recently proposed to tackle this task. However, existing RL works for hedging do not consider the market impact caused by the finite liquidity of traded instruments. Integrating such a feature can be crucial to achieve optimal performance when hedging options on stocks with limited liquidity. In this paper, we propose a novel general market impact dynamic hedging model based on Deep Reinforcement Learning (DRL) that considers several realistic features such as convex market impacts and impact persistence through time. The optimal policy obtained from the DRL model is analysed using several option hedging simulations and compared to commonly used procedures such as delta hedging. Results show our DRL model behaves better in contexts of low liquidity by, among others: 1) learning the extent to which portfolio rebalancing actions should be dampened or delayed to avoid high costs, 2) factoring in the impact of features not considered by conventional approaches, such as previous hedging errors through the portfolio value, and the underlying asset's drift (i.e. the magnitude of its expected return).

Paper Structure (14 sections, 12 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: Neural Network Structure of Our Proposed Model.
  • Figure 2: Evaluation of hedging positions $X_{t+1}$, at six months-to-maturity ($t=6, T=12, \delta_t=\frac{1}{12}$) with monthly rebalancing for a call option with strike price $K=1000$.
  • Figure 3: Evaluation of the hedging position sequence $\{X_{t+1}\}^{T-1}_{t=0}$ for a simulated underlying asset price sequence $S=\{S_t\}^T_{t=0}$ with monthly rebalancing for a one-year-to-maturity ($T=12,\delta_t=\frac{1}{12}$) call option with strike price $K=1000$.
  • Figure 4: Evaluation of the hedging position sequence $\{X_{t+1}\}^{T-1}_{t=0}$ where the underlying asset price sequence remains at the strike price with monthly rebalancing for a one-year-to-maturity ($T=12, \delta_t=\frac{1}{12}$) call option with strike price $K=1000$.
  • Figure 5: Evaluation of the hedging position sequence $\{X_{t+1}\}^{T-1}_{t=0}$ for a simulated underlying asset price sequence $S=\{S_t\}^T_{t=0}$ with hourly rebalancing for an 8-hour-to-maturity ($T=8,\delta_t=\frac{1}{8\times252}$) call option with strike price $K=1000$. Market liquidity parameters $\alpha=1.001$ and $\beta=0.999$ are used.