Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach

Emmanuel Gnabeyeu, Omar Karkar, Imad Idboufous

Abstract

Volatility fitting is one of the core problems in the equity derivatives business. The degrees of freedom in the implied volatility surface encoding (parametrization, density, diffusion) are defined through a set of deterministic rules. Whilst very effective, this approach, widespread in the industry, is not natively tailored to learn from shifts in market regimes or to discover unsuspected optimal behaviors. In this paper, we change the classical paradigm and apply the latest advances in Deep Reinforcement Learning (DRL) to solve the fitting problem. In particular, we show that variants of Deep Deterministic Policy Gradient (DDPG) and Soft Actor Critic (SAC) can perform at least as well as standard fitting algorithms. Furthermore, we explain why the reinforcement learning framework is appropriate for handling complex objective functions and is natively adapted to online learning.

Paper Structure

This paper contains 53 sections, 20 equations, 40 figures, 9 tables, 2 algorithms.

Figures (40)

  • Figure 1: DDPG Framework.
  • Figure 2: A snapshot of the agent acting in the market: the old volatility surface is bumped to a new one following the move of the market, conditional on the prior volatility surface.
  • Figure 3: Replay Memory and Noise in the DDPG framework in the static and sequential toy problems.
  • Figure 4: Implied volatility snapshot in a "High Smile" configuration with DDPG: a Monte Carlo experiment over 5 different random seeds is performed with power-decaying noise. We represent in ($\textcolor{green}{green}$) the best response of the agent amongst the last 1000 episodes and in ($\textcolor{blue}{blue}$) the mean of the last 1000 volatility slices.
  • Figure 5: Implied volatility snapshot in a "High Smile" configuration with SAC: a Monte Carlo experiment over 5 different random seeds is performed with automatic entropy adjustment. We represent in ($\textcolor{green}{green}$) the best response of the agent amongst the last 1000 episodes and in ($\textcolor{blue}{blue}$) the mean of the last 1000 volatility slices.
  • ...and 35 more figures