Quantum Deep Hedging

El Amine Cherrat, Snehal Raj, Iordanis Kerenidis, Abhishek Shekhar, Ben Wood, Jon Dee, Shouvanik Chakrabarti, Richard Chen, Dylan Herman, Shaohan Hu, Pierre Minssen, Ruslan Shaydulin, Yue Sun, Romina Yalovetzky, Marco Pistoia

TL;DR

This work addresses hedging in financial markets by combining deep reinforcement learning with quantum computing, introducing quantum neural networks built from orthogonal and compound layers. It develops quantum RL for both a classical environment (policy search with orthogonal layers) and a quantum environment (distributional actor-critic with compound layers), and proves that the quantum neural networks used are trainable. Simulations show that the quantum models reach comparable performance with fewer trainable parameters, and that the distributional approach outperforms other standard approaches, both classical and quantum. Hardware experiments on a trapped-ion processor, with circuits of up to 16 qubits, agree well with noiseless simulation. The techniques are general and can be applied to other reinforcement learning problems beyond hedging.

Abstract

Quantum machine learning has the potential for a transformative impact across industry sectors and in particular in finance. In our work we look at the problem of hedging where deep reinforcement learning offers a powerful framework for real markets. We develop quantum reinforcement learning methods based on policy-search and distributional actor-critic algorithms that use quantum neural network architectures with orthogonal and compound layers for the policy and value functions. We prove that the quantum neural networks we use are trainable, and we perform extensive simulations that show that quantum models can reduce the number of trainable parameters while achieving comparable performance and that the distributional approach obtains better performance than other standard approaches, both classical and quantum. We successfully implement the proposed models on a trapped-ion quantum processor, utilizing circuits with up to $16$ qubits, and observe performance that agrees well with noiseless simulation. Our quantum techniques are general and can be applied to other reinforcement learning problems beyond hedging.

Paper Structure

This paper contains 33 sections, 4 theorems, 62 equations, 7 figures, 6 tables, 3 algorithms.

Key Result

Theorem 1

Consider any $n$-qubit variational form with output function given by $C(\boldsymbol{\theta}) = \text{Tr}\big(O \prod_{j=L}^{1} U_j(\theta_j)\, \rho_{\text{in}} \prod_{j=1}^{L} U_j(\theta_j)^{\dagger}\big)$, where $O$ is some $n$-qubit observable, and each $U_j$ is expressible as the product of a constant number of pa[…] when each $\theta_j$ is initialized from a normal distribution $\mathcal{N}(0,\gamma^2)$ with $\gamma$ […]
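As a rough illustration of the output function $C(\boldsymbol{\theta})$ in Theorem 1, the sketch below evaluates it for a toy single-qubit circuit with Gaussian-initialized angles. The layer structure (alternating Pauli rotations), the observable $Z$, the input state $|0\rangle\langle 0|$, and the value of `gamma` are all assumptions chosen for illustration; they are not taken from the paper, and the theorem's full conditions are not reproduced here.

```python
import numpy as np

# Pauli matrices; using them as layer generators is an assumption for illustration.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rotation(pauli, theta):
    """U(theta) = exp(-i * theta / 2 * pauli) for a single-qubit Pauli matrix."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * pauli


def cost(thetas, generators, observable, rho_in):
    """C(theta) = Tr(O U_L ... U_1 rho_in U_1^dagger ... U_L^dagger)."""
    U = np.eye(rho_in.shape[0], dtype=complex)
    for pauli, theta in zip(generators, thetas):
        U = rotation(pauli, theta) @ U  # later layers multiply from the left
    return float(np.real(np.trace(observable @ U @ rho_in @ U.conj().T)))


rho_in = np.array([[1, 0], [0, 0]], dtype=complex)  # input state |0><0|
generators = [X, Y, X, Y]  # hypothetical layer structure with L = 4
gamma = 0.1                # hypothetical standard deviation of the initialization

rng = np.random.default_rng(0)
thetas = rng.normal(0.0, gamma, size=len(generators))  # theta_j ~ N(0, gamma^2)
c_value = cost(thetas, generators, Z, rho_in)
print(c_value)
```

With all angles set to zero the circuit is the identity, so `cost` reduces to $\text{Tr}(O\rho_{\text{in}})$; small Gaussian angles keep the value close to that reference point.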

Figures (7)

  • Figure 1: A quantum circuit with logarithmic depth for data loading. Vertical lines represent RBS gates with parameters that are dependent on the input $\boldsymbol{x}$. The unitary represented by this data loader is denoted as $U_{L}(\boldsymbol{x})$.
  • Figure 2: Various Hamming-weight preserving circuits used in quantum orthogonal layers. These circuits are parameterized by a set of parameters $\boldsymbol{\theta}$, with each parameter representing the angle of a specific RBS gate. The parameterized unitary represented by this layer is expressed as $U(\boldsymbol{\theta})$.
  • Figure 3: A quantum compound layer $U(\boldsymbol{\theta})$ acts as a block diagonal unitary on each fixed Hamming-weight subspace.
  • Figure 4: Diverse quantum neural network architectures for time-series data, featuring orthogonal layers in each block as outlined in Section \ref{subsec:qnn-archs-for-timeseries}. Here, $\boldsymbol{x}_t$ and $\boldsymbol{y}_t$ denote the time-series input and output, respectively, while $\tilde{\boldsymbol{y}}_t$ represents the output after being adjusted by the attention mechanism.
  • Figure 5: A quantum compound neural network. $U_{L}(\boldsymbol{x})$ refers to a general data loader unitary. $U(\boldsymbol{\theta})$ denotes a Hamming-weight preserving unitary, for example one of those shown in Figure 2.
  • ...and 2 more figures
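
The captions for Figures 2 and 3 describe circuits of RBS gates that preserve Hamming weight and therefore act block-diagonally on each fixed Hamming-weight subspace. A minimal numerical sketch of that property follows. The RBS matrix uses one common sign convention, and the three-qubit gate layout is hypothetical; neither is taken from the paper's figures.

```python
import numpy as np


def rbs(theta):
    """4x4 RBS matrix in the basis |00>, |01>, |10>, |11> (assumed sign convention)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([
        [1, 0, 0, 0],
        [0, c, s, 0],
        [0, -s, c, 0],
        [0, 0, 0, 1],
    ], dtype=float)


def rbs_on_qubits(theta, i, n):
    """Embed an RBS gate acting on adjacent qubits (i, i+1) of an n-qubit register."""
    return np.kron(np.kron(np.eye(2 ** i), rbs(theta)), np.eye(2 ** (n - i - 2)))


def hamming_weights(n):
    """Hamming weight of every computational basis state of n qubits."""
    return np.array([bin(b).count("1") for b in range(2 ** n)])


n = 3
rng = np.random.default_rng(1)
U = np.eye(2 ** n)
for i in [0, 1, 0]:  # hypothetical small layout of adjacent-qubit RBS gates
    U = rbs_on_qubits(rng.uniform(-np.pi, np.pi), i, n) @ U

weights = hamming_weights(n)

# Entries of U that connect basis states of different Hamming weight should vanish.
mask = weights[:, None] != weights[None, :]
leak = np.abs(U[mask]).max()

# Restricting U to the Hamming-weight-1 (unary) subspace gives an n x n orthogonal matrix.
unary = np.flatnonzero(weights == 1)
W = U[np.ix_(unary, unary)]

print(leak)
print(np.allclose(W @ W.T, np.eye(n)))
```

Restricting the same unitary to a higher Hamming-weight subspace gives the larger blocks that a compound layer acts on; the unary block `W` is the orthogonal matrix implemented by an orthogonal layer in this sketch.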

Theorems & Definitions (10)

  • Definition 1: Finite-horizon MDP
  • Theorem 1: Paraphrased from zhang_escaping_2022
  • Theorem 2
  • proof
  • Definition 2: Classical MDP for Deep Hedging
  • Definition 3: Cramér distance
  • Proposition 1
  • proof
  • Proposition 2
  • proof