Multi-Agent Relative Investment Games in a Jump Diffusion Market with Deep Reinforcement Learning Algorithm

Liwei Lu, Ruimeng Hu, Xu Yang, Yi Zhu

TL;DR

The authors study multi-agent investment decisions in jump-diffusion markets. They derive semi-explicit constant Nash equilibria under CARA and CRRA utilities, and they develop a deep reinforcement learning framework for stochastic control problems with jumps, which they extend to differential games via fictitious play. The RL approach uses an actor-critic architecture built on Itô-Lévy dynamics, with stabilized rewards and parallelized fictitious-play iterations, to learn both the value and policy functions; it handles controls that enter both the diffusion and jump terms. The theoretical results give sufficient conditions for the existence and uniqueness of constant Nash equilibria in the exponential, power, and logarithmic cases. Numerical experiments (Merton's problem with jumps, high-dimensional LQR, and multi-agent portfolio games) show the learned solutions converging to the derived equilibria, scaling to higher dimensions, and running substantially faster with parallel computation. The authors present the method as a model-free, data-friendly toolkit for stochastic games in markets with jumps, with possible extensions to data-driven calibration and convergence analysis.

Abstract

This paper focuses on multi-agent stochastic differential games for jump-diffusion systems. On one hand, we study the multi-agent game for optimal investment in a jump-diffusion market. We derive constant Nash equilibria and provide sufficient conditions for their existence and uniqueness for exponential, power, and logarithmic utilities, respectively. On the other hand, we introduce a computational framework based on the actor-critic method in deep reinforcement learning to solve the stochastic control problem with jumps. We extend this algorithm to address the multi-agent game with jumps and utilize parallel computing to enhance computational efficiency. We present numerical examples of the Merton problem with jumps, linear quadratic regulators, and the optimal investment game under various settings to demonstrate the accuracy, efficiency, and robustness of the proposed method. In particular, neural network solutions numerically converge to the derived constant Nash equilibrium for the multi-agent game.
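As a rough illustration of the kind of state process the abstract refers to, the following is a minimal sketch of an Euler scheme for a one-dimensional wealth process driven by a Brownian motion and a compensated Poisson process. The dynamics $\mathrm{d}X_t = \pi X_{t-}(\mu\,\mathrm{d}t + \sigma\,\mathrm{d}W_t + \gamma\,\mathrm{d}\tilde{N}_t)$, the constant investment fraction `pi`, and all parameter names are assumptions made for this example; they are not taken from the paper's market model.

```python
import numpy as np

def simulate_wealth(pi, mu, sigma, jump_size, jump_rate,
                    x0=1.0, T=1.0, n_steps=100, n_paths=1000, seed=0):
    """Euler scheme for an assumed jump-diffusion wealth process.

    Assumed dynamics (illustrative only, not the paper's model):
        dX = pi * X * (mu dt + sigma dW + jump_size d(N - jump_rate * t))
    where N is a Poisson process with intensity jump_rate.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)   # Brownian increment
        dn = rng.poisson(jump_rate * dt, size=n_paths)    # number of jumps in the step
        dm = dn - jump_rate * dt                          # compensated Poisson increment
        x = x * (1.0 + pi * (mu * dt + sigma * dw + jump_size * dm))
    return x

if __name__ == "__main__":
    terminal = simulate_wealth(pi=0.5, mu=0.1, sigma=0.2,
                               jump_size=-0.1, jump_rate=2.0)
    print("mean terminal wealth:", terminal.mean())
```

Because the jump term is compensated, the sample mean of the terminal wealth should stay close to the drift-only growth factor; with `sigma=0` and `jump_rate=0` the scheme reduces to deterministic compounding.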

Paper Structure

This paper contains 19 sections, 7 theorems, 110 equations, 8 figures, 5 tables, 2 algorithms.

Key Result

Theorem 2.2

Assume $\mu_i>0$, $\nu_i\geq0$, $\sigma_i\geq0$, $\nu_i^2+\sigma_i^2>0$, $\delta_i>0$ and $\theta_i \in (0,1)$. Then any constant Nash equilibrium $(\pi_1^*,\ldots,\pi_n^*)$ satisfies the coupled system (eq.thm_exp), where $\widehat{\pi^*\sigma} := \frac{1}{n} \sum_{k\neq i} \pi^*_k \sigma_k$ and $\widehat{\pi^*\beta} := \frac{1}{n} \sum_{k\neq i} \pi^*_k \beta_k$. If system (eq.thm_exp) has a unique solution,
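
The two leave-one-out averages defined in the theorem statement can be computed for all agents at once. The sketch below follows the definitions as written, including the $\frac{1}{n}$ normalization over the other $n-1$ agents; the function name `leave_one_out_averages` and the array-based interface are assumptions introduced for illustration.

```python
import numpy as np

def leave_one_out_averages(pi, sigma, beta):
    """Return (hat_pi_sigma, hat_pi_beta), one entry per agent i.

    hat_pi_sigma[i] = (1/n) * sum_{k != i} pi[k] * sigma[k]
    hat_pi_beta[i]  = (1/n) * sum_{k != i} pi[k] * beta[k]
    """
    pi, sigma, beta = (np.asarray(v, dtype=float) for v in (pi, sigma, beta))
    n = pi.size
    ps = pi * sigma
    pb = pi * beta
    # Subtracting agent i's own term from the full sum gives the sum over k != i.
    return (ps.sum() - ps) / n, (pb.sum() - pb) / n

if __name__ == "__main__":
    hs, hb = leave_one_out_averages([1.0, 2.0, 3.0], [0.1, 0.2, 0.3], [1.0, 1.0, 1.0])
    print(hs, hb)
```

Computing the full sum once and subtracting each agent's own contribution avoids an explicit double loop over agents.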

Figures (8)

  • Figure 1: The structure for the actor $u(t,x)$, the critic $J^u(t,x)$ and the non-local term $\int_{\mathbb{R}^d}(J^u(t,x+G(x,z,u))-J^u(t,x))\,\nu(\mathrm{d}z)$ employed in this work.
  • Figure 2: The diagram of solving stochastic control problems with jumps using deep reinforcement learning.
  • Figure 3: The illustration for solving multi-agent games with jumps using parallel computing. In each iteration of the outer loop, $n$ agents simultaneously perform computations. Each agent instantiates local networks for the critic, actor, and non-local terms for subsequent computation and then uploads the network parameters back to the global networks.
  • Figure 4: Merton's problem with jumps under power utility. (a)-(c) Visualization of sample trajectories of the value function along the optimal state process $v(t, X_t)$, the control process $u^\ast(t, X_t)$, and the state process $X_t$, as well as their approximated counterparts; (d) Plot of the CriticLoss and the ActorLoss with respect to the training iterations; (e) Plot of the value error (Error_value) and the control error (Error_control) with respect to the training iterations; and (f) Heatmap of the $L^1$ relative error of the approximated control $\hat{u}(t,x)$ as a bivariate function of $t$ and $x$.
  • Figure 5: Merton's problem with jumps under power utility. (a) $L^2$ relative errors of the value function $e_t^v$ and the control $e_t^u$ at different times $t$; (b) Plot showing the evolution of the control function's bound $b$ (a trainable parameter in $\mathcal{N}_\pi$) with respect to training iterations under different initializations.
  • ...and 3 more figures
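
The outer loop described in the Figure 3 caption, where all $n$ agents update simultaneously against the others' strategies from the previous round, can be sketched on a toy problem. In the sketch below, the linear best-response map `a[i] + theta[i] * (1/n) * sum_{k != i} pi[k]` is a hypothetical stand-in for each agent's actor-critic solve; it is not the paper's best response, and the names `fictitious_play`, `a`, and `theta` are assumptions.

```python
import numpy as np

def fictitious_play(a, theta, n_iter=200, tol=1e-12):
    """Simultaneous best-response iteration on a toy linear game.

    Hypothetical best response of agent i (illustrative only):
        pi_i = a[i] + theta[i] * (1/n) * sum_{k != i} pi[k]
    Returns the final strategy vector and the number of rounds used.
    """
    a = np.asarray(a, dtype=float)
    theta = np.asarray(theta, dtype=float)
    n = a.size
    pi = np.zeros(n)
    for it in range(n_iter):
        # Every agent responds to the previous round's strategies; the n updates
        # are independent of each other, so they could run in parallel.
        others = (pi.sum() - pi) / n
        new_pi = a + theta * others
        if np.max(np.abs(new_pi - pi)) < tol:
            return new_pi, it + 1
        pi = new_pi
    return pi, n_iter

if __name__ == "__main__":
    pi_star, rounds = fictitious_play([0.3, 0.5, 0.4], [0.2, 0.5, 0.8])
    print(pi_star, rounds)
```

For `theta` values in $(0,1)$ this toy map is a contraction, so the iteration converges to the unique fixed point of the linear system; no such guarantee is implied for the neural-network version in the paper.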

Theorems & Definitions (13)

  • Definition 2.1: Nash equilibrium
  • Theorem 2.2
  • proof
  • Remark 2.3
  • Corollary 2.4
  • proof
  • Corollary 2.5
  • Theorem 2.6
  • proof
  • Corollary 2.7
  • ...and 3 more