A Deep Learning Method for Optimal Investment Under Relative Performance Criteria Among Heterogeneous Agents

Mathieu Laurière, Ludovic Tangpi, Xuchen Zhou

TL;DR

This paper develops a deep learning framework to compute Nash equilibria for optimal investment in stochastic graphon games with relative performance criteria. By leveraging forward-backward stochastic differential equations (graphon MKV FBSDEs) and a shooting-based neural network approach, the authors solve continuous-space graphon games and examine how different interaction graphs G shape optimal strategies, wealth dynamics, and utilities. They validate the method with a Black-Scholes baseline and extend to Markovian coefficients, analyzing the impact of graphon structure and risk aversion on outcomes, while providing convergence metrics via exploitability. The work advances scalable, data-driven simulation of heterogeneous, networked agents in finance, enabling nuanced assessment of competitive environments and interaction patterns in large populations.

Abstract

Graphon games have been introduced to study games with many players who interact through a weighted graph of interaction. By passing to the limit, a game with a continuum of players is obtained, in which the interactions are through a graphon. In this paper, we focus on a graphon game for optimal investment under relative performance criteria, and we propose a deep learning method. The method builds upon two key ingredients: first, a characterization of Nash equilibria by forward-backward stochastic differential equations and, second, recent advances of machine learning algorithms for stochastic differential games. We provide numerical experiments on two different financial models. In each model, we compare the effect of several graphons, which correspond to different structures of interactions.
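To make the passage from a weighted interaction graph to a graphon concrete, here is a minimal Python sketch. It assumes the standard notion of a graphon as a symmetric function $G$ on $[0,1]^2$ and discretizes it for $N$ players with labels $u_i = (i + 0.5)/N$. The function names, the particular two-block form, and its weights are hypothetical choices for illustration and are not necessarily the graphons used in the paper.

```python
import numpy as np

def constant_graphon(u, v, p=0.5):
    """Every pair of players interacts with the same weight p (illustrative value)."""
    return np.full(np.broadcast(u, v).shape, p)

def two_block_graphon(u, v, within=1.0, across=0.25):
    """Hypothetical two-block form: one weight inside a block, another across blocks."""
    same_block = (u < 0.5) == (v < 0.5)
    return np.where(same_block, within, across)

def interaction_weights(graphon, n_players):
    """Evaluate the graphon on a grid of labels to get an N x N weight matrix."""
    labels = (np.arange(n_players) + 0.5) / n_players
    return labels, graphon(labels[:, None], labels[None, :])

def weighted_average(weights, values):
    """Finite-N analogue of the integral of G(u, v) * values(v) over v in [0, 1]."""
    return weights @ values / len(values)

labels, W = interaction_weights(two_block_graphon, 4)
values = np.array([1.0, 1.0, 3.0, 3.0])   # e.g. some per-player quantity
agg = weighted_average(W, values)          # what each player "sees" from the others
```

As $N$ grows, the weighted averages computed this way approximate the graphon-weighted integral each label $u$ interacts with in the continuum game.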

Paper Structure (27 sections, 4 theorems, 41 equations, 12 figures, 1 table, 2 algorithms)


Key Result

Proposition 2.6

Let Conditions (cdn:type vector) and (cdn:unconstrained) be satisfied. The graphon game described in (eq:graphon-obj) admits a graphon Nash equilibrium $(\tilde{\pi}^u)_{u\in I}$ such that, for almost every $u \in I$, it holds [...] where $(X^u, Y^u, Z^u)_{u\in I}$ solves the following FBSDE system [...] with $(u,t,\omega)\mapsto Z^u_t(\omega)$ measurable and $(X^u,Y^u, Z^u)$ square integrable for almost every $u\in I$.

Figures (12)

  • Figure 1: Global NN architecture for the simulation in Algorithm \ref{alg:training-y0-z-params}. There is a sub-neural network to approximate $Y_0$ at time $0$. It corresponds to the first column. The green nodes are the intermediate layers of this network, with $h_{0,y}^\ell$ denoting the neurons in the $\ell$-th layer. This NN has $H$ layers and its parameters are denoted by $\vartheta_{y_0}$ in the text. Then, at each time step $t_n$, $n=0,\dots,n^*-1$, there is a sub-network with $H$ layers, inputs $(X_{t_n}^{u^i})_{i = 1,\dots,M}$, and outputs $z_{t_n}(u^i,X_{t_n}^{u^i}; \vartheta_{z_{t_n}})$ as estimations for $(Z_{t_n}^{u^i})_{i = 1,\dots,M}$. There are $n^* -1$ such sub-networks, and each NN corresponds to a column above. The green nodes are the intermediate layers of one such network, with $h_{n,z}^\ell$ denoting the neurons in the $\ell$-th layer at time step $t_{n}$, for the estimation of $Z_{t_n}$. In total, there are $(H+1)(n^* -1)$ layers with free parameters to optimize. These parameters are denoted by $\vartheta_{z}$ in the text. Note that only variables that are direct outputs of neural networks are denoted using lowercase letters in the graph above, such as $y_0, (z_{t_n})_{n = 0,\dots, n^* - 1}$.
  • Figure 2: Top: The trajectory of the value function $Y_t$ against time $t$ for different labels $u \in I$. Bottom: The projection of the top panels onto the $(t, Y_t)$ plane. For the constant graphon case, darker colors correspond to indices closer to $0$ and lighter colors correspond to indices closer to $1$. For the two-block graphon, the orange trajectories correspond to the population with indices in $[0, 0.5)$, and the blue trajectories correspond to the population with indices in $[0.5, 1]$. We chose $a = 2$ and $b = 0.5$ for $G_2$. For the star graphon, we chose $\alpha = 0.2$. Thus the orange trajectories correspond to the $20\%$ of the population that are major players, and the blue trajectories correspond to the $80\%$ of the population that are minor players.
  • Figure 3: Top: The trajectory of the value function $Y_t$ against time $t$ for different labels $u \in I$. Bottom: The projection of the top panels onto the $(t, Y_t)$ plane. For the min-max graphon, the blue trajectories correspond to the population with indices closer to 0.5, and the orange trajectories correspond to the population with indices further away from 0.5. For the power-law graphon, the orange trajectories correspond to the population with indices in $[0, 0.5)$, and the blue trajectories correspond to the population with indices in $[0.5, 1]$.
  • Figure 4: Equilibrium utilities vs. labels for different graphons
  • Figure 5: Validation errors and validation losses for $G_1$
  • ...and 7 more figures
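
The architecture described in the Figure 1 caption (one sub-network producing $y_0$, then one sub-network per time step producing $z_{t_n}$) can be sketched as a forward "shooting" pass in plain NumPy. This is only an illustration under stated assumptions: the drift, volatility, driver, and terminal functions below are hypothetical placeholders and not the paper's coefficients, the input of the $y_0$ network is assumed to be $(u, X_0)$, the graphon interaction term is omitted, and no training step is shown.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights for a small fully connected network (illustrative init)."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def shooting_loss(y0_net, z_nets, u, x0, dW, dt, drift, vol, driver, terminal):
    """Simulate (X, Y) forward in time and return the mean squared terminal mismatch."""
    x = x0.copy()
    y = mlp(y0_net, np.stack([u, x], axis=1))[:, 0]        # y_0(u, X_0)
    for n, z_net in enumerate(z_nets):
        z = mlp(z_net, np.stack([u, x], axis=1))[:, 0]     # z_{t_n}(u, X_{t_n})
        y = y - driver(x, y, z) * dt + z * dW[:, n]        # Euler step for Y
        x = x + drift(x, z) * dt + vol(x, z) * dW[:, n]    # Euler step for X
    return np.mean((y - terminal(x)) ** 2)

rng = np.random.default_rng(0)
M, n_steps, T = 64, 10, 1.0                  # sample size, time steps, horizon
dt = T / n_steps
u = rng.uniform(0.0, 1.0, M)                 # sampled labels u^i in I = [0, 1]
x0 = np.ones(M)                              # initial states (assumed)
dW = rng.normal(0.0, np.sqrt(dt), (M, n_steps))
y0_net = init_mlp(rng, [2, 16, 1])
z_nets = [init_mlp(rng, [2, 16, 1]) for _ in range(n_steps)]

# Placeholder coefficients, chosen only so that the sketch runs.
loss = shooting_loss(
    y0_net, z_nets, u, x0, dW, dt,
    drift=lambda x, z: 0.05 * z,
    vol=lambda x, z: 0.2 * z,
    driver=lambda x, y, z: 0.5 * z ** 2,
    terminal=lambda x: -np.exp(-x),
)
```

In a shooting approach of this kind, the parameters of `y0_net` and `z_nets` would be adjusted by stochastic gradient descent to drive `loss` toward zero; an automatic-differentiation framework would normally replace the hand-written NumPy networks.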

Theorems & Definitions (11)

  • Definition 2.1: Admissibility
  • Definition 2.3
  • Definition 2.4
  • Proposition 2.6
  • Theorem 2.8
  • Proposition 2.9
  • Remark 3.1
  • Remark 3.2
  • Remark 3.3
  • Proposition 4.1
  • ...and 1 more