The outcomes of generative AI are exactly the Nash equilibria of a non-potential game

Boualem Djehiche, Hamidou Tembine

Abstract

In this article we show that the asymptotic outcomes of both shallow and deep neural networks such as those used in BloombergGPT to generate economic time series are exactly the Nash equilibria of a non-potential game. We then design and analyze deep neural network algorithms that converge to these equilibria. The methodology is extended to federated deep neural networks between clusters of regional servers and on-device clients. Finally, the variational inequalities behind large language models including encoder-decoder related transformers are established.

Paper Structure (12 sections, 10 theorems, 66 equations, 1 figure, 2 tables)

Key Result

proposition 1

Consider a deep neural network $(x_0,R,W,b,L, \{\mathcal{H}_l\}_{1\leq l\leq L})$ with $R_{l,t}$ being $1$-Lipschitz continuous (i.e. with Lipschitz constant 1) for all layers (input, hidden and output) and time, for any $k\in \{1,\ldots, L\}$ and $T \geq 0$, where Then, independently of the starting signal input $x_0$, there is a strong convergence of the sequence $x_t$ in (dl1) to a unique fixe
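The fixed-point behaviour described in the proposition can be illustrated with a small numerical sketch. The proposition's exact condition is truncated in this excerpt, so the sketch assumes a sufficient condition of its own: every weight matrix has operator norm strictly below 1, which together with a 1-Lipschitz activation (here `tanh`) makes the whole network a contraction. The weights, biases, layer count and function names below are hypothetical and are not taken from the paper.

```python
import math

def layer(W, b, x):
    """One layer: affine map followed by tanh, which is 1-Lipschitz."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

# Hypothetical weights (assumption, not from the paper). Each matrix has
# Frobenius norm sqrt(0.54) < 1, so its operator (spectral) norm is < 1 too.
W1, b1 = [[0.5, -0.3], [0.2, 0.4]], [0.1, -0.2]
W2, b2 = [[0.3, 0.4], [-0.5, 0.2]], [0.0, 0.3]

def network(x):
    """Two-layer network; a contraction under the assumption above."""
    return layer(W2, b2, layer(W1, b1, x))

def iterate(x0, steps=200):
    """Feed the output back as the next input: x_{t+1} = network(x_t)."""
    x = list(x0)
    for _ in range(steps):
        x = network(x)
    return x

# Two very different starting signals x_0.
x_a = iterate([10.0, -7.0])
x_b = iterate([-3.0, 25.0])

# Distance between the two limits, and how far x_a is from being a fixed point.
gap = max(abs(u - v) for u, v in zip(x_a, x_b))
residual = max(abs(u - v) for u, v in zip(network(x_a), x_a))
print("limit from first start: ", x_a)
print("limit from second start:", x_b)
print("gap:", gap, "residual:", residual)
```

Under this assumption the Banach fixed-point theorem guarantees that both runs approach the same point, so `gap` and `residual` should both be negligible. The sketch only covers the strictly contractive case; the paper's statement, with layers that are merely 1-Lipschitz, relies on the condition that is missing from this excerpt.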

Figures (1)

  • Figure 1: A schematic representation of the deep learning architecture considered in this paper. $O^{(l)}_k:=O_{l,t}(k)$ displayed in Eq. \ref{O-l-t}. The integers $m,n,p,d$ may be different.

Theorems & Definitions (16)

  • definition 1
  • proposition 1
  • definition 2
  • proposition 2
  • remark 1
  • lemma 1
  • lemma 2: see Theorem 3.1 in nopot and Chap. 5 in bc
  • remark 2
  • remark 3
  • remark 4
  • ...and 6 more