Strategic learning for disturbance rejection in multi-agent systems: Nash and Minmax in graphical games

Xinyang Wang, Martin Guay, Shimin Wang, Hongwei Zhang

TL;DR

This work addresses disturbance rejection in discrete-time multi-agent systems (MAS) by formulating cooperative and non-cooperative graphical games and solving them with model-free Q-function policy iteration. The authors develop two online learning architectures, an actor-disturber-critic structure for the Nash-equilibrium case and an actor-adversary-disturber-critic structure for the distributed minmax case, and then prove stability and convergence to approximate solutions under suitable gain and excitation conditions. The approach removes the need for exact system models, and simulations with leader-follower networks demonstrate synchronization and robustness to $L_2$-bounded disturbances. Overall, the paper provides a rigorous, scalable framework for distributed disturbance rejection in MAS using Q-function-based online learning in graphical games.

Abstract

This article investigates the optimal control problem with disturbance rejection for discrete-time multi-agent systems under cooperative and non-cooperative graphical game frameworks. Given the practical challenges of obtaining accurate models, Q-function-based policy iteration methods are proposed to seek the Nash equilibrium solution for the cooperative graphical game and the distributed minmax solution for the non-cooperative graphical game. To implement these methods online, two reinforcement learning frameworks are developed: an actor-disturber-critic structure for the cooperative graphical game and an actor-adversary-disturber-critic structure for the non-cooperative graphical game. The stability of the proposed methods is rigorously analyzed, and simulation results are provided to illustrate their effectiveness.

Paper Structure

This paper contains 19 sections, 7 theorems, 121 equations, 6 figures, 2 algorithms.

Key Result

Lemma 1

(zhang2012adaptive) Under Assumption a1, the global disagreement vector $\epsilon_k$ satisfies $\lVert \epsilon_k \rVert \le \lVert \delta_k\rVert /\underline{\sigma}((L+G)\otimes I_n)$, where $\underline{\sigma}(\cdot)$ denotes the minimum singular value.
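
The bound in Lemma 1 can be checked numerically on a small example. The sketch below is illustrative only: the three-follower chain graph, the pinning matrix `G`, the state dimension `n = 2`, and the random disagreement vector are all hypothetical choices that do not come from the paper. It also assumes the usual relation $\delta_k = ((L+G)\otimes I_n)\,\epsilon_k$ between the local neighborhood error and the global disagreement vector, which is the standard setting for this kind of bound but is not stated in the text above.

```python
import numpy as np

# Hypothetical 3-follower chain: A[i, j] = 1 means agent i receives from agent j.
A = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
D = np.diag(A.sum(axis=1))          # in-degree matrix
L = D - A                           # graph Laplacian
G = np.diag([1.0, 0.0, 0.0])        # assumed pinning gains: only agent 1 sees the leader

n = 2                               # assumed state dimension per agent
M = np.kron(L + G, np.eye(n))       # (L + G) kron I_n

# Minimum singular value of (L + G) kron I_n.
sigma_min = np.linalg.svd(M, compute_uv=False).min()

# Pick an arbitrary global disagreement vector and form the local error
# under the assumed relation delta = ((L + G) kron I_n) eps.
rng = np.random.default_rng(0)
eps = rng.standard_normal(3 * n)
delta = M @ eps

bound = np.linalg.norm(delta) / sigma_min
bound_holds = bool(np.linalg.norm(eps) <= bound + 1e-12)
print("||eps|| =", np.linalg.norm(eps), " bound =", bound, " holds:", bound_holds)
```

Because the leader is reachable from every follower in this graph, $L+G$ is nonsingular, so the minimum singular value is strictly positive and the bound is finite.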

Figures (6)

  • Figure 1: Topology structure of MAS
  • Figure 2: Profiles of states and synchronization errors in the cooperative graphical game
  • Figure 3: Neural network weights update process in the cooperative graphical game
  • Figure 5: Nash equilibrium
  • Figure 6: Profiles of states and synchronization errors in the non-cooperative graphical game
  • ...and 1 more figure

Theorems & Definitions (19)

  • Lemma 1
  • Definition 1
  • Lemma 2
  • Proof
  • Definition 2
  • Lemma 3
  • Proof
  • Remark 1
  • Theorem 1
  • Proof
  • ...and 9 more