A Distributed Primal-Dual Method for Constrained Multi-agent Reinforcement Learning with General Parameterization

Ali Kahe, Hamed Kebriaei

TL;DR

This paper develops a distributed primal-dual algorithm based on actor-critic methods that uses local information to estimate Lagrangian multipliers, establishes consensus among those multipliers across agents, and proves convergence of the algorithm to an equilibrium point.

Abstract

This paper proposes a novel distributed approach for solving a cooperative Constrained Multi-agent Reinforcement Learning (CMARL) problem, where agents seek to minimize a global objective function subject to shared constraints. Unlike existing methods that rely on centralized training or coordination, our approach enables fully decentralized online learning, with each agent maintaining local estimates of both primal and dual variables. Specifically, we develop a distributed primal-dual algorithm based on actor-critic methods, leveraging local information to estimate Lagrangian multipliers. We establish consensus among the Lagrangian multipliers across agents and prove the convergence of our algorithm to an equilibrium point, analyzing the sub-optimality of this equilibrium compared to the exact solution of the unparameterized problem. Furthermore, we introduce a constrained cooperative Cournot game with stochastic dynamics as a test environment to evaluate the algorithm's performance in complex, real-world scenarios.
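To make the idea of consensus on local multiplier estimates concrete, the following is a minimal sketch of a generic consensus-based projected dual ascent step. It is not the update rule from the paper, which is not reproduced here. The function name `consensus_dual_step`, the mixing matrix `W`, the even split of the bound `b` across agents, and all numeric values are assumptions chosen for illustration.

```python
# Illustrative sketch only: a generic consensus-based projected dual ascent step.
# This is NOT the paper's update rule; all names and values are hypothetical.

def consensus_dual_step(lams, local_costs, weights, budget, step):
    """One round: mix neighbors' multiplier estimates, then take a projected ascent step."""
    n = len(lams)
    new_lams = []
    for i in range(n):
        # Consensus: weighted average of all estimates using row i of the mixing matrix.
        mixed = sum(weights[i][j] * lams[j] for j in range(n))
        # Local violation: agent i compares its own cost with an equal share of the bound
        # (the equal split is an assumption made for this sketch).
        violation = local_costs[i] - budget / n
        # Projection onto [0, inf) keeps the multiplier estimate nonnegative.
        new_lams.append(max(0.0, mixed + step * violation))
    return new_lams


# Hypothetical 3-agent network with a doubly stochastic mixing matrix.
W = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
b = 1.0                          # assumed global constraint bound
local_costs = [b / 3] * 3        # constraint exactly tight, so only the mixing acts
lams = [0.0, 1.0, 2.0]           # initially disagreeing local estimates

for _ in range(50):
    lams = consensus_dual_step(lams, local_costs, W, b, step=0.1)

print(lams)  # the three estimates agree up to floating-point error
```

With the constraint exactly tight, the ascent term vanishes and the loop isolates the consensus effect: the local estimates contract toward their initial average. When the local violations are nonzero, the same step also pushes each estimate up or down, and the projection prevents it from becoming negative.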

Paper Structure

This paper contains 8 sections, 12 theorems, 38 equations, 2 figures, 1 algorithm.

Key Result

Proposition 1

Under Assumptions non_zero policy-actor convergence assumption, for given locally estimated Lagrange multipliers $\hat{\lambda}$, the distributed actor-critic algorithm with linear critic approximation and local update rules critic update-actor_update, when applied to minimize the decomposed Lagra...

Figures (2)

  • Figure 1: Convergence of locally estimated Lagrangian multipliers during training.
  • Figure 2: Global objective cost ($J$) and global constraint cost ($\hat{G} - b$) during training. The plot illustrates how the algorithm seeks to decrease the objective cost while maintaining constraint violations near zero.

Theorems & Definitions (17)

  • Remark 1
  • Proposition 1
  • Remark 2
  • Lemma 1
  • Theorem 1
  • Remark 3
  • Lemma 2
  • Theorem 2
  • Proposition 2
  • Lemma 3
  • ...and 7 more