Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies

Alex DeWeese; Guannan Qu

TL;DR

The paper introduces Locally Interdependent Multi-Agent MDPs to model decentralized agents with dynamically changing dependencies driven by proximity, where agents within distance $\mathcal{R}$ influence rewards and those within $\mathcal{V}$ can communicate. It develops three closed-form, group-decentralized policies—Amalgam, Cutoff, and First Step Finite Horizon—and proves near-optimal guarantees with bounds of the form $|V^*(s)-V^{\text{policy}}(s)|\le C\gamma^{c+1}\tilde r$, where $c=\left\lfloor\frac{\mathcal{V}-\mathcal{R}}{2}\right\rfloor$ and $\tilde r$ captures reward magnitude. A corresponding lower bound shows these results are tight up to constants, and a Telescoping Lemma establishes how to convert naive policy analyses into the final guarantees. The framework further offers scalable extensions (e.g., eliminating, splitting, or approximating large groups) and demonstrates long-horizon behavior via simulations in cooperative navigation, obstacle avoidance, and formation control. This work provides a theoretically grounded, scalable approach to decentralized RL in settings with dynamic dependencies among agents.

Abstract

Many multi-agent systems in practice are decentralized and have dynamically varying dependencies. Few attempts have been made in the literature to analyze these systems theoretically. In this paper, we propose and theoretically analyze a decentralized model with dynamically varying dependencies called the Locally Interdependent Multi-Agent MDP. This model can represent problems in many disparate domains such as cooperative navigation, obstacle avoidance, and formation control. Despite the intractability that general partially observable multi-agent systems suffer from, we propose three closed-form policies that are theoretically near-optimal in this setting and can be scalable to compute and store. Consequently, we reveal a fundamental property of Locally Interdependent Multi-Agent MDPs: the partially observable decentralized solution is exponentially close to the fully observable solution with respect to the visibility radius. We then discuss extensions of our closed-form policies to further improve tractability. We conclude by providing simulations to investigate some long-horizon behaviors of our closed-form policies.

Paper Structure (52 sections, 18 theorems, 34 equations, 10 figures)

Introduction
Contributions
Related Work
Preliminaries
Locally Interdependent Multi-Agent MDP
Group Decentralized Policies
Scalability
Objectives
Properties
Applications
Cooperative Navigation
Obstacle Avoidance
Formation Control
Main Results
Three Upper Bound Constructions
...and 37 more sections

Key Result

Theorem 3.1

$\lvert V^*(s) - V^{\lambda} (s) \rvert \leq \frac{2}{(1 - \gamma)^2}\gamma^{c + 1} \tilde{r}$.
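
To get a feel for how quickly this bound shrinks as the visibility radius grows, the following is a minimal sketch that evaluates the right-hand side of Theorem 3.1 with $c=\left\lfloor\frac{\mathcal{V}-\mathcal{R}}{2}\right\rfloor$ as given in the TL;DR. The function names and the numeric inputs are hypothetical and chosen only for illustration, and the check that $\mathcal{V} \geq \mathcal{R}$ is an assumption of this sketch rather than a statement from the paper.

```python
import math


def dependence_horizon(visibility: float, reward_radius: float) -> int:
    """Return c = floor((V - R) / 2), the exponent offset used in the bound."""
    # Assumption of this sketch: the visibility radius is at least the reward radius.
    if visibility < reward_radius:
        raise ValueError("this sketch assumes visibility >= reward_radius")
    return math.floor((visibility - reward_radius) / 2)


def theorem_3_1_bound(visibility: float, reward_radius: float,
                      gamma: float, r_tilde: float) -> float:
    """Evaluate 2 / (1 - gamma)^2 * gamma^(c + 1) * r_tilde."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    c = dependence_horizon(visibility, reward_radius)
    return 2.0 / (1.0 - gamma) ** 2 * gamma ** (c + 1) * r_tilde


if __name__ == "__main__":
    # Hypothetical values: R = 1, gamma = 0.5, r_tilde = 1.
    for v in (1, 5, 9, 13):
        print(v, theorem_3_1_bound(v, 1, 0.5, 1.0))
```

Because $c$ increases by one each time $\mathcal{V}$ increases by two, the bound is multiplied by $\gamma$ at each such step, which is the exponential dependence on the visibility radius described in the abstract.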

Figures (10)

Figure 1: 3 agents moving in the space of $\mathcal{X} = \mathbb{R}^2$ with standard Euclidean distance. The bottom two agents potentially have an interdependent reward since they are within distance $\mathcal{R}$ of one another. Furthermore, every agent is within distance $\mathcal{V}$ of another agent so all agents can communicate with each other. Notably, the top and bottom agents may communicate even though they are not within distance $\mathcal{V}$ of each other.
Figure 2: Bullseye Problem: In red is the optimal policy with a discounted sum of rewards of $8.85$. The top three in blue are Amalgam Policy rollouts with $\mathcal{V}=25$, $\mathcal{V}=35$, and $\mathcal{V}=45$ from top to bottom. They have a total discounted reward of $6.74$, $8.26$, and $8.85$ respectively. Therefore, $\lvert V^*(s) - V^{\lambda}(s)\rvert$ is $2.11$, $0.59$, and $0$ respectively. In green is the Cutoff Policy with $\mathcal{V} = 25$. It obtains a discounted reward of $-5.38$. All reported discounted sums of rewards are rounded to the second decimal place.
Figure 3: Aisle Walk Problem: In red is the optimal policy with a discounted reward of $496.84$, in blue is the Amalgam Policy with a discounted reward of $234.40$, and in green is the Cutoff Policy with a discounted reward of $400$. All reported discounted sums of rewards are rounded to the second decimal place.
Figure 4: Highway Problem with Amalgam and Optimal Policy: In red is the optimal policy with a discounted reward of $73.5$, and in blue is the Amalgam Policy with a discounted reward of $70.93$, rounded to the second decimal place.
Figure 5: Highway Problem with Cutoff Policy: In green is the Cutoff Policy with an accumulated discounted reward of $0$.
...and 5 more figures
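
The caption of Figure 1 notes that two agents may communicate even when they are farther than $\mathcal{V}$ apart, as long as a chain of agents links them. One way to read this is that communication groups are the connected components of the graph joining agents within distance $\mathcal{V}$. The sketch below computes those components with a small union-find. The function name, the example coordinates, and the use of a non-strict comparison (`<=`) for "within distance" are assumptions of this sketch, not details taken from the paper.

```python
import math
from itertools import combinations


def communication_groups(positions, visibility):
    """Group agent indices into connected components of the visibility graph."""
    parent = list(range(len(positions)))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of agents within Euclidean distance `visibility`.
    for i, j in combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) <= visibility:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical layout: three agents on a line, 3 units apart.
    agents = [(0.0, 0.0), (0.0, 3.0), (0.0, 6.0)]
    print(communication_groups(agents, visibility=4.0))
    print(communication_groups(agents, visibility=2.0))
```

In the first call the outer agents are 6 units apart, beyond the visibility of 4, yet they land in the same group through the middle agent, mirroring the situation described in Figure 1.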

Theorems & Definitions (35)

Theorem 3.1
Theorem 3.2
Theorem 3.3
Theorem 3.4
Corollary 3.5
Lemma 4.1
proof
Lemma 4.2
Lemma 4.3
Lemma 4.4
...and 25 more
