Decentralized and Lifelong-Adaptive Multi-Agent Collaborative Learning

Shuo Tang; Rui Ye; Chenxin Xu; Xiaowen Dong; Siheng Chen; Yanfeng Wang

TL;DR

DeLAMA tackles decentralized lifelong-adaptive multi-agent collaboration by learning dynamic collaboration graphs and leveraging a memory unit to retain knowledge, all grounded in a probabilistic MAP framework and enhanced through algorithm unrolling. The method alternates between local learning, graph-based relational inference, and lifelong model updates, with a theoretical analysis showing convex optimization properties, fast graph-convergence, and linear model-update convergence. Empirically, DeLAMA delivers substantial gains across regression, image classification, multi-robot mapping, and human-involved experiments, outperforming centralized, federated, and static-graph baselines while operating under decentralized communication constraints. The approach advances intelligent, decentralized, and dynamic multi-agent systems, offering a scalable blueprint for future distributed learning with adaptive collaboration patterns.

Abstract

Decentralized and lifelong-adaptive multi-agent collaborative learning aims to enhance collaboration among multiple agents without a central server, with each agent solving varied tasks over time. To achieve efficient collaboration, agents should: i) autonomously identify beneficial collaborative relationships in a decentralized manner; and ii) adapt to dynamically changing task observations. In this paper, we propose DeLAMA, a decentralized multi-agent lifelong collaborative learning algorithm with dynamic collaboration graphs. To promote autonomous collaboration relationship learning, we propose a decentralized graph structure learning algorithm, eliminating the need for external priors. To facilitate adaptation to dynamic tasks, we design a memory unit to capture the agents' accumulated learning history and knowledge, while preserving finite storage consumption. To further augment the system's expressive capabilities and computational efficiency, we apply algorithm unrolling, leveraging the advantages of both mathematical optimization and neural networks. This allows the agents to "learn to collaborate" through the supervision of training tasks. Our theoretical analysis verifies that inter-agent collaboration is communication-efficient under a small number of communication rounds. The experimental results verify its ability to facilitate the discovery of collaboration strategies and adaptation to dynamic learning scenarios, achieving a 98.80% reduction in MSE and a 188.87% improvement in classification accuracy. We expect our work can serve as a foundational technique to facilitate future works towards an intelligent, decentralized, and dynamic multi-agent system. Code is available at https://github.com/ShuoTang123/DeLAMA.
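To make the idea of a memory unit with finite storage concrete, the sketch below keeps two running sums that summarize every observation seen so far for a one-dimensional model y = w * x. It is an illustrative assumption, not the paper's memory unit: the class `RunningMemory`, its fields, and the ridge constant `lam` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class RunningMemory:
    """Constant-size summary of all (x, y) pairs seen so far (hypothetical helper)."""
    sxx: float = 0.0  # running sum of x * x
    sxy: float = 0.0  # running sum of x * y

    def update(self, xs, ys):
        # Fold a new batch into the two running sums; the batch can then be discarded.
        for x, y in zip(xs, ys):
            self.sxx += x * x
            self.sxy += x * y

    def solve(self, lam=1e-3):
        # Ridge-regularized least-squares slope for the model y = w * x.
        return self.sxy / (self.sxx + lam)


# Three batches arriving over time, all generated by y = 2 * x.
memory = RunningMemory()
batches = [([1.0, 2.0], [2.0, 4.0]), ([3.0], [6.0]), ([-1.0, 0.5], [-2.0, 1.0])]
for xs, ys in batches:
    memory.update(xs, ys)
w = memory.solve()
```

Storage stays at two floats no matter how many batches arrive, while the estimate `w` still reflects the whole history.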

Paper Structure (48 sections, 6 theorems, 68 equations, 15 figures, 7 tables, 1 algorithm)

Introduction
Problem Formulation
Task Setting
Relations to Previous Task Settings
Optimization for Decentralized and Lifelong-Adaptive Collaborative Learning
Optimization Problem
Solution Overview for The Optimization Problem
Local Learning
Collaborative Relational Inference
Lifelong Model Update
Algorithm Unrolling for Decentralized and Lifelong-Adaptive Multi-Agent Learning
Unrolling Network Design
Training Details
Learning to learn the collaboration strategy
Agent's model learning
...and 33 more sections

Key Result

Theorem 1

Let $\mathcal{L}\left(f_{\boldsymbol{\theta}_i^{(t)}}\left(\mathbf{X}_i^{(t)}\right), \mathbf{Y}_i^{(t)}\right)$ be the standard supervised training loss function described by mean square error or cross-entropy of the linear model $f_{\boldsymbol{\theta}_i^{(t)}}(\cdot)$. Suppose $\mathbf{H}(\ldots)$ ...

Figures (15)

Figure 1: The illustration of our decentralized and lifelong-adaptive multi-agent collaborative learning system. We demonstrate a collaboration system with six agents, each faced with a learning task. The learning tasks' configurations between the agents are dynamic, making the collaboration relationships dynamically adapt to the time-evolving tasks.
Figure 2: Relationships among our lifelong collaborative learning and its related works: lifelong learning, federated learning, and multi-task relation learning.
Figure 3: The decentralized and lifelong-adaptive multi-agent collaborative learning system. The system consists of three steps: local learning, collaborative relational inference, and lifelong model update. In the first step, each agent learns model parameters from its own observations, which serve as the initialization. In the second step, the agents transmit their model parameters along the communication graph and learn the collaboration relation. In the last step, agents share their model parameters with their collaborators along the learned collaboration relation and refine their model parameters. Note that the second and third steps are iterated several times until convergence.
Figure 4: The unrolled network structure of the collaboration system with input data at a single time step. For each agent, the training data is first passed through a feed-forward network to be transformed into embeddings, and these embeddings are then used to calculate the initialized model parameters according to $\mathbf{\Phi}_{\rm local}(\cdot)$. Then the agents start to communicate and collaborate to find proper parameters according to the iterations shown in $\mathbf{\Phi}_{\rm graph}(\cdot)$ and $\mathbf{\Phi}_{\rm param}(\cdot)$. Finally, the network $\mathcal{F}_{\gamma}(\cdot)$ outputs parameters $\mathbf{\Theta}^{(t)}$. These output model parameters $\mathbf{\Theta}^{(t)}$ are supervised by the data $(\mathcal{X}, \mathcal{Y}) \sim \mathcal{T}_i^{train}$. The training process of the network $\mathcal{F}_{\gamma}(\cdot)$ is learning to tune the parameter $\boldsymbol{\theta}$ to find the optimal parameter learning strategies of $\mathbf{\Theta}^{(t)}$.
Figure 5: Visualization of the learned graph structure at time $t=3$ compared with the oracle collaboration structure. DeLAMA learns smooth and accurate edge weights compared to baseline methods.
...and 10 more figures
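The three-step loop described in the Figure 3 caption (local learning, collaborative relational inference, model update) can be sketched on a toy problem where each agent estimates a single scalar. This is a minimal illustration under stated assumptions, not DeLAMA's algorithm: the similarity-kernel weighting in `infer_graph` is a simple stand-in for the paper's graph structure learning, and all function names and the parameters `tau` and `alpha` are hypothetical.

```python
import math


def local_learning(observations):
    # Step 1: each agent fits its own scalar model (here, the sample mean).
    return [sum(obs) / len(obs) for obs in observations]


def infer_graph(theta, neighbors, tau=1.0):
    # Step 2 (stand-in): weight each communication neighbor by parameter
    # similarity, then normalize so every agent's weights sum to one.
    weights = []
    for i, nbrs in enumerate(neighbors):
        raw = {j: math.exp(-((theta[i] - theta[j]) ** 2) / tau) for j in nbrs}
        total = sum(raw.values())
        weights.append({j: w / total for j, w in raw.items()})
    return weights


def update_models(local, theta, weights, alpha=1.0):
    # Step 3: pull each model toward the weighted average of its collaborators
    # while staying anchored to its own local estimate.
    return [
        (local[i] + alpha * sum(w * theta[j] for j, w in weights[i].items()))
        / (1.0 + alpha)
        for i in range(len(theta))
    ]


# Four agents: agents 0 and 1 observe a quantity near 1.0, agents 2 and 3 near 5.0.
observations = [[0.8, 1.0], [1.2, 1.0], [4.8, 5.0], [5.2, 5.0]]
# Communication graph (assumed fully connected): each agent only reads its neighbors.
neighbors = [[j for j in range(4) if j != i] for i in range(4)]

local = local_learning(observations)
theta = list(local)
for _ in range(3):  # steps 2 and 3 alternate for a few rounds
    weights = infer_graph(theta, neighbors)
    theta = update_models(local, theta, weights)
```

In this toy setup the learned weights concentrate on agents with similar tasks, so each agent's estimate moves toward its own group's value rather than toward a global average.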

Theorems & Definitions (12)

Theorem 1
Theorem 2
Theorem 3
Lemma 1
Proof 1
Proof 2
Lemma 2
Proof 3
Proof 4
Lemma 3
...and 2 more
