Robust synchronization and policy adaptation for networked heterogeneous agents

Miguel F. Arevalo-Castiblanco, Eduardo Mojica-Nava, and César A. Uribe

TL;DR

This paper tackles robust synchronization of leader–follower networks composed of nonlinear heterogeneous agents in the presence of model uncertainties and actuator saturation. It introduces DMSAC-RL, which augments pre-trained RL policies with a distributed adaptive inner loop to handle model mismatch and saturation, and extends the scheme to followers via a distributed MRAC framework. A Lyapunov-based analysis proves that the synchronization errors are Uniformly Ultimately Bounded (UUB) for both leader and follower dynamics in MIMO settings, with input magnitude saturation handled by an augmented control law. Numerical experiments on pendulum networks and saturated linear MIMO systems show improved robustness and synchronization compared with RL-only policies, which illustrates the method's relevance for data-driven control of heterogeneous multi-agent systems.
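For readers unfamiliar with the term, the standard textbook notion of uniform ultimate boundedness can be written as below. This is the usual definition from nonlinear systems theory, applied here to a generic synchronization error $e(t)$; the symbols $a$, $b$, $c$, and $T$ are the textbook's, and the paper's own definition and constants may be stated differently.

```latex
% Textbook UUB definition, written for a generic error signal e(t).
% The error e(t) is uniformly ultimately bounded with ultimate bound b
% if there exist constants b > 0 and c > 0, independent of t_0 >= 0,
% such that for every a in (0, c) there is T = T(a, b) >= 0,
% also independent of t_0, with
\[
  \|e(t_0)\| \le a
  \;\Longrightarrow\;
  \|e(t)\| \le b
  \quad \text{for all } t \ge t_0 + T .
\]
```

In words: every error trajectory that starts in a ball of radius $a$ enters a ball of radius $b$ after a finite time $T$ and stays there, which is weaker than asymptotic convergence to zero.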

Abstract

We propose a robust adaptive online synchronization method for leader-follower networks of nonlinear heterogeneous agents with system uncertainties and input magnitude saturation. Synchronization is achieved using Distributed input Magnitude Saturation Adaptive Control with Reinforcement Learning (DMSAC-RL), which improves the empirical performance of policies trained on off-the-shelf models using Reinforcement Learning (RL) strategies. The leader observes the performance of a reference model, and followers observe the states and actions of the agents they are connected to, but not the reference model. The leader and followers may differ from the reference model on which the RL control policy was trained. DMSAC-RL uses an inner loop that adjusts the learned policy for each agent through an augmented input to solve the distributed control problem, including input-matched parametric uncertainties. We show that the synchronization error of the heterogeneous network is Uniformly Ultimately Bounded (UUB). Numerical analysis of a network of Multiple Input Multiple Output (MIMO) systems supports our theoretical findings.
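The idea of an inner adaptive loop that corrects a fixed, pre-trained policy under input saturation can be illustrated with a deliberately small example. The sketch below is not the paper's DMSAC-RL law: it is a scalar, single-agent, textbook-style model reference adaptive controller, in which a linear feedback stands in for the trained RL policy and an auxiliary error signal compensates for the saturation deficit. All names and numbers (`simulate`, `theta_true`, `u_max`, `gamma`, the plant and reference-model coefficients) are assumptions chosen for illustration.

```python
import math


def simulate(adapt, theta_true=0.8, u_max=1.5, gamma=5.0, dt=1e-3, t_end=20.0):
    """Scalar plant with matched uncertainty and a saturated input.

    Returns |x - x_m| at t_end, the final error between the plant state
    and the reference-model state. Every value here is hypothetical.
    """
    a_nom, b = -1.0, 1.0            # nominal model the baseline policy was designed for
    a_m, b_m = -2.0, 2.0            # reference model: x_m' = a_m * x_m + b_m * r
    k_x = (a_m - a_nom) / b         # stand-in for the pre-trained policy:
    k_r = b_m / b                   # u_rl = k_x * x + k_r * r
    x = x_m = e_delta = theta_hat = 0.0
    r = 1.0                         # constant reference command

    for _ in range(int(t_end / dt)):
        u_rl = k_x * x + k_r * r                 # baseline (fixed) policy
        u_cmd = u_rl - theta_hat * x             # adaptive augmentation
        u = max(-u_max, min(u_max, u_cmd))       # input magnitude saturation
        delta_u = u - u_cmd                      # control deficit caused by saturation

        e = x - x_m                              # tracking error
        e_u = e - e_delta                        # augmented error (saturation effect removed)
        if adapt:
            # Gradient-type adaptive law driven by the augmented error
            theta_hat += dt * gamma * math.copysign(1.0, b) * x * e_u

        # Forward-Euler integration of the plant, reference model, and auxiliary error
        x += dt * (a_nom * x + b * (u + theta_true * x))   # true plant with matched uncertainty
        x_m += dt * (a_m * x_m + b_m * r)
        e_delta += dt * (a_m * e_delta + b * delta_u)

    return abs(x - x_m)


if __name__ == "__main__":
    print("final error, fixed policy only:  ", simulate(adapt=False))
    print("final error, with adaptive loop: ", simulate(adapt=True))
```

In this toy setup the fixed policy settles at a constant offset from the reference model because of the unmodeled term `theta_true * x`, while the augmented loop drives the error toward zero even though the input saturates during the initial transient. The distributed, MIMO, and nonlinear aspects of the method are omitted here.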

Paper Structure (9 sections, 5 theorems, 82 equations, 12 figures)

Introduction
Problem Formulation
DMRAC-RL for MIMO Leader Agents
Distributed Model Reference Adaptive Control with Reinforcement Learning
Input Magnitude Saturation Adaptive Control with Reinforcement Learning
Numerical Analysis
Network of pendulum systems for validation of adaptive control
Dynamic model for magnitude saturation validation
Conclusions

Key Result

Proposition 1

Let Assumptions assum:feedback-mc and asuum:one_agent hold, and consider the leader agent $1$ with dynamics as in nonlinear-system-imu, a reference model with dynamics nl_ref, and the MRAC-RL control law u1_nl_imu2 with adaptive gain laws al_nl2. Then, the synchronization error between the leader ag

Figures (12)

Figure 1: Block diagram of DMRAC-RL with one leader and four followers. The diagram shows the model trained with the learning strategy and each system together with its saturated controller.
Figure 2: Response of a reinforcement learning policy on a nonlinear pendulum model whose parameters deviate from those used for training. Each line corresponds to a different percentage variation in the system parameters.
Figure 3: Distributed communication network, represented as a directed graph. The red circle indicates the leader agent. Each agent only has communication with the agents in its neighborhood according to the specified topology.
Figure 4: Synchronization of homogeneous agents using a reinforcement learning technique. The policy was trained offline with respect to the reference.
Figure 5: MRAC-RL synchronization of homogeneous agents with input-matched uncertainties. The shaded area is delimited by the worst agents' responses, the average response is dotted, and the reference is red.
...and 7 more figures

Theorems & Definitions (13)

Remark 1
Definition 1
Example 1
Proposition 1
proof
Lemma 1
proof
Theorem 2
proof
Corollary 1
...and 3 more
