Stackelberg Meta-Learning Based Control for Guided Cooperative LQG Systems

Yuhan Zhao, Quanyan Zhu

TL;DR

The paper addresses guided cooperative control under incomplete follower information by formulating a dynamic Stackelberg game and introducing a meta-learning framework to acquire a transferable follower-response model. A parametric follower response is embedded into an augmented LQG (via $\tilde{A}$ and $\tilde{B}$) and optimized through a bilevel meta-learning scheme that trains on multiple follower types and adapts to new ones with limited data. The approach combines a data-fitting term with leader-cost optimization through a Riccati-based gradient pipeline, including a data-sampling strategy for robust meta-training and a practical adaptation rule for new followers. Experiments in robot teaming demonstrate improved transferability and guidance efficiency over unilateral and individual learning, highlighting the practical impact for heterogeneous multi-agent control tasks.
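
To make the augmentation step concrete, here is a minimal sketch of how a follower response could be folded into the dynamics to obtain $\tilde{A}$ and $\tilde{B}$, followed by a standard backward Riccati recursion for the leader. It is not the paper's formulation: the linear response form `u_F = W_x x + W_u u_L`, the names `augment`, `riccati_gains`, `B_L`, `B_F`, `W_x`, `W_u`, and the quadratic leader cost with weights `Q`, `R`, `Q_f` are all assumptions made for illustration, and process noise is ignored.

```python
import numpy as np

def augment(A, B_L, B_F, W_x, W_u):
    """Fold an ASSUMED linear follower response u_F = W_x x + W_u u_L
    into x_next = A x + B_L u_L + B_F u_F, giving x_next = A_t x + B_t u_L."""
    A_tilde = A + B_F @ W_x
    B_tilde = B_L + B_F @ W_u
    return A_tilde, B_tilde

def riccati_gains(A_t, B_t, Q, R, Q_f, T):
    """Backward Riccati recursion for a finite-horizon LQR on (A_t, B_t).
    Returns the feedback gains K_0..K_{T-1} (u_L = -K_k x) and the cost matrix P_0."""
    P = Q_f
    gains = []
    for _ in range(T):
        S = R + B_t.T @ P @ B_t
        K = np.linalg.solve(S, B_t.T @ P @ A_t)
        P = Q + A_t.T @ P @ (A_t - B_t @ K)
        gains.append(K)
    gains.reverse()  # gains were computed from the last stage backward
    return gains, P

# Hypothetical scalar example: the follower response damps the state.
A_t, B_t = augment(np.array([[1.0]]), np.array([[1.0]]), np.array([[1.0]]),
                   np.array([[-0.5]]), np.array([[0.0]]))
gains, P0 = riccati_gains(A_t, B_t, np.eye(1), np.eye(1), np.eye(1), T=5)
```

In this sketch the response parameters `W_x` and `W_u` enter the leader's cost only through the augmented matrices, which is the kind of dependence a Riccati-based gradient would differentiate through.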

Abstract

Guided cooperation allows intelligent agents with heterogeneous capabilities to work together through a leader-follower type of interaction. However, the associated control problem becomes challenging when the leader agent does not have complete information about the follower agents, so cooperation plans must be learned and adapted. To this end, we develop a meta-learning-based Stackelberg game-theoretic framework to address the challenges of guided cooperative control for linear systems. We first formulate the guided cooperation between agents as a dynamic Stackelberg game and use the feedback Stackelberg equilibrium as the agent-wise cooperation strategy. We further leverage meta-learning to address the incomplete information about follower agents: the leader agent learns a meta-response model offline from a prescribed set of followers and adapts to a new cooperation task with a small amount of learning data. We use a case study in robot teaming to corroborate the effectiveness of our framework. Comparison with other learning approaches also shows that our learned cooperation strategy transfers better across different cooperation tasks.

Paper Structure

This paper contains 22 sections, 19 equations, 4 figures, 1 algorithm.

Figures (4)

  • Figure 1: Meta training and adaptation results.
  • Figure 2: Trajectories for $\theta=0$ follower after adaptation.
  • Figure 3: Costs and trajectories for unilateral learning.
  • Figure 4: Results for individual learning.