LLM-Empowered Agentic MAC Protocols: A Dynamic Stackelberg Game Approach

Renxuan Tan, Rongpeng Li, Fei Wang, Chenghui Peng, Shaoyun Wu, Zhifeng Zhao, Honggang Zhang

TL;DR

This work tackles the challenge of designing adaptive MAC protocols for dynamic wireless networks by framing protocol emergence as a dynamic multi-follower Stackelberg game (MFSG) in which a base station (leader) coordinates with varying user devices (followers). By embedding language-oriented policies within LLMs and coordinating them through PPO, the framework enables semantic, flexible signaling and action generation that scales with changing network size, while enforcing reliability via a protocol action grammar (PAG). Theoretical results guarantee the existence of an expected Stackelberg equilibrium and local convergence of the learning dynamics, and simulations show substantial gains in throughput and fairness (e.g., up to 77.6% throughput improvement and 65.2% fairness improvement) compared to baselines, with strong generalization to fluctuating numbers of UEs without retraining. This approach promises robust, generalizable MAC protocol emergence suitable for next-generation networks and lays groundwork for extending to multi-cell and sensing-integrated settings.

Abstract

Medium Access Control (MAC) protocols, essential for wireless networks, are typically manually configured. While deep reinforcement learning (DRL)-based protocols enhance task-specific network performance, they suffer from poor generalizability and resilience, demanding costly retraining to adapt to dynamic environments. To overcome this limitation, we introduce a game-theoretic LLM-empowered multi-agent DRL (MARL) framework, in which the uplink transmission between a base station and a varying number of user equipments is modeled as a dynamic multi-follower Stackelberg game (MFSG), capturing the network's natural hierarchical structure. Within this game, LLM-driven agents, coordinated through proximal policy optimization (PPO), synthesize adaptive, semantic MAC protocols in response to network dynamics. A protocol action grammar (PAG) is employed to ensure the reliability and efficiency of this process. Under this system, we further analyze the existence of a Stackelberg equilibrium and the convergence behavior toward it by studying the learning dynamics of LLM-empowered unified policies in response to changing followers. Simulations corroborate that our framework achieves 77.6% higher throughput and a 65.2% fairness improvement over conventional baselines. Moreover, our framework generalizes well to a fluctuating number of users without requiring retraining or architectural changes.

Paper Structure

This paper contains 35 sections, 7 theorems, 35 equations, 11 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

In our LLM-driven MFSG system, where the BS acts as the leader and a dynamic number of UEs act as followers, an expected Stackelberg equilibrium (ESE) policy profile $(\theta^\ast_b,\bm{\theta}^\ast_u)$ is guaranteed to exist.
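
As a rough illustration of the leader-follower structure behind this result, the sketch below solves a tiny finite Stackelberg game by backward induction: the leader commits to an action, each follower best-responds, and the leader picks the commitment that maximizes its own payoff given those responses. This is a generic textbook construction, not the paper's LLM-based method or its ESE definition. The grant/transmit actions, the payoff values, the `WEIGHTS` table, and the assumption that each follower's payoff depends only on the leader's action and its own action are all hypothetical simplifications chosen for the example.

```python
# Toy multi-follower Stackelberg game solved by backward induction.
# All names, actions, and payoffs are hypothetical; this is not the paper's model.

FOLLOWER_ACTIONS = ["idle", "tx"]
LEADER_ACTIONS = [0, 1]      # index of the follower that is granted the channel
WEIGHTS = [1.0, 2.0]         # assumed value of serving follower 0 or follower 1


def make_follower_payoff(i):
    """Payoff of follower i: reward for a granted transmission, penalty otherwise."""
    def payoff(grant, action):
        if action == "idle":
            return 0.0
        return 1.0 if grant == i else -1.0
    return payoff


def leader_payoff(grant, responses):
    """The leader earns the granted follower's weight if it alone transmits."""
    others_idle = all(r == "idle" for j, r in enumerate(responses) if j != grant)
    return WEIGHTS[grant] if responses[grant] == "tx" and others_idle else 0.0


def best_response(grant, payoff):
    """Follower's best action given the leader's committed action."""
    return max(FOLLOWER_ACTIONS, key=lambda a: payoff(grant, a))


def solve_stackelberg(leader_actions, follower_payoffs):
    """Enumerate leader commitments and keep the one with the highest payoff."""
    best = None
    for grant in leader_actions:
        responses = tuple(best_response(grant, p) for p in follower_payoffs)
        value = leader_payoff(grant, responses)
        if best is None or value > best[2]:
            best = (grant, responses, value)
    return best


follower_payoffs = [make_follower_payoff(i) for i in range(2)]
solution = solve_stackelberg(LEADER_ACTIONS, follower_payoffs)
print(solution)
```

In this toy setting the leader grants the channel to the higher-weight follower, which transmits while the other stays idle. Existence is trivial here because the action sets are finite; the paper's theorem concerns parameterized policies with a varying number of followers, which this enumeration does not capture.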

Figures (11)

  • Figure 1: Scenario of interest: A BS serves dynamic UEs within an OFDMA system, UEs select RBGs for dPDU transmission, coordinating with the BS via UCMs and DCMs.
  • Figure 2: The workflow of our LLM-empowered MFSG within a TTI.
  • Figure 3: LLM policy action generation with PAG.
  • Figure 4: Performance comparison of throughput and fairness index under various packet arrival rates and network sizes, where $p_a$ of all UEs is set to the same value for each case.
  • Figure 5: Convergence of normalized system-wide utility during training in evaluation environments with varying UE numbers.
  • ...and 6 more figures

Theorems & Definitions (10)

  • Definition 1: Expected Stackelberg Equilibrium [ESE]
  • Definition 2: Differential Stackelberg Equilibrium [DSE]
  • Definition 3: Policy Update Dynamics
  • Theorem 1: Existence Theorem
  • Theorem 2: Convergence Theorem
  • Corollary 1
  • Lemma 1: Weierstrass Theorem
  • Lemma 2: Implicit Function Theorem [IFT]
  • Lemma 3: Theorem 1.20 of Ref. horn2005basic
  • Lemma 4