AURA: Adaptive Unified Reasoning and Automation with LLM-Guided MARL for NextG Cellular Networks
Narjes Nourzad, Mingyu Zong, Bhaskar Krishnamachari
TL;DR
6G/NextG networks demand scalable, low-latency decision-making. LLMs provide strategic planning but add latency, while MARL offers localized adaptability but is hard to coordinate at scale. AURA fuses cloud-based high-level planning with base-station MARL agents, employing a trust mechanism, batched communication, and a Centralized Alignment Controller to align global goals with local actions. In a simulated 6G scenario, LLM-guided MARL reduces dropped handoff requests by more than half and lowers system failures, with agents drawing on LLM input in fewer than 60% of cases, so local adaptability is preserved and latency and hallucination risks are mitigated. Overall, AURA offers a scalable framework that leverages LLM reasoning alongside MARL adaptability for real-time NextG network management.
Abstract
Next-generation (NextG) cellular networks are expected to manage dynamic traffic while sustaining high performance. Large language models (LLMs) provide strategic reasoning for 6G planning, but their computational cost and latency limit real-time use. Multi-agent reinforcement learning (MARL) supports localized adaptation, yet coordination at scale remains challenging. We present AURA, a framework that integrates cloud-based LLMs for high-level planning with base stations modeled as MARL agents for local decision-making. The LLM generates objectives and subgoals from its understanding of the environment and reasoning capabilities, while agents at base stations execute these objectives autonomously, guided by a trust mechanism that balances local learning with external input. To reduce latency, AURA employs batched communication so that agents update the LLM's view of the environment and receive improved feedback. In a simulated 6G scenario, AURA improves resilience, reducing dropped handoff requests by more than half under normal and high traffic and lowering system failures. Agents use LLM input in fewer than 60% of cases, showing that guidance augments rather than replaces local adaptability, thereby mitigating latency and hallucination risks. These results highlight the promise of combining LLM reasoning with MARL adaptability for scalable, real-time NextG network management.
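The abstract does not specify the form of the trust mechanism, so the following is only an illustrative sketch of how an agent might balance its local policy against LLM guidance. It assumes trust is a scalar in [0, 1] used as the probability of following the LLM's suggestion, and that trust is moved toward 1 or 0 depending on whether a followed suggestion yielded a positive reward. The names `TrustGate`, `choose`, `update`, and the action strings are hypothetical and are not taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class TrustGate:
    """Hypothetical per-agent trust gate (an assumption, not AURA's actual rule)."""
    trust: float = 0.5  # assumed initial probability of following LLM guidance
    step: float = 0.1   # assumed update rate for the trust value

    def choose(self, local_action, llm_action, rng):
        """Return (action, used_llm). Falls back to the local action when
        no LLM suggestion is available or the trust draw fails."""
        if llm_action is not None and rng.random() < self.trust:
            return llm_action, True
        return local_action, False

    def update(self, used_llm, reward):
        """Adjust trust only after steps where LLM guidance was followed."""
        if not used_llm:
            return
        target = 1.0 if reward > 0 else 0.0
        self.trust += self.step * (target - self.trust)


# Usage: one decision step for a single base-station agent.
rng = random.Random(0)
gate = TrustGate()
action, used_llm = gate.choose("keep_handoff_queue", "offload_to_neighbor", rng)
gate.update(used_llm, reward=1.0)
```

Under this sketch, an agent with no pending suggestion (for example, between batched LLM updates) always acts on its local policy, which matches the abstract's point that guidance augments rather than replaces local adaptability.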
