Safety and Risk Pathways in Cooperative Generative Multi-Agent Systems: A Telecom Perspective
Zeinab Nezami, Shehr Bano, Abdelaziz Salama, Maryam Hafeez, Syed Ali Raza Zaidi
TL;DR
This work tackles safety challenges in cooperative generative multi-agent systems (GMAS) for telecom, focusing on miscoordination and semantic drift that arise in asynchronous, layered architectures. It introduces a modular safety evaluation framework that pairs agent-level checks (static, policy, and runtime) with system-level metrics, supported by knowledge integration via RAG and GraphRAG and by episodic memory. Using a telecom-grounded GMAS prototype in O-RAN/RIC and an evaluation across 32 persona configurations, 5 questions, and 5 runs, the authors show progressive improvements in analyzer penalties and embedding stability, while also revealing persistent risks under certain persona mixes. The paper contributes an instantiation of a domain-specific GMAS safety framework, empirical insights into how persona design and coding style affect stability, and open-source resources to support reproducibility and broader adoption. Together, these contributions offer a structured approach to auditing and governing telecom GMAS, supporting safer deployment in critical infrastructure.
Abstract
Generative multi-agent systems are rapidly emerging as transformative tools for scalable automation and adaptive decision-making in telecommunications. Despite their promise, these systems introduce novel risks that remain underexplored, particularly when agents operate asynchronously across layered architectures. This paper investigates key safety pathways in telecom-focused Generative Multi-Agent Systems (GMAS), emphasizing risks of miscoordination and semantic drift shaped by persona diversity. We propose a modular safety evaluation framework that integrates agent-level checks on code quality and compliance with system-level safety metrics. Using controlled simulations across 32 persona sets, five questions, and multiple iterative runs, we demonstrate progressive improvements in analyzer penalties and Allocator-Coder consistency, alongside persistent vulnerabilities such as policy drift and variability under specific persona combinations. Our findings provide the first domain-grounded evidence that persona design, coding style, and planning orientation directly influence the stability and safety of telecom GMAS, highlighting both promising mitigation strategies and open risks for future deployment.
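
To give a sense of the scale of the evaluation described above, the following minimal Python sketch enumerates a grid of 32 persona sets, 5 questions, and 5 runs (the run count is taken from the TL;DR). It also shows one plausible way an embedding-based stability score could be computed. The function names `evaluation_grid` and `mean_pairwise_cosine`, and the choice of mean pairwise cosine similarity as the stability measure, are illustrative assumptions and not the authors' stated method.

```python
from itertools import combinations, product
from math import sqrt

# Sizes as reported in the summary; the names are hypothetical.
N_PERSONA_SETS = 32
N_QUESTIONS = 5
N_RUNS = 5


def evaluation_grid(n_persona_sets=N_PERSONA_SETS,
                    n_questions=N_QUESTIONS,
                    n_runs=N_RUNS):
    """Return every (persona_set, question, run) index triple."""
    return list(product(range(n_persona_sets),
                        range(n_questions),
                        range(n_runs)))


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pairwise_cosine(vectors):
    """Assumed stability proxy: mean cosine similarity over all pairs
    of output embeddings for the same persona set and question."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


grid = evaluation_grid()
print(len(grid))  # 32 * 5 * 5 = 800 trials
```

Under this assumed measure, a score close to 1 would mean repeated runs produce nearly identical embeddings, while lower scores would indicate drift between runs.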
