Augmented Runtime Collaboration for Self-Organizing Multi-Agent Systems: A Hybrid Bi-Criteria Routing Approach
Qingwen Yang, Feiyu Qu, Tiezheng Guo, Yanyi Liu, Yingyou Wen
TL;DR
This work tackles task planning in open, decentralized multi-agent systems by introducing BiRouter, a runtime routing method that relies only on local information. BiRouter combines a dual-criteria scoring scheme (ImpScore for long-term relevance and GapScore for contextual cohesion) with a dynamic agent-reputation mechanism to select successors probabilistically, so that globally coherent task pathways emerge without a central planner. A large cross-domain dataset (MARS) was created to train the scoring functions. BiRouter demonstrated superior performance, token efficiency, and robustness in both centralized and SO-MAS settings, including scenarios with unreliable agents. The approach promises scalable, adaptive collaboration for LLM-powered MAS in open environments and lays groundwork for online adaptive optimization and more complex collaboration patterns.
Abstract
LLM-based multi-agent systems have demonstrated significant capabilities across diverse domains. However, their task performance and efficiency are fundamentally constrained by their collaboration strategies. Prevailing approaches rely on static topologies and centralized global planning, a paradigm that limits scalability and adaptability in open, decentralized networks. Effective collaboration planning in distributed systems using only local information thus remains a formidable challenge. To address this, we propose BiRouter, a novel dual-criteria routing method for Self-Organizing Multi-Agent Systems (SO-MAS). This method enables each agent to autonomously execute "next-hop" task routing at runtime, relying solely on local information. Its core decision-making mechanism balances two metrics: (1) the ImpScore, which evaluates a candidate agent's long-term importance to the overall goal, and (2) the GapScore, which assesses its contextual continuity with the current task state. Furthermore, we introduce a dynamically updated reputation mechanism to bolster system robustness in untrustworthy environments, and we have developed a large-scale, cross-domain dataset comprising thousands of annotated task-routing paths to enhance the model's generalization. Extensive experiments demonstrate that BiRouter achieves superior performance and token efficiency over existing baselines, while maintaining strong robustness and effectiveness in information-limited, decentralized, and untrustworthy settings.
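To make the next-hop decision described above more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not give the actual formulas, so everything here is an assumption: the linear mix of ImpScore and GapScore with weight `alpha`, the multiplication by reputation, the softmax with `temperature`, and the exponential-moving-average reputation update, along with all function and parameter names.

```python
import math
import random


def routing_probabilities(candidates, alpha=0.5, temperature=1.0):
    """Turn per-candidate scores into a probability distribution.

    `candidates` maps an agent name to (imp_score, gap_score, reputation).
    The linear mix, reputation weighting, and softmax are assumptions
    made for illustration; the paper may combine these differently.
    """
    weighted = {}
    for name, (imp, gap, reputation) in candidates.items():
        combined = alpha * imp + (1.0 - alpha) * gap  # balance the two criteria
        weighted[name] = reputation * combined        # discount unreliable agents
    # Numerically stable softmax over the reputation-weighted scores.
    top = max(weighted.values())
    exps = {n: math.exp((s - top) / temperature) for n, s in weighted.items()}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}


def select_next_hop(candidates, rng, alpha=0.5, temperature=1.0):
    """Sample one successor agent according to the routing probabilities."""
    probs = routing_probabilities(candidates, alpha, temperature)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


def update_reputation(reputation, success, rate=0.2):
    """Hypothetical update: move reputation toward 1 on success, 0 on failure."""
    target = 1.0 if success else 0.0
    return (1.0 - rate) * reputation + rate * target


if __name__ == "__main__":
    # Scores visible to the current agent only (local information).
    local_view = {
        "coder": (0.9, 0.8, 1.0),
        "reviewer": (0.7, 0.4, 0.9),
        "flaky_agent": (0.9, 0.8, 0.2),
    }
    rng = random.Random(0)
    print(routing_probabilities(local_view))
    print(select_next_hop(local_view, rng))
```

In this sketch, an agent with high scores but low reputation (`flaky_agent`) receives a smaller share of the probability mass than an equally scored, reliable one, which is the qualitative behavior the reputation mechanism is meant to provide.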
