Agent Context Protocols Enhance Collective Inference
Devansh Bhardwaj, Arjun Beniwal, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Karthik R. Narasimhan, Ameet Deshpande, Vishvak Murahari
TL;DR
This work addresses interoperability gaps in multi-agent systems by introducing Agent Context Protocols (ACPs), which pair a persistent Execution Blueprint DAG with standardized inter-agent messages to enable fault-tolerant, long-horizon collective inference. ACPs formalize agent capabilities, task decomposition, and tool orchestration within a domain-agnostic framework, leveraging messages like AGENT_REQUEST, AGENT_RESPONSE, and ASSISTANCE_REQUEST and a suite of standardized error codes. Empirical results show state-of-the-art performance on AssistantBench (28.3% accuracy with domain tools) and best-in-class multimodal report generation, along with ablations highlighting the importance of coordination and fault tolerance. The approach offers a modular, extensible foundation for rapid construction of generalist, domain-adaptable agents with robust error handling and interoperability across diverse tasks.
Abstract
AI agents have become increasingly adept at complex tasks such as coding, reasoning, and multimodal understanding. However, building generalist systems requires moving beyond individual agents to collective inference -- a paradigm where multi-agent systems with diverse, task-specialized agents complement one another through structured communication and collaboration. Today, coordination is usually handled with imprecise, ad-hoc natural language, which limits complex interaction and hinders interoperability with domain-specific agents. We introduce Agent context protocols (ACPs): a domain- and agent-agnostic family of structured protocols for agent-agent communication, coordination, and error handling. ACPs combine (i) persistent execution blueprints -- explicit dependency graphs that store intermediate agent outputs -- with (ii) standardized message schemas, enabling robust and fault-tolerant multi-agent collective inference. ACP-powered generalist systems reach state-of-the-art performance: 28.3 % accuracy on AssistantBench for long-horizon web assistance and best-in-class multimodal technical reports, outperforming commercial AI systems in human evaluation. ACPs are highly modular and extensible, allowing practitioners to build top-tier generalist agents quickly.
