Open Agent Specification (Agent Spec): A Unified Representation for AI Agents

Soufiane Amini, Yassine Benajiba, Cesare Bernardis, Paul Cayet, Hassan Chafi, Abderrahim Fathan, Louis Faucon, Damien Hilloulin, Sungpack Hong, Ingo Kossyk, Tran Minh Son Le, Rhicheek Patra, Sujith Ravi, Jonas Schweizer, Jyotika Singh, Shailender Singh, Weiyi Sun, Kartik Talamadupula, Jerry Xu

TL;DR

Open Agent Specification (Agent Spec) is a framework-agnostic, declarative language that unifies the definition of AI agents and their workflows. By formalizing core components, input/output semantics, and execution flows, it lets an agent be authored once and run on multiple runtimes via adapters, and it provides a standardized evaluation harness for comparing performance, robustness, and efficiency. The authors accompany the spec with a Python SDK (PyAgentSpec), a reference runtime (WayFlow), and adapters for LangGraph, AutoGen, and CrewAI, and demonstrate cross-runtime reproducibility on three benchmarks. The work lays a foundation for portable, reusable, and rigorously evaluated agentic systems, with potential extensions in memory, planning, datastores, and remote agents to support enterprise-scale deployments.

Abstract

The proliferation of agent frameworks has led to fragmentation in how agents are defined, executed, and evaluated. Existing systems differ in their abstractions, data flow semantics, and tool integrations, making it difficult to share or reproduce workflows. We introduce Open Agent Specification (Agent Spec), a declarative language that defines AI agents and agentic workflows in a way that is compatible across frameworks, promoting reusability, portability, and interoperability of AI agents. Agent Spec defines a common set of components, control and data flow semantics, and schemas that allow an agent to be defined once and executed across different runtimes. Agent Spec also introduces a standardized evaluation harness to assess agent behavior and agentic workflows across runtimes, analogous to how HELM and related harnesses standardized LLM evaluation, so that performance, robustness, and efficiency can be compared consistently across frameworks. We demonstrate this using four distinct runtimes (LangGraph, CrewAI, AutoGen, and WayFlow) evaluated over three different benchmarks (SimpleQA Verified, $\tau^2$-Bench, and BIRD-SQL). We provide accompanying toolsets: a Python SDK (PyAgentSpec), a reference runtime (WayFlow), and adapters for popular frameworks (e.g., LangGraph, AutoGen, CrewAI). Agent Spec bridges the gap between model-centric and agent-centric standardization and evaluation, laying the groundwork for reliable, reusable, and portable agentic systems.
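The "define once, execute across runtimes" idea in the abstract can be illustrated with a small sketch. It is not the Agent Spec schema and not the PyAgentSpec API: the `AgentDefinition` fields, the adapter registry, and the runtime names `runtime_a` and `runtime_b` are all hypothetical, chosen only to show a single serialized definition being translated by per-runtime adapters.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


# Hypothetical, simplified stand-in for a declarative agent definition.
# These field names are illustrative and are NOT the Agent Spec schema.
@dataclass
class AgentDefinition:
    name: str
    system_prompt: str
    tools: List[str] = field(default_factory=list)


def to_json(agent: AgentDefinition) -> str:
    """Serialize the definition to a framework-neutral JSON document."""
    return json.dumps(asdict(agent), sort_keys=True)


def from_json(text: str) -> AgentDefinition:
    """Rebuild the definition from its serialized form."""
    return AgentDefinition(**json.loads(text))


# Hypothetical adapter registry: each adapter maps the shared definition
# onto the configuration shape that one (imaginary) runtime expects.
ADAPTERS: Dict[str, Callable[[AgentDefinition], dict]] = {}


def register_adapter(runtime: str):
    def decorator(fn: Callable[[AgentDefinition], dict]):
        ADAPTERS[runtime] = fn
        return fn
    return decorator


@register_adapter("runtime_a")
def load_runtime_a(agent: AgentDefinition) -> dict:
    return {"kind": "graph", "node": agent.name,
            "prompt": agent.system_prompt, "tools": list(agent.tools)}


@register_adapter("runtime_b")
def load_runtime_b(agent: AgentDefinition) -> dict:
    return {"kind": "crew", "role": agent.name,
            "backstory": agent.system_prompt, "tools": list(agent.tools)}


def load_everywhere(serialized: str) -> Dict[str, dict]:
    """Translate one serialized definition for every registered runtime."""
    agent = from_json(serialized)
    return {runtime: adapter(agent) for runtime, adapter in ADAPTERS.items()}
```

The point of the design is that the serialized document is the only shared artifact; each adapter owns the mapping into its runtime's vocabulary, so adding a runtime does not require touching the definition.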

Paper Structure

This paper contains 59 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Position of Agent Spec within the emerging agent-standardization stack. MCP governs resource and data provisioning, A2A/ACP specify inter-agent communication, and Agent Spec defines the declarative layer for agent behavior and execution semantics.
  • Figure 2: Agent Spec's Design.
  • Figure 3: Illustration of input/output exposure in nested components.
  • Figure 4: Control- and data-flow relationships in Agent Spec. Solid lines denote control-flow transitions, while dotted lines indicate data propagation.
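The distinction drawn in the Figure 4 caption, between control-flow transitions and data propagation, can be sketched as two separate edge sets over the same steps. This is a minimal illustration under stated assumptions, not Agent Spec's actual component model: `DataEdge`, `run_flow`, and the step names are hypothetical, and the control flow is assumed to be linear and acyclic (no branching or loops).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

Step = Callable[[Dict[str, Any]], Dict[str, Any]]


# Hypothetical data-flow edge: routes one named output of a step
# to one named input of another step.
@dataclass(frozen=True)
class DataEdge:
    source_step: str
    source_output: str
    target_step: str
    target_input: str


def run_flow(
    steps: Dict[str, Step],
    control_edges: Dict[str, str],   # step name -> next step (control flow)
    data_edges: List[DataEdge],      # value routing (data flow)
    start: str,
    initial_inputs: Dict[str, Any],
) -> Tuple[List[str], Dict[str, Any]]:
    """Execute steps in control-flow order, propagating values along data edges."""
    pending: Dict[str, Dict[str, Any]] = {name: {} for name in steps}
    pending[start].update(initial_inputs)
    trace: List[str] = []
    outputs: Dict[str, Any] = {}
    current: Optional[str] = start
    while current is not None:
        outputs = steps[current](pending[current])
        trace.append(current)
        # Data flow: copy produced values to the inputs of downstream steps.
        for edge in data_edges:
            if edge.source_step == current:
                pending[edge.target_step][edge.target_input] = outputs[edge.source_output]
        # Control flow: decide which step runs next (None ends the flow).
        current = control_edges.get(current)
    return trace, outputs
```

Keeping the two edge sets separate means the order in which steps run can change without rewiring which values they consume, and vice versa.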