DIG to Heal: Scaling General-purpose Agent Collaboration via Explainable Dynamic Decision Paths

Hanqing Yang, Hyungwoo Lee, Yuhang Yao, Zhiwei Liu, Kay Liu, Jingdi Chen, Carlee Joe-Wong

TL;DR

This work introduces the Dynamic Interaction Graph (DIG), which captures emergent collaboration as a time-evolving causal network of agent activations and interactions. DIG enables real-time identification, explanation, and correction of collaboration-induced error patterns directly from agents' collaboration paths.

Abstract

The increasingly popular agentic AI paradigm promises to harness the power of multiple, general-purpose large language model (LLM) agents to collaboratively complete complex tasks. While many agentic AI systems utilize predefined workflows or agent roles in order to reduce complexity, ideally these agents would be truly autonomous, able to achieve emergent collaboration even as the number of collaborating agents increases. Yet in practice, such unstructured interactions can lead to redundant work and cascading failures that are difficult to interpret or correct. In this work, we study multi-agent systems composed of general-purpose LLM agents that operate without predefined roles, control flow, or communication constraints, relying instead on emergent collaboration to solve problems. We introduce the Dynamic Interaction Graph (DIG), which captures emergent collaboration as a time-evolving causal network of agent activations and interactions. DIG makes emergent collaboration observable and explainable for the first time, enabling real-time identification, explanation, and correction of collaboration-induced error patterns directly from agents' collaboration paths. Thus, DIG fills a critical gap in understanding how general LLM agents solve problems together in truly agentic multi-agent systems. The project webpage can be found at: https://happyeureka.github.io/dig.

Paper Structure

This paper contains 26 sections, 1 theorem, 18 equations, 16 figures, and 5 tables.

Key Result

Theorem 4.1

Fix a class $\mathcal{C}$ of cooperative multi-agent systems whose observable semantics are defined by asynchronous activations and event passing as in Section \ref{sec:problem_formulation}. Let $\mathcal{T}$ be the space of observable execution traces and let $\Psi:\mathcal{T}\to\{G(t)\}_{t\in\mathbb{Z}}

Figures (16)

  • Figure 1: Cooperative problem solving with general-purpose agents. A pool of autonomous agents works on a shared task without predefined roles, control flow, or communication constraints, each operating independently and non-deterministically and interacting through emergent cooperative strategies. We make emergent cooperation analyzable by modeling agent interaction in a protocol-agnostic manner.
  • Figure 2: Dynamic Interaction Graph (DIG) and local graph rewrite operators. (1-2) Bipartite graph structure with agent activation nodes (circles) and event nodes (rectangles). Each activation node may consume a set of input events and generate a set of output events. Each event has a single source activation and may be delivered to multiple downstream agents. (3) Canonical edge rewrite operators RESPOND, WAIT, REROUTE, DISCARD, and SUBMIT, defining how activations transform incoming delivery edges and produce new events. (4) Raw DIG induced by an execution trace, including all generation and delivery edges. (5) Cleaned DIG after removing the edges that correspond to non-productive interactions, exposing the underlying interaction structure.
  • Figure 3: Illustration of the structural failure patterns in DIG defined in Table \ref{tab:dig_error_taxonomy}, showing how coordination invariant violations manifest as interaction structure.
  • Figure 4: MAS + DIG (3 agents): Real-time DIG showing agent activations, event propagation, and edge-level rewrites. The activation timeline indicates stable coordination and successful task termination.
  • Figure 5: MAS + LLM Judge: DIG trace showing unstable coordination, excessive waiting and rerouting, and delayed interventions, leading to inefficient execution and unreliable termination.
  • ...and 11 more figures
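
The bipartite structure described in the Figure 2 caption can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `DIG`, its method names, and the helper `build_example` are hypothetical, and the choice to treat WAIT and DISCARD as the non-productive operators is an assumption made only so that the cleaning step has something to remove. The caption itself states only that activations consume and generate events, that each event has a single source activation, and that the cleaned DIG drops non-productive edges.

```python
from dataclasses import dataclass, field
from enum import Enum


class Rewrite(Enum):
    """The five edge rewrite operators named in the Figure 2 caption."""
    RESPOND = "RESPOND"
    WAIT = "WAIT"
    REROUTE = "REROUTE"
    DISCARD = "DISCARD"
    SUBMIT = "SUBMIT"


# Assumption for illustration only: the caption does not say which
# operators are non-productive; WAIT and DISCARD are picked here.
NON_PRODUCTIVE = frozenset({Rewrite.WAIT, Rewrite.DISCARD})


@dataclass
class DIG:
    """Hypothetical bipartite graph of activation nodes and event nodes."""
    activations: dict = field(default_factory=dict)  # activation id -> agent name
    source_of: dict = field(default_factory=dict)    # event id -> source activation id
    deliveries: list = field(default_factory=list)   # (event id, activation id, Rewrite)

    def activate(self, act_id, agent):
        """Add an activation node for the given agent."""
        self.activations[act_id] = agent

    def emit(self, act_id, event_id):
        """Add a generation edge; each event has exactly one source activation."""
        if act_id not in self.activations:
            raise KeyError(f"unknown activation {act_id!r}")
        if event_id in self.source_of:
            raise ValueError(f"event {event_id!r} already has a source")
        self.source_of[event_id] = act_id

    def deliver(self, event_id, act_id, rewrite):
        """Add a delivery edge labeled with how the consumer handled the event."""
        if event_id not in self.source_of or act_id not in self.activations:
            raise KeyError("unknown event or activation")
        self.deliveries.append((event_id, act_id, rewrite))

    def cleaned(self):
        """Return a copy without delivery edges labeled as non-productive."""
        kept = [d for d in self.deliveries if d[2] not in NON_PRODUCTIVE]
        return DIG(dict(self.activations), dict(self.source_of), kept)


def build_example():
    """Three agents: A emits e1, B responds with e2, C waits, A submits."""
    g = DIG()
    g.activate("a1", "agent_A")
    g.emit("a1", "e1")
    g.activate("b1", "agent_B")
    g.activate("c1", "agent_C")
    g.deliver("e1", "b1", Rewrite.RESPOND)  # one event, two downstream agents
    g.deliver("e1", "c1", Rewrite.WAIT)
    g.emit("b1", "e2")
    g.activate("a2", "agent_A")
    g.deliver("e2", "a2", Rewrite.SUBMIT)
    return g
```

Calling `build_example().cleaned()` keeps the RESPOND and SUBMIT deliveries and drops the WAIT delivery, which mirrors the raw-versus-cleaned distinction in panels (4) and (5). Storing the rewrite label on the delivery edge is one possible encoding; the paper may represent rewrites differently.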

Theorems & Definitions (2)

  • Theorem 4.1: Topological Inference Reduction
  • proof