AgentXRay: White-Boxing Agentic Systems via Workflow Reconstruction

Ruijie Shi, Houbin Zhang, Yuecheng Han, Yuheng Wang, Jingru Fan, Runde Yang, Yufan Dang, Huatao Li, Dewen Liu, Yuan Cheng, Chen Qian

TL;DR

This work addresses the opacity of agentic systems through Agentic Workflow Reconstruction (AWR), a task that aims to synthesize an explicit white-box surrogate workflow from input–output data. It introduces AgentXRay, an MCTS-based framework that searches a unified space of agent and tool primitives and applies a dynamic Red-Black Pruning strategy to manage the combinatorial explosion. Across five diverse domains, AgentXRay achieves higher proxy fidelity (Static Functional Equivalence) than behavior cloning and unpruned baselines, while improving search efficiency and enabling deeper exploration under fixed budgets. The results suggest that editable, interpretable workflows can approximate complex black-box agentic systems, offering a practical path toward transparency, debugging, and reuse. Open questions remain about richer workflow graphs and evaluator design for broader domains.

Abstract

Large Language Models have shown strong capabilities in complex problem solving, yet many agentic systems remain difficult to interpret and control due to opaque internal workflows. While some frameworks offer explicit architectures for collaboration, many deployed agentic systems operate as black boxes to users. We address this by introducing Agentic Workflow Reconstruction (AWR), a new task aiming to synthesize an explicit, interpretable stand-in workflow that approximates a black-box system using only input-output access. We propose AgentXRay, a search-based framework that formulates AWR as a combinatorial optimization problem over discrete agent roles and tool invocations in a chain-structured workflow space. Unlike model distillation, AgentXRay produces editable white-box workflows that match target outputs under an observable, output-based proxy metric, without accessing model parameters. To navigate the vast search space, AgentXRay employs Monte Carlo Tree Search enhanced by a scoring-based Red-Black Pruning mechanism, which dynamically integrates proxy quality with search depth. Experiments across diverse domains demonstrate that AgentXRay achieves higher proxy similarity and reduces token consumption compared to unpruned search, enabling deeper workflow exploration under fixed iteration budgets.
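
To make the search formulation more concrete, the following is a minimal Python sketch of a tree search over chain-structured workflows with a depth-aware pruning rule. It is not the authors' implementation: the primitive names, the toy `proxy_score` (which compares a candidate chain to a hidden target chain instead of comparing generated outputs), the red/black labelling convention, and the threshold rule `score < threshold * depth / MAX_DEPTH` are all assumptions made for illustration. Only the overall shape (UCB-guided selection, expansion by one primitive, pruning that combines proxy quality with depth) follows the description in the abstract.

```python
import math
import random
from dataclasses import dataclass, field

# Hypothetical primitives and a hidden target chain (illustrative only).
PRIMITIVES = ["planner", "coder", "tester", "search_tool"]
TARGET = ("planner", "coder", "tester")
MAX_DEPTH = len(TARGET)


def proxy_score(chain):
    """Toy proxy: fraction of target positions the chain reproduces.
    A real evaluator would compare workflow outputs with black-box outputs."""
    return sum(a == b for a, b in zip(chain, TARGET)) / len(TARGET)


@dataclass
class Node:
    chain: tuple
    parent: "Node | None" = None
    children: dict = field(default_factory=dict)
    visits: int = 0
    value: float = 0.0
    pruned: bool = False  # assumed convention: True = "red", False = "black"


def ucb(child, parent_visits, c=1.4):
    """Standard UCB1 score: mean value plus an exploration bonus."""
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations=200, threshold=0.5, seed=0):
    rng = random.Random(seed)
    root = Node(chain=())
    best_chain, best_score = (), 0.0
    for _ in range(iterations):
        node = root
        while len(node.chain) < MAX_DEPTH:
            untried = [p for p in PRIMITIVES if p not in node.children]
            if untried:  # expansion: append one new primitive to the chain
                p = rng.choice(untried)
                node.children[p] = Node(chain=node.chain + (p,), parent=node)
                node = node.children[p]
                break
            live = [ch for ch in node.children.values() if not ch.pruned]
            if not live:  # every child was pruned; evaluate this node again
                break
            parent_visits = node.visits
            node = max(live, key=lambda ch: ucb(ch, parent_visits))
        score = proxy_score(node.chain)
        # Assumed depth-aware rule: deeper nodes need a higher proxy score.
        if score < threshold * len(node.chain) / MAX_DEPTH:
            node.pruned = True
        if score > best_score:
            best_chain, best_score = node.chain, score
        while node is not None:  # backpropagation
            node.visits += 1
            node.value += score
            node = node.parent
    return best_chain, best_score, root


if __name__ == "__main__":
    chain, score, _ = search()
    print(chain, score)
```

In this toy setting, pruned ("red") nodes are never selected again, so the iteration budget is spent on deeper extensions of promising prefixes. That is the intuition behind the claim that pruning enables deeper exploration under a fixed budget.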

Paper Structure (20 sections, 7 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: The concept of AWR. Given a black-box system $\mathcal{M}_{\text{black}}$ producing output $o^*$ from input $\tau$, the goal is to synthesize an explicit, interpretable white-box workflow $\mathcal{W}^*$ (e.g., a sequence of specialized agents) that matches the target outputs under an observable, output-based proxy metric, using only input-output pairs.
  • Figure 2: Overview of the AgentXRay framework. The process takes task inputs and black-box outputs, searches for a high-scoring primitive sequence via MCTS with Red-Black Pruning, and returns an interpretable white-box workflow.
  • Figure 3: Cost-Efficiency analysis across five domains. The horizontal axis denotes reconstruction similarity (higher is better), while the vertical axis represents token consumption in millions (lower is better). Our method achieves comparable or higher fidelity than unpruned variants but with reduced computational overhead.
  • Figure 4: Convergence analysis on ChatDev and SCI. Pruned variants converge earlier, reaching strong candidates at lower token cost than unpruned search.