XAgen: An Explainability Tool for Identifying and Correcting Failures in Multi-Agent Workflows

Xinru Wang; Ming Yin; Eunyee Koh; Mustafa Doga Dogan

TL;DR

XAgen is an explainability tool that supports users with varying AI expertise through three core capabilities: log visualization for glanceable workflow understanding, human-in-the-loop feedback to capture expert judgment, and automatic error detection via an LLM-as-a-judge.

Abstract

As multi-agent systems powered by Large Language Models (LLMs) are increasingly adopted in real-world workflows, users with diverse technical backgrounds are now building and refining their own agentic processes. However, these systems can fail in opaque ways, making it difficult for users to observe, understand, and correct errors. We conducted formative interviews with 12 practitioners to identify mismatches between existing debugging tools and users' needs. Based on these insights, we designed XAgen, an explainability tool that supports users with varying AI expertise through three core capabilities: log visualization for glanceable workflow understanding, human-in-the-loop feedback to capture expert judgment, and automatic error detection via an LLM-as-a-judge. In a user study with 8 participants, XAgen helped users locate failures more easily, attribute them to specific agents or steps, and iteratively improve configurations. Our findings surface human-centered design guidelines for explainable agentic AI development and highlight opportunities for more context-aware interactive debugging.
