Understanding and Optimizing Agentic Workflows via Shapley value
Yingxuan Yang, Bo Huang, Siyuan Qi, Chao Feng, Haoyi Hu, Yuxuan Zhu, Jinbo Hu, Haoran Zhao, Ziyi He, Xiao Liu, Muning Wen, Zongyu Wang, Lin Qiu, Xuezhi Cao, Xunliang Cai, Yong Yu, Weinan Zhang
TL;DR
ShapleyFlow introduces a principled, game-theoretic framework for analyzing and optimizing agentic workflows: it treats workflow components as players in a cooperative game and applies the Shapley value to attribute performance across all coalitions. By evaluating all 2^n configurations of an n-component workflow, the framework (a) provides fine-grained component attribution, (b) uncovers synergistic interactions between components, and (c) enables task-specific optimal workflow discovery. The authors build CapaBench, a benchmark of over 1,500 tasks across seven domains, and demonstrate that task-specific configurations consistently outperform single-LLM baselines in every domain. They show that the attribution is robust to the choice of baseline model and validate the Shapley-based insights against independent evaluations, yielding actionable design guidelines for workflow optimization across domains. Overall, ShapleyFlow shifts workflow evaluation from black-box end-task performance toward principled, interpretable optimization of multi-component AI systems.
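For reference, the attribution described above rests on the standard Shapley value from cooperative game theory. The notation here is an assumption for illustration and is not taken from the paper: N is the set of n workflow components, S is a coalition of components, and v(S) is the task performance measured for the configuration corresponding to S.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(n - |S| - 1\bigr)!}{n!}\,
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr],
\qquad
\sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\emptyset).
```

The second identity (efficiency) says that the per-component attributions sum exactly to the gap between the full configuration and the empty one, which is what makes the attribution an exact decomposition of the end-task score.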
Abstract
Agentic workflows have become the dominant paradigm for building complex AI systems, orchestrating specialized components, such as planning, reasoning, action execution, and reflection, to tackle sophisticated real-world tasks. However, systematically analyzing and optimizing these workflows remains challenging due to intricate component interdependencies and the lack of principled attribution methods. In this work, we introduce ShapleyFlow, the first framework that employs cooperative game theory to analyze and optimize agentic workflows. By applying the Shapley value to evaluate all possible component configurations, ShapleyFlow enables fine-grained attribution of each component's contribution and facilitates the identification of task-specific optimal configurations. Using a constructed dataset that spans 7 scenarios, including navigation, math, and OS, we make 3 key contributions: (1) Theoretical Framework: a principled game-theoretic approach to attributing contributions in agentic workflows. (2) Optimal Workflow Discovery: ShapleyFlow identifies task-specific component configurations that consistently outperform workflows relying on a single LLM across all tested tasks. (3) Comprehensive Analysis: we construct and analyze over 1,500 tasks, providing actionable insights and design guidelines for optimizing workflows across multiple domains.
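As a rough illustration of the idea in the abstract, the sketch below computes exact Shapley values over all 2^n configurations of a four-component workflow and then picks the best-scoring configuration. It is not the authors' implementation: the function names `shapley_values` and `toy_score`, the coalition encoding (a coalition is the set of components that are "switched on"), and every number in the score table are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations
from math import factorial

# Component names follow the abstract; everything else here is hypothetical.
COMPONENTS = ["planning", "reasoning", "action", "reflection"]


def toy_score(coalition):
    """Made-up task score for a configuration (NOT data from the paper)."""
    gains = {"planning": 0.10, "reasoning": 0.20,
             "action": 0.15, "reflection": -0.02}
    score = 0.20 + sum(gains[c] for c in coalition)
    # Toy synergy: planning and reasoning help more when used together.
    if "planning" in coalition and "reasoning" in coalition:
        score += 0.10
    return score


def shapley_values(players, value):
    """Exact Shapley values given a table mapping each frozenset to a score."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k is the size of the coalition without p
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value[s | {p}] - value[s])
        phi[p] = total
    return phi


# Evaluate all 2^n configurations once and store the scores.
scores = {
    frozenset(subset): toy_score(subset)
    for k in range(len(COMPONENTS) + 1)
    for subset in combinations(COMPONENTS, k)
}

phi = shapley_values(COMPONENTS, scores)
best = max(scores, key=scores.get)  # highest-scoring configuration

for name in COMPONENTS:
    print(f"{name:>10}: {phi[name]:+.3f}")
print("best configuration:", sorted(best))
```

In this toy table the synergy bonus is split evenly between planning and reasoning, and the component with a negative contribution is dropped from the best configuration. Exact enumeration costs 2^n evaluations, which is practical only for a small number of components such as the handful named in the abstract.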
