Understanding and Optimizing Agentic Workflows via Shapley value

Yingxuan Yang, Bo Huang, Siyuan Qi, Chao Feng, Haoyi Hu, Yuxuan Zhu, Jinbo Hu, Haoran Zhao, Ziyi He, Xiao Liu, Muning Wen, Zongyu Wang, Lin Qiu, Xuezhi Cao, Xunliang Cai, Yong Yu, Weinan Zhang

TL;DR

ShapleyFlow introduces a principled, game-theoretic framework for analyzing and optimizing agentic workflows: it treats workflow components as cooperative players and applies the Shapley value to attribute performance across all coalitions. By evaluating all $2^n$ configurations of an $n$-component workflow, the framework (a) provides fine-grained component attribution, (b) uncovers synergistic interactions, and (c) enables task-specific optimal workflow discovery. The authors build CapaBench, a benchmark of over 1,500 tasks across seven domains, and demonstrate that task-specific configurations consistently outperform single-LLM baselines across those domains. They show that the attribution is robust to the choice of baseline model and validate the Shapley-based insights against independent evaluations, yielding actionable design guidelines for workflow optimization across domains. Overall, ShapleyFlow shifts workflow evaluation from black-box end-task performance toward principled, interpretable optimization of multi-component AI systems.
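For reference, the attribution described above rests on the standard Shapley value from cooperative game theory. The notation below is an assumption made for illustration and may differ from the paper's own equations: $N$ is the set of $n$ workflow components, $S$ is a coalition of components, and $v(S)$ is the performance score obtained by that coalition.

$$
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)
$$

Each term is the marginal gain from adding component $i$ to coalition $S$, weighted by the fraction of orderings of $N$ in which exactly the members of $S$ precede $i$. The values satisfy $\sum_{i \in N} \phi_i(v) = v(N) - v(\emptyset)$, so the attributions account for the full performance gap between the complete and the empty coalition.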

Abstract

Agentic workflows have become the dominant paradigm for building complex AI systems, orchestrating specialized components, such as planning, reasoning, action execution, and reflection, to tackle sophisticated real-world tasks. However, systematically analyzing and optimizing these workflows remains challenging due to intricate component interdependencies and the lack of principled attribution methods. In this work, we introduce ShapleyFlow, the first framework that employs cooperative game theory to analyze and optimize agentic workflows. By applying the Shapley value to evaluate all possible component configurations, ShapleyFlow enables fine-grained attribution of each component's contribution and facilitates the identification of task-specific optimal configurations. Through a constructed dataset evaluated across 7 scenarios, such as navigation, math and OS, we demonstrate 3 key contributions: (1) Theoretical Framework: a principled game-theoretic approach for the attribution of contributions in agentic workflows. (2) Optimal Workflow Discovery: ShapleyFlow identifies task-specific component configurations that consistently outperform workflows relying on a single LLM across all tested tasks. (3) Comprehensive Analysis: we construct and analyze over 1,500 tasks, providing actionable insights and design guidelines for optimizing workflows across multiple domains.
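To make the attribution step concrete, here is a minimal sketch of an exact Shapley computation over the four components named in the abstract. It is not the authors' implementation: the function `shapley_values`, the scoring function `toy_score`, and every number in it are hypothetical, chosen only so the result can be checked by hand. In a real study, the score of a coalition would come from running the workflow on benchmark tasks.

```python
from itertools import combinations
from math import factorial

COMPONENTS = ("Planning", "Reasoning", "Action", "Reflection")

def shapley_values(players, value):
    """Exact Shapley values; `value` maps a frozenset of players to a score."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition that p joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p to coalition s
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

def toy_score(coalition):
    """Hypothetical score: additive gains plus one pairwise synergy (made-up numbers)."""
    gains = {"Planning": 0.10, "Reasoning": 0.25, "Action": 0.15, "Reflection": 0.05}
    score = 0.20 + sum(gains[c] for c in coalition)
    if {"Planning", "Reasoning"} <= coalition:
        score += 0.10  # assumed synergy between Planning and Reasoning
    return score

phi = shapley_values(COMPONENTS, toy_score)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy game the synergy bonus is split evenly between Planning and Reasoning, and the four values sum to the gap between the full and the empty coalition. With $n = 4$, exhaustive evaluation needs only 16 coalition scores, which is why an exact computation is practical here.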

Paper Structure

This paper contains 29 sections, 4 equations, 12 figures, 6 tables, 1 algorithm.

Figures (12)

  • Figure 1: ShapleyFlow framework for agentic workflow analysis and optimization. The left panel shows a typical agentic workflow with four core components (Planning, Reasoning, Action, Reflection) orchestrated through single-turn and multi-turn interactions. The middle panel illustrates our game-theoretic formulation, where workflow components are modeled as cooperative players with coalition outcomes mapped to performance scores. The right panel demonstrates exhaustive evaluation across all possible workflow configurations ($2^4 = 16$), enabling Shapley value computation for principled component attribution and optimization guidance.
  • Figure 2: A vanilla agentic workflow with 4 components.
  • Figure 3: Results of all combinations in Math (Algebra) for Claude-3.5-Sonnet under different configurations. The pattern of the bars indicates the number of components (ranging from 0 to 4) that Claude is involved in.
  • Figure 4: Planning
  • Figure 5: Reasoning
  • ...and 7 more figures
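
The exhaustive evaluation mentioned in the Figure 1 caption, and the grouping by number of components in the Figure 3 caption, can be sketched as follows. This assumes that a configuration assigns each component either the model under study or a baseline model. The labels `"target"` and `"baseline"` and the helper `enumerate_configurations` are hypothetical names used for illustration.

```python
from itertools import product

COMPONENTS = ("Planning", "Reasoning", "Action", "Reflection")

def enumerate_configurations(components, target="target", baseline="baseline"):
    """Return every assignment of target or baseline model to each component."""
    return [
        dict(zip(components, choice))
        for choice in product((baseline, target), repeat=len(components))
    ]

configs = enumerate_configurations(COMPONENTS)

# Group configurations by how many components use the target model (0 to 4).
by_count = {}
for cfg in configs:
    k = sum(1 for model in cfg.values() if model == "target")
    by_count.setdefault(k, []).append(cfg)

print(len(configs))
print({k: len(v) for k, v in sorted(by_count.items())})
```

Each configuration would then be scored on the benchmark tasks, and the resulting 16 scores are the coalition values that a Shapley computation consumes.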