Axiomatic On-Manifold Shapley via Optimal Generative Flows

Cenwei Zhang, Lin Zhu, Manxi Lin, Lei You

TL;DR

We propose a formal theory of on-manifold Aumann-Shapley attributions driven by optimal generative flows, and prove a representation theorem establishing the gradient line integral as the unique functional satisfying efficiency and geometric axioms, notably reparameterization invariance.

Abstract

Shapley-based attribution is critical for post-hoc XAI but suffers from off-manifold artifacts due to heuristic baselines. While generative methods attempt to address this, they often introduce geometric inefficiency and discretization drift. We propose a formal theory of on-manifold Aumann-Shapley attributions driven by optimal generative flows. We prove a representation theorem establishing the gradient line integral as the unique functional satisfying efficiency and geometric axioms, notably reparameterization invariance. To resolve path ambiguity, we select the kinetic-energy-minimizing Wasserstein-2 geodesic transporting a prior to the data distribution. This yields a canonical attribution family that recovers classical Shapley for additive models and admits provable stability bounds against flow approximation errors. By reframing baseline selection as a variational problem, our method experimentally outperforms baselines, achieving strict manifold adherence via vanishing Flow Consistency Error and superior semantic alignment characterized by Structure-Aware Total Variation. Our code is available at https://github.com/cenweizhang/OTFlowSHAP.
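
For orientation, the "gradient line integral" named in the abstract can be written in its standard path-integrated form. The notation below ($A_i$, $\gamma$, $\varphi$) is an assumption chosen for illustration and may differ from the paper's own statement. The efficiency identity follows from the chain rule and the fundamental theorem of calculus, and reparameterization invariance follows from a change of variables.

```latex
% Standard gradient line integral along a C^1 path \gamma : [0,1] \to \mathbb{R}^d
% (illustrative notation; the paper's exact formulation may differ).
A_i(f,\gamma) \;=\; \int_0^1 \frac{\partial f}{\partial x_i}\bigl(\gamma(t)\bigr)\,\dot\gamma_i(t)\,dt ,
\qquad i = 1,\dots,d .

% Efficiency: the attributions sum to the change in f along the path.
\sum_{i=1}^{d} A_i(f,\gamma)
  \;=\; \int_0^1 \frac{d}{dt}\, f\bigl(\gamma(t)\bigr)\,dt
  \;=\; f\bigl(\gamma(1)\bigr) - f\bigl(\gamma(0)\bigr) .

% Reparameterization invariance: for any increasing C^1 bijection
% \varphi : [0,1] \to [0,1], substituting s = \varphi(t) gives
A_i(f,\gamma\circ\varphi)
  \;=\; \int_0^1 \frac{\partial f}{\partial x_i}\bigl(\gamma(\varphi(t))\bigr)\,
        \dot\gamma_i\bigl(\varphi(t)\bigr)\,\varphi'(t)\,dt
  \;=\; A_i(f,\gamma) .
```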

Paper Structure (60 sections, 4 theorems, 35 equations, 11 figures, 4 tables)

Key Result

Theorem 3.3

Fix a $C^1$ path $\gamma$. Consider any attribution rule $A_i(f,\gamma)$ satisfying Efficiency, Linearity, Dummy, Symmetry, Locality, and Reparameterization Invariance. Assume $f\mapsto A_i(f,\gamma)$ is continuous w.r.t. uniform convergence of $f$ and $\nabla f$. Then, for every coordinate $i$ and

Figures (11)

  • Figure 1: Overview of Canonical On-Manifold Shapley via Optimal Flows. Our framework computes the unique axiomatic attribution $\Psi$ (Definition 4.1) by integrating the model gradient $\nabla_x f_c$ along the optimal transport path $\gamma^*$ (red curve). As shown by the top samples, $\gamma^*$ remains on the data manifold $p_1$ throughout the transition from the reference distribution $p_0$ to the data. Unlike heuristic methods, this path is geometrically optimal, ensuring stable and principled explanations for the target logit $f_c$.
  • Figure 2: Geometric Straightness Implies Explanation Stability. (a) Qualitative Consistency: We visualize attribution maps for the same input across distinct seeds. The Reflowed Shapley (2-RF) yields robust, structure-aligned explanations, whereas the One-Step Baseline (1-RF) exhibits minor fluctuations due to residual trajectory curvature. (b) Quantitative Correlation: A scatter plot of Transport Cost (Kinetic Energy) vs. Structural Consistency (SSIM) reveals a clear Pareto frontier. Our method (green) clusters in the low-energy, high-stability regime, visually confirming that minimizing the kinetic action of the generative path effectively filters out stochastic instability in attributions.
  • Figure 3: Empirical Verification of Stability Bounds. Scatter plot of Relative Attribution Error vs. Flow Approximation Error across test samples. Each trajectory represents a single sample evolving from early training stages (high error) to convergence (low error). The tight linear correlation confirms that the attribution error scales predictably with model quality. This implies that improving the generative backbone yields a guaranteed, proportional gain in explanation fidelity, empirically validating Theorem 4.3.
  • Figure 4: Qualitative Visualization Results. Top (CIFAR-10): Traditional methods (IG, GradientSHAP) produce scattered noise, whereas the Geodesic Flow method yields coherent object masks. Bottom (CelebA-HQ): In high dimensions, our method captures fine-grained details (e.g., beard, nose, jaw, eyes) without the over-smoothing artifacts observed in DDIM. Overall, the comparison shows that our attributions agree more closely with the structure visible in the images.
  • Figure 5: Validation on a Synthetic Additive Model. (Top-left) Attribution identity between the analytical Shapley values (ground truth) and the straight-line path-integral estimator with midpoint quadrature ($K{=}200$), showing near-perfect alignment ($y{=}x$). (Top-right) Relative $\ell_2$ error versus integration steps $K$ (log-log), matching the expected $O(K^{-2})$ convergence of the midpoint rule. (Bottom-left) Residual histogram at $K{=}200$, exhibiting near zero-mean residuals consistent with numerical discretization error. (Bottom-right) Identity plots across different $K$, illustrating progressively tighter alignment as $K$ increases.
  • ...and 6 more figures
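
The additive-model check described in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the additive model `g`, its gradient `grad_f`, the endpoints `x0` and `x1`, and the helper `path_attribution` are all hypothetical names introduced here. It uses a straight-line path with midpoint quadrature and compares the result with the closed-form attribution $g_i(x_i) - g_i(x^0_i)$ that an additive model admits.

```python
import numpy as np

def g(x):
    # Hypothetical per-coordinate components of an additive model
    # f(x) = sum_i g(x_i), with g(u) = sin(u) + u^2.
    return np.sin(x) + x**2

def grad_f(x):
    # Gradient of the additive model; coordinate i depends only on x_i.
    return np.cos(x) + 2.0 * x

def path_attribution(grad, x0, x1, K=200):
    # Midpoint-rule estimate of the gradient line integral along the
    # straight-line path gamma(t) = x0 + t * (x1 - x0), t in [0, 1].
    t = (np.arange(K) + 0.5) / K                # midpoint nodes
    points = x0 + t[:, None] * (x1 - x0)        # shape (K, d)
    return (x1 - x0) * grad(points).mean(axis=0)

# Illustrative endpoints (assumed values, not taken from the paper).
x0 = np.array([0.0, -1.0, 0.5, 2.0])
x1 = np.array([1.0, 0.5, -0.5, 1.0])

# For an additive model the exact attribution is g(x1_i) - g(x0_i).
exact = g(x1) - g(x0)

for K in (10, 100, 200):
    approx = path_attribution(grad_f, x0, x1, K)
    rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
    print(f"K={K:4d}  relative l2 error={rel_err:.3e}")
```

Increasing `K` tenfold should shrink the error by roughly a factor of one hundred, which is the $O(K^{-2})$ behaviour of the midpoint rule that the caption refers to. The estimates also sum to approximately $f(x^1) - f(x^0)$, the efficiency property.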

Theorems & Definitions (7)

  • Definition 3.1: Path-based attribution rule
  • Definition 3.2: Flow-based Aumann-Shapley attribution
  • Theorem 3.3: Uniqueness on a fixed path
  • Definition 4.1: Canonical flow-based Shapley attribution
  • Theorem 4.2: Canonicality via optimal flows
  • Theorem 4.3: Stability bounds
  • Proposition 3.1: Agreement on additive models