Exact Functional ANOVA Decomposition for Categorical Inputs Models

Baptiste Ferrere, Nicolas Bousquet, Fabrice Gamboa, Jean-Michel Loubes, Joseph Muré

TL;DR

By bridging functional analysis with an extension of discrete Fourier analysis, this work derives a closed-form functional ANOVA decomposition for categorical inputs that requires no assumption on their joint distribution. The decomposition recovers the classical independent case and extends to arbitrary dependence structures, including distributions with non-rectangular support.

Abstract

Functional ANOVA offers a principled framework for interpretability by decomposing a model's prediction into main effects and higher-order interactions. For independent features, this decomposition is well-defined, strongly linked with SHAP values, and serves as a cornerstone of additive explainability. However, the lack of an explicit closed-form expression for general dependent distributions has forced practitioners to rely on costly sampling-based approximations. We completely resolve this limitation for categorical inputs. By bridging functional analysis with the extension of discrete Fourier analysis, we derive a closed-form decomposition without any assumption. Our formulation is computationally very efficient. It seamlessly recovers the classical independent case and extends to arbitrary dependence structures, including distributions with non-rectangular support. Furthermore, leveraging the intrinsic link between SHAP and ANOVA under independence, our framework yields a natural generalization of SHAP values for the general categorical setting.
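
To make the classical independent case concrete, here is a minimal sketch of the textbook functional ANOVA decomposition for a function tabulated on a finite grid of categorical inputs, together with the Shapley values it induces under independence. It is not the paper's closed-form method for dependent inputs. The helper names (`cond_mean`, `anova_terms`, `shapley_from_anova`) and the random toy function are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def cond_mean(f, marginals, B):
    """E[f(X) | X_B = x_B] for independent inputs.

    Axes outside B are averaged out and kept with size 1, so the
    result broadcasts against the full table f.
    """
    out = f
    for j in range(f.ndim):
        if j not in B:
            w = marginals[j].reshape([-1 if a == j else 1 for a in range(f.ndim)])
            out = (out * w).sum(axis=j, keepdims=True)
    return out


def anova_terms(f, marginals):
    """Classical ANOVA terms via inclusion-exclusion over subsets B of A."""
    d = f.ndim
    terms = {}
    for size in range(d + 1):
        for A in combinations(range(d), size):
            term = np.zeros([f.shape[j] if j in A else 1 for j in range(d)])
            for r in range(len(A) + 1):
                for B in combinations(A, r):
                    sign = (-1) ** (len(A) - len(B))
                    term = term + sign * cond_mean(f, marginals, B)
            terms[A] = term
    return terms


def shapley_from_anova(terms, x):
    """Under independence, each term f_A is shared equally among its |A| inputs."""
    d = len(x)
    phi = np.zeros(d)
    for A, term in terms.items():
        if not A:
            continue  # the constant term E[f] carries no attribution
        idx = tuple(x[j] if j in A else 0 for j in range(d))
        for i in A:
            phi[i] += term[idx] / len(A)
    return phi


# Toy example (assumption): three categorical inputs with 2, 3 and 2 levels.
rng = np.random.default_rng(0)
shape = (2, 3, 2)
f = rng.normal(size=shape)                           # model output on every cell
marginals = [rng.dirichlet(np.ones(k)) for k in shape]  # independent marginals

terms = anova_terms(f, marginals)
x = (1, 2, 0)
phi = shapley_from_anova(terms, x)
```

In this sketch the terms sum back to `f`, each non-constant term averages to zero along any of its own inputs, and the attributions in `phi` sum to `f[x]` minus the mean prediction. The subset enumeration is exponential in the number of inputs, so this is only meant for small tables.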

Paper Structure

This paper contains 43 sections, 9 theorems, 67 equations, 2 figures, 8 tables, 1 algorithm.

Key Result

Theorem 3.2

Any function $f \in L^2$ admits a Fourier expansion of the form [...], where the set of coefficients $\{c_A^{(\mathbf z)}(f)\}$ solves the linear problem formulated in Section sec:linear. Moreover, the collection of functions defined by [...] satisfies the functional ANOVA formulation (eq:def_anova) and the hierarchical orthogonality condition (eq:orthogonality).
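
For orientation, the two conditions named in the theorem are commonly written as below in the generalized functional ANOVA literature. This is a sketch in standard notation, not a reproduction of the paper's eq:def_anova and eq:orthogonality, which may differ in form. The number of inputs $d$, the index sets $A, B$ and the components $f_A$ are notational assumptions.

$$
f(\mathbf{x}) = \sum_{A \subseteq \{1,\dots,d\}} f_A(\mathbf{x}_A),
\qquad
\mathbb{E}\big[ f_A(\mathbf{X}_A)\, f_B(\mathbf{X}_B) \big] = 0 \quad \text{for all } B \subsetneq A .
$$

When the inputs are independent, the second condition strengthens to orthogonality between any two distinct components, which is the classical case.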

Figures (2)

  • Figure 1: ANOVA-Based Shapley Values on the Binarized MNIST Dataset. We apply our framework to an MLP trained on binarized MNIST, encoded as a tabular dataset of shape $(60\,000 , 784)$. The attribution targets the predicted probability of the specific class '3', defined as $f(\mathbf{x}) \coloneqq \mathbb{P}( MLP(\mathbf{x}) = 3 )$. (Left) The original input sample (digit '8'). (Middle) Signed local attributions: red pixels increase the probability of the target class '3', while blue pixels decrease it. (Right) Absolute attribution magnitudes. The behavior is consistent with the shapes of the digits. The pixels on the right side, which overlap with the shape of a '3', contribute positively to the target probability (red). Conversely, the pixels on the left side close the loops and so distinguish an '8' from a '3'; these pixels correctly penalize the target probability (blue), allowing the model to rule out class '3'.
  • Figure 2: Global Feature Importance on the Mushrooms dataset.

Theorems & Definitions (19)

  • Definition 3.1
  • Theorem 3.2
  • Remark 3.3
  • Corollary 3.4
  • Corollary 3.5
  • Remark 3.6
  • Definition 3.7
  • Remark 3.8
  • Definition 3.9
  • Proposition 3.10
  • ...and 9 more