Exact Functional ANOVA Decomposition for Categorical Inputs Models

Baptiste Ferrere; Nicolas Bousquet; Fabrice Gamboa; Jean-Michel Loubes; Joseph Muré

TL;DR

This work bridges functional analysis with an extension of discrete Fourier analysis to derive a closed-form functional ANOVA decomposition for categorical inputs, without any assumption on the input distribution. The decomposition recovers the classical independent case and extends to arbitrary dependence structures, including distributions with non-rectangular support.

Abstract

Functional ANOVA offers a principled framework for interpretability by decomposing a model's prediction into main effects and higher-order interactions. For independent features, this decomposition is well-defined, closely linked to SHAP values, and serves as a cornerstone of additive explainability. However, the lack of an explicit closed-form expression for general dependent distributions has forced practitioners to rely on costly sampling-based approximations. We completely resolve this limitation for categorical inputs. By bridging functional analysis with an extension of discrete Fourier analysis, we derive a closed-form decomposition without any assumption. Our formulation is computationally very efficient. It recovers the classical independent case and extends to arbitrary dependence structures, including distributions with non-rectangular support. Furthermore, leveraging the intrinsic link between SHAP and ANOVA under independence, our framework yields a natural generalization of SHAP values for the general categorical setting.
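To make the independent case mentioned in the abstract concrete, the following is a minimal sketch of the classical functional ANOVA decomposition for two independent categorical inputs, together with the Shapley values it induces. It is not the paper's closed-form Fourier construction: it uses the textbook inclusion-exclusion formula over conditional means, and the toy model `f`, the `marginals`, and all helper names are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

# Hypothetical toy setup: two INDEPENDENT categorical inputs.
levels = [3, 2]  # X1 takes values in {0, 1, 2}, X2 in {0, 1}
marginals = [np.array([0.5, 0.3, 0.2]), np.array([0.6, 0.4])]
d = len(levels)

def f(x):
    # Arbitrary illustrative model with an interaction term (an assumption).
    return 1.0 + 2.0 * x[0] - x[1] + 1.5 * x[0] * x[1]

grid = list(itertools.product(*[range(k) for k in levels]))

def subsets(items):
    # All subsets of `items`, returned as tuples.
    items = tuple(items)
    return [c for r in range(len(items) + 1)
            for c in itertools.combinations(items, r)]

def cond_mean(x, B):
    # E[f(X) | X_B = x_B] under independence: average f over the
    # coordinates outside B, weighted by the product of their marginals.
    total = 0.0
    for z in grid:
        if all(z[j] == x[j] for j in B):
            w = np.prod([marginals[j][z[j]] for j in range(d) if j not in B])
            total += w * f(z)
    return total

def anova_component(x, A):
    # Inclusion-exclusion (Moebius inversion) over subsets B of A.
    return sum((-1) ** (len(A) - len(B)) * cond_mean(x, B)
               for B in subsets(A))

def shapley_from_anova(x):
    # Under independence, each component f_A is shared equally
    # among the features in A.
    phi = np.zeros(d)
    for A in subsets(range(d)):
        for i in A:
            phi[i] += anova_component(x, A) / len(A)
    return phi

x = (2, 1)
components = {A: anova_component(x, A) for A in subsets(range(d))}
print(components)
print(shapley_from_anova(x))
```

The components sum back to `f(x)`, and the attributions sum to `f(x)` minus the mean prediction. Under dependence this inclusion-exclusion construction no longer yields hierarchically orthogonal components, which is the gap the paper addresses.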

Paper Structure (43 sections, 9 theorems, 67 equations, 2 figures, 8 tables, 1 algorithm)

Introduction
Our Contributions.
Background
Notations.
Generalized Functional ANOVA [hooker_2007].
Shapley Values [owen_shapley_2014, owen_shapley_2017].
Main Theoretical Contribution
Closed-Form Representation
Fourier Representation to Functional ANOVA
Linear Problem Formulation
Computing the Decomposition
Scalable decomposition under $r$-sparsity assumption.
Identifiability and Feature Correlation.
Example.
Rank-Based Construction.
...and 28 more sections

Key Result

Theorem 3.2

Any function $f \in L^2$ admits a Fourier expansion whose set of coefficients $\{c_A^{(\mathbf z)}(f)\}$ solves the linear problem formulated in the Linear Problem Formulation section. Moreover, the associated collection of component functions satisfies the functional ANOVA formulation and the hierarchical orthogonality condition.
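The two properties named in the theorem can be checked numerically on a small dependent example. The sketch below is an assumption-laden stand-in, not the paper's Fourier-based solver: for two categorical inputs with a full-support joint distribution, it obtains the main effects by a weighted least-squares projection onto additive functions and takes the interaction as the residual. The joint table `P` and the model values `F` are hypothetical.

```python
import numpy as np

# Hypothetical DEPENDENT joint distribution on {0,1,2} x {0,1}, full support.
P = np.array([[0.30, 0.05],
              [0.10, 0.15],
              [0.05, 0.35]])
# Hypothetical model values f(x1, x2) on the same grid.
F = np.array([[1.0, 0.0],
              [3.0, 3.5],
              [5.0, 7.0]])

n1, n2 = P.shape
f0 = float((P * F).sum())  # constant term E[f(X)]

# One-hot design for the additive model g1(x1) + g2(x2).
rows = [(i, j) for i in range(n1) for j in range(n2)]
D = np.zeros((len(rows), n1 + n2))
for r, (i, j) in enumerate(rows):
    D[r, i] = 1.0
    D[r, n1 + j] = 1.0
w = np.array([P[i, j] for i, j in rows])
y = np.array([F[i, j] for i, j in rows]) - f0

# Weighted least squares: project f - E[f] onto additive functions in L^2(P).
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(D * sw[:, None], y * sw, rcond=None)
g1, g2 = coef[:n1], coef[n1:]

# Center each main effect under its own marginal.
p1, p2 = P.sum(axis=1), P.sum(axis=0)
f1 = g1 - p1 @ g1
f2 = g2 - p2 @ g2

# The interaction is whatever the additive part does not explain.
f12 = F - f0 - f1[:, None] - f2[None, :]

# Hierarchical orthogonality: f12 is orthogonal to every function of x1
# alone and of x2 alone; f1 and f2 are orthogonal to constants.
print((P * f12).sum(axis=1), (P * f12).sum(axis=0))
print(p1 @ f1, p2 @ f2)
# f1 and f2 are NOT required to be mutually orthogonal under dependence.
print((P * np.outer(f1, f2)).sum())
```

The first two printed vectors and the two centered means should vanish up to floating-point error. The last quantity is left unconstrained, since hierarchical orthogonality only relates a component to components indexed by strict subsets.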

Figures (2)

Figure 1: ANOVA-Based Shapley Values on the Binarized MNIST Dataset. We apply our framework to an MLP trained on binarized MNIST, encoded as a tabular dataset of shape $(60\,000, 784)$. The attribution targets the predicted probability of the specific class '3', defined as $f(\mathbf{x}) \coloneqq \mathbb{P}(\mathrm{MLP}(\mathbf{x}) = 3)$. (Left) The original input sample (digit '8'). (Middle) Signed local attributions: red pixels increase the probability of the target class '3', while blue pixels decrease it. (Right) Absolute attribution magnitudes. The observed behavior is consistent with the shapes of the digits. The pixels on the right side, which overlap with the shape of a '3', contribute positively to the target probability (red). Conversely, the pixels on the left side close the loops and are the features that distinguish an '8' from a '3'; these pixels penalize the target probability (blue), allowing the model to rule out class '3'.
Figure 2: Global Feature Importance on the Mushrooms dataset.

Theorems & Definitions (19)

Definition 3.1
Theorem 3.2
Remark 3.3
Corollary 3.4
Corollary 3.5
Remark 3.6
Definition 3.7
Remark 3.8
Definition 3.9
Proposition 3.10
...and 9 more
