Interpretability-by-Design with Accurate Locally Additive Models and Conditional Feature Effects

Vasilis Gkolemis; Loukas Kavouras; Dimitrios Kyriakopoulos; Konstantinos Tsopelas; Dimitrios Rontogiannis; Giuseppe Casalicchio; Theodore Dalamagas; Christos Diou

TL;DR

CALM introduces Conditionally Additive Local Models to bridge GAM interpretability and GA$^2$M accuracy by learning region-specific univariate shape functions for each feature, conditioned on the features it interacts with. A three-step distillation pipeline (reference model, heterogeneity-driven region partitioning, and region-aware backfitting) identifies homogeneous regions where features act additively, while still capturing interactions. The authors prove population-level optimality for Step 3 under fixed regions and formalize interpretability properties, including local contributions, regional sensitivity, and global monotonicity. Empirically, CALM outperforms GAM baselines and matches or exceeds GA$^2$Ms across 25 real-world datasets with far fewer interactions and competitive runtimes, demonstrating a favorable accuracy–interpretability trade-off for practical use.
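To make the region-aware backfitting step concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes squared loss, piecewise-constant (binned) shape functions, and regions supplied by hand rather than learned by Step 2. The names `fit_regional_backfitting` and `region_ids` are hypothetical.

```python
import numpy as np

def fit_regional_backfitting(X, y, region_ids, n_bins=10, n_iters=10):
    """Cyclic backfitting with one binned shape function per (feature, region).

    region_ids[:, i] holds the fixed region index of every row for feature i
    (assumption: regions are given; the paper learns them in Step 2).
    Returns the intercept, the fitted shapes, and the in-sample predictions.
    """
    n, d = X.shape
    intercept = float(y.mean())
    contrib = np.zeros((n, d))  # current contribution of each feature per row
    shapes = {}
    for _ in range(n_iters):
        for i in range(d):
            # Partial residual: target minus everything except feature i.
            partial = y - intercept - (contrib.sum(axis=1) - contrib[:, i])
            for r in np.unique(region_ids[:, i]):
                rows = region_ids[:, i] == r
                xi = X[rows, i]
                edges = np.quantile(xi, np.linspace(0.0, 1.0, n_bins + 1))
                bins = np.clip(np.searchsorted(edges, xi, side="right") - 1,
                               0, n_bins - 1)
                sums = np.bincount(bins, weights=partial[rows], minlength=n_bins)
                counts = np.bincount(bins, minlength=n_bins)
                values = sums / np.maximum(counts, 1)  # per-bin mean residual
                contrib[rows, i] = values[bins]
                shapes[(i, int(r))] = (edges, values)
    return intercept, shapes, intercept + contrib.sum(axis=1)

# Toy data: the effect of x_0 flips sign with x_1, a pure interaction.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2000, 2))
y = X[:, 0] * np.sign(X[:, 1]) + X[:, 1] ** 2

# GAM-like baseline: a single region per feature.
gam_regions = np.zeros(X.shape, dtype=int)
# CALM-like setup: feature 0 gets two regions, split on x_1 > 0.
calm_regions = gam_regions.copy()
calm_regions[:, 0] = (X[:, 1] > 0).astype(int)

_, _, fit_gam = fit_regional_backfitting(X, y, gam_regions)
_, shapes, fit_calm = fit_regional_backfitting(X, y, calm_regions)
mse_gam = float(np.mean((y - fit_gam) ** 2))
mse_calm = float(np.mean((y - fit_calm) ** 2))
```

On this toy task the single-region fit cannot represent the sign flip, so its training error should stay well above that of the two-region fit. The sketch skips the centering of shape functions that a full backfitting procedure would normally apply.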

Abstract

Generalized additive models (GAMs) offer interpretability through independent univariate feature effects but underfit when interactions are present in the data. GA$^2$Ms add selected pairwise interactions, which improves accuracy but sacrifices interpretability and limits model auditing. We propose \emph{Conditionally Additive Local Models} (CALMs), a new model class that balances the interpretability of GAMs with the accuracy of GA$^2$Ms. CALMs allow multiple univariate shape functions per feature, each active in a different region of the input space. These regions are defined independently for each feature as simple logical conditions (thresholds) on the features it interacts with. As a result, effects remain locally additive while varying across subregions to capture interactions. We further propose a principled distillation-based training pipeline that identifies homogeneous regions with limited interactions and fits interpretable shape functions via region-aware backfitting. Experiments on diverse classification and regression tasks show that CALMs consistently outperform GAMs and achieve accuracy comparable to that of GA$^2$Ms. Overall, CALMs offer a compelling trade-off between predictive accuracy and interpretability.
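The model class described above can be illustrated with a short Python sketch. It assumes a prediction of the form intercept plus, for each feature, the shape function of whichever region the input falls in; this form is inferred from the abstract rather than copied from the paper's equations. The function `calm_predict` and the toy regions and shapes are hypothetical.

```python
import numpy as np

def calm_predict(X, intercept, feature_models):
    """Evaluate a CALM-style predictor (illustrative form, not the paper's code).

    feature_models[i] is a list of (condition, shape) pairs for feature i:
      condition(X) -> boolean mask of rows in that region
                      (threshold conditions on the features that i interacts with)
      shape(x_i)   -> univariate effect of feature i inside that region
    The regions of each feature are assumed to partition the input space.
    """
    X = np.asarray(X, dtype=float)
    y = np.full(X.shape[0], intercept, dtype=float)
    for i, regions in enumerate(feature_models):
        for condition, shape in regions:
            mask = condition(X)
            y[mask] += shape(X[mask, i])
    return y

# Hypothetical toy model with d = 2 features.
feature_models = [
    # The effect of x_0 depends on the sign of x_1: two regions, two 1D curves.
    [
        (lambda X: X[:, 1] <= 0.0, lambda x: -x),
        (lambda X: X[:, 1] > 0.0, lambda x: 2.0 * x),
    ],
    # x_1 interacts with no other feature: one region covering all inputs.
    [
        (lambda X: np.ones(X.shape[0], dtype=bool), lambda x: x ** 2),
    ],
]

X = np.array([[1.0, -1.0], [1.0, 1.0]])
pred = calm_predict(X, intercept=0.5, feature_models=feature_models)
print(pred)  # [0.5 3.5]
```

Both rows share the same value of x_0, yet its contribution differs (-1 versus 2) because the rows fall in different regions. Within each region the prediction stays a sum of univariate terms.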

Paper Structure

This paper contains 50 sections, 7 theorems, 68 equations, 3 figures, 15 tables, 4 algorithms.

Introduction
Background and Related Work
CALM: Conditionally Additive Local Model
Model Formulation
Training algorithm for fitting CALM
Step 3: Estimate Shape Functions
Interpretability of a CALM
P1. Local Feature Contribution
P2. Regional Feature Sensitivity
P3. Global Feature Property
Efficiency
Empirical Evaluation
Synthetic example
Case 1.
Case 2.
...and 35 more sections

Key Result

Proposition 3.1

Let $m(\mathbf{x}) = \mathbb{E}[Y \mid \mathbf{X}=\mathbf{x}]$ denote the true regression function. Assume regression with squared loss, fixed partition trees $\{T_i\}_{i=1}^d$ (as learned by Step 2) and $\mathbb{E}[Y^2]<\infty$. Let $\mathcal{H}(\{T_i\})$ denote the fixed-tree CALM class (Appendix

Figures (3)

Figure 1: CALMs use conditional feature effects: every feature effect is expressed by a collection of 1D functions, each associated with a different region of the input space. In the example, the effect of $x_1$ is conditioned on $x_3$ and the effect of $x_2$ on $x_1$, while $x_d$ does not interact with any other feature and thus has a single plot. CALMs are interpretable by design (summarized in $d$ figures of 1D plots) and accurate, as they can model feature interactions. Code available at: https://github.com/givasile/CALM.
Figure 2: CALM plot for $x_1$. Each curve shows the contribution of $x_1$ to $y$ (P1) in a different region of the input space ($x_3 \lessgtr 0$). Vertical lines indicate interaction-induced discontinuities, e.g., at $x_1 \approx -0.4$ there is a positive hidden jump of $[0, 0.37]$ due to the interaction of $x_1$ with $x_2$, which must be considered when assessing regional sensitivity (P2) or global properties (P3).
Figure 3: Explanatory plots for the three synthetic regression tasks: the top row (Task 1) shows a two-region interaction; the middle row (Task 2) shows a four-region interaction; the bottom row (Task 3) shows general interactions.

Theorems & Definitions (13)

Proposition 3.1: Optimality and convergence
Proposition 1.1
proof
Proposition 1.2: Population optimality
proof
Proposition 1.3: Convergence of exact cyclic regional backfitting
proof
Proposition 2.1: Local Feature Contribution
proof
Proposition 2.2: Regional Feature Sensitivity
...and 3 more
