Curve Your Enthusiasm: Concurvity Regularization in Differentiable Generalized Additive Models
Julien Siems, Konstantin Ditschuneit, Winfried Ripken, Alma Lindborg, Maximilian Schambach, Johannes S. Otterbach, Martin Genzel
TL;DR
This work identifies concurvity as a key obstacle to the interpretability of differentiable GAMs and proposes a simple, differentiable regularizer $R_{\perp}$ that penalizes pairwise correlations among the non-linear feature mappings $f_i(X_i)$. By optimizing $\min_{\beta,(f_i)} \mathbb{E}[L(Y,\beta+\sum_i f_i(X_i))] + \lambda R_{\perp}(\{f_i\},\{X_i\})$, the approach decorrelates the additive components, improving interpretability without severely compromising predictive accuracy. Empirical results across toy, time-series, and tabular datasets (including Neural Additive Models and NeuralProphet) show reduced concurvity, more stable feature importances, and clearer component separation, with a moderate accuracy trade-off controlled by the regularization strength $\lambda$. These findings suggest that decorrelation-focused regularization can enhance the reliability and transparency of GAM-based models in safety-critical and regulated settings.
Abstract
Generalized Additive Models (GAMs) have recently experienced a resurgence in popularity due to their interpretability, which arises from expressing the target value as a sum of non-linear transformations of the features. Despite the current enthusiasm for GAMs, their susceptibility to concurvity, i.e., (possibly non-linear) dependencies between the features, has hitherto been largely overlooked. Here, we demonstrate how concurvity can severely impair the interpretability of GAMs and propose a remedy: a conceptually simple, yet effective regularizer that penalizes pairwise correlations of the non-linearly transformed feature variables. This procedure is applicable to any differentiable additive model, such as Neural Additive Models or NeuralProphet, and enhances interpretability by eliminating ambiguities due to self-canceling feature contributions. We validate the effectiveness of our regularizer in experiments on synthetic as well as real-world datasets for time-series and tabular data. Our experiments show that concurvity in GAMs can be reduced without significantly compromising prediction quality, improving interpretability and reducing variance in the feature importances.
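To make the idea of penalizing pairwise correlations concrete, the following is a minimal NumPy sketch. It rests on assumptions not stated in the text above: the penalty is taken to be the mean absolute Pearson correlation over all pairs of component outputs $f_i(X_i)$, the loss $L$ is taken to be the mean squared error, and the names `concurvity_penalty`, `regularized_loss`, `lam`, and `eps` are hypothetical.

```python
import numpy as np


def concurvity_penalty(components, eps=1e-12):
    """Mean absolute pairwise Pearson correlation of the component outputs.

    components: array of shape (n_samples, p); column i holds f_i(X_i).
    Assumption: the pairwise correlations are aggregated by a plain average.
    """
    components = np.asarray(components, dtype=float)
    n, p = components.shape
    if p < 2:
        return 0.0  # no pairs to correlate
    # Standardize each column; eps guards against constant components.
    centered = components - components.mean(axis=0, keepdims=True)
    std = np.sqrt((centered ** 2).mean(axis=0))
    normalized = centered / (std + eps)
    # Empirical correlation matrix of shape (p, p).
    corr = normalized.T @ normalized / n
    # Average |corr| over the strict upper triangle (each pair counted once).
    upper = np.triu_indices(p, k=1)
    return float(np.abs(corr[upper]).mean())


def regularized_loss(y, beta, components, lam):
    """Squared-error loss plus lam times the concurvity penalty (a sketch)."""
    prediction = beta + components.sum(axis=1)
    mse = np.mean((y - prediction) ** 2)
    return float(mse + lam * concurvity_penalty(components))


# Two perfectly anti-correlated components cancel each other in the sum,
# so the prediction hides them, but the penalty flags them.
x = np.array([1.0, -1.0, 1.0, -1.0])
z = np.array([1.0, 1.0, -1.0, -1.0])
self_canceling = np.stack([x, -x], axis=1)
uncorrelated = np.stack([x, z], axis=1)
```

Here `concurvity_penalty(self_canceling)` is close to 1, while `concurvity_penalty(uncorrelated)` is 0, which illustrates the self-canceling ambiguity the regularizer is meant to discourage. In an actual differentiable GAM, the same computation would presumably be written with the tensor operations of an automatic-differentiation framework so that gradients reach the parameters of each $f_i$.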
