Generalizability vs. Counterfactual Explainability Trade-Off

Fabiano Veglianti; Flavio Giorgi; Fabrizio Silvestri; Gabriele Tolomei

TL;DR

This work introduces the $\varepsilon$-valid counterfactual probability, $p_i^{\varepsilon}$, which quantifies how easily a perturbation within the $\varepsilon$-neighborhood of a data point can flip the model's prediction. The authors link $p_i^{\varepsilon}$ to the geometry of the decision boundary, deriving exact formulas for linear models and a local approximation for non-linear models, and show that the average $\bar{p}^{\varepsilon}$ increases as margins shrink, i.e., with overfitting. They argue that $\bar{p}^{\varepsilon}$ serves as a practical proxy for model generalizability, with higher values signalling overfitting. This is supported by experiments on the Water Potability and Air Quality datasets using logistic regression and MLPs, where unregularized models exhibit higher $\bar{p}^{\varepsilon}$. The work highlights a fundamental trade-off: models that generalize better have harder-to-find counterfactuals, while overfitted models yield more accessible counterfactual explanations. This offers a quantitative lens for balancing explainability and performance.
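To make the linear case concrete, the block below gives one closed form that follows from standard hyperspherical-cap geometry. It is a sketch under assumptions not stated in this summary: perturbations are drawn uniformly from the $\ell_2$ ball of radius $\varepsilon$ in $\mathbb{R}^n$, and the classifier is $\operatorname{sign}(w^\top x + b)$. The symbols $d_i$, $n$, and $I_z(a,b)$ (the regularized incomplete beta function) are introduced here for illustration, and the paper's own formula and notation may differ.

```latex
% Distance of x_i from the linear decision boundary (illustrative notation)
d_i = \frac{\lvert w^\top x_i + b \rvert}{\lVert w \rVert_2}

% Fraction of the eps-ball lying on the other side of the hyperplane,
% assuming uniform sampling in the L2 ball of radius eps in R^n
p_i^{\varepsilon} =
\begin{cases}
  \dfrac{1}{2}\, I_{\,1 - (d_i/\varepsilon)^2}\!\left(\dfrac{n+1}{2}, \dfrac{1}{2}\right), & d_i < \varepsilon, \\[2ex]
  0, & d_i \ge \varepsilon .
\end{cases}
```

Under these assumptions $p_i^{\varepsilon}$ equals $1/2$ for a point lying on the boundary and decreases monotonically to $0$ as $d_i$ approaches $\varepsilon$, which matches the statement that smaller margins give a larger $\bar{p}^{\varepsilon}$.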

Abstract

In this work, we investigate the relationship between model generalization and counterfactual explainability in supervised learning. We introduce the notion of $\varepsilon$-valid counterfactual probability ($\varepsilon$-VCP) -- the probability of finding perturbations of a data point within its $\varepsilon$-neighborhood that result in a label change. We provide a theoretical analysis of $\varepsilon$-VCP in relation to the geometry of the model's decision boundary, showing that $\varepsilon$-VCP tends to increase with model overfitting. Our findings establish a rigorous connection between poor generalization and the ease of counterfactual generation, revealing an inherent trade-off between generalization and counterfactual explainability. Empirical results validate our theory, suggesting $\varepsilon$-VCP as a practical proxy for quantitatively characterizing overfitting.
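The definition of $\varepsilon$-VCP suggests a simple Monte Carlo estimate: sample perturbations in the $\varepsilon$-neighborhood of a point and count how often the predicted label changes. The sketch below is one possible reading, not the authors' implementation. It assumes uniform sampling in an $\ell_2$ ball, and the names `sample_in_ball`, `estimate_vcp`, and `linear_predict` are hypothetical, as is the toy two-dimensional classifier.

```python
import numpy as np


def sample_in_ball(center, eps, n_samples, rng):
    """Draw points uniformly from the L2 ball of radius eps around center."""
    dim = center.shape[0]
    # Uniform directions: normalize standard Gaussian vectors.
    directions = rng.standard_normal((n_samples, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Radii with density proportional to r^(dim-1) give a uniform ball.
    radii = eps * rng.random(n_samples) ** (1.0 / dim)
    return center + directions * radii[:, None]


def estimate_vcp(predict, x, eps, n_samples=20_000, seed=0):
    """Estimate the share of the eps-ball around x whose label differs from x's.

    Assumption: the neighborhood is an L2 ball sampled uniformly.
    `predict` maps an array of shape (m, dim) to m labels.
    """
    rng = np.random.default_rng(seed)
    original_label = predict(x[None, :])[0]
    neighbours = sample_in_ball(x, eps, n_samples, rng)
    return float(np.mean(predict(neighbours) != original_label))


# Hypothetical linear classifier with decision boundary x_1 = 0.
w = np.array([1.0, 0.0])
b = 0.0


def linear_predict(X):
    return (X @ w + b > 0).astype(int)


# One point close to the boundary, one farther away than eps.
near = estimate_vcp(linear_predict, np.array([0.1, 0.0]), eps=1.0)
far = estimate_vcp(linear_predict, np.array([2.0, 0.0]), eps=1.0)
print(f"near boundary: {near:.3f}, far from boundary: {far:.3f}")
```

Averaging `estimate_vcp` over a dataset gives an estimate of the mean $\varepsilon$-VCP for a given model. In this toy setup the point near the boundary receives a much larger estimate than the point whose distance to the boundary exceeds $\varepsilon$, for which no sampled perturbation can change the label.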
