Countering Overfitting with Counterfactual Examples

Flavio Giorgi, Fabiano Veglianti, Fabrizio Silvestri, Gabriele Tolomei

TL;DR

Overfitting degrades generalization. To counter it, the authors propose CF-Reg, a regularizer that enforces a margin between each training example and its counterfactual. Grounded in margin theory, CF-Reg is compatible with any differentiable counterfactual generator and is optimized jointly with the empirical risk. Across multiple datasets and architectures, CF-Reg often generalizes better than traditional regularizers and adversarial training, and it produces counterfactual explanations as a by-product. The work highlights a trade-off between generalization and explainability and points to efficiency improvements as a direction for practical deployment.

Abstract

Overfitting is a well-known issue in machine learning that occurs when a model struggles to generalize its predictions to new, unseen data beyond the scope of its training set. Traditional techniques to mitigate overfitting include early stopping, data augmentation, and regularization. In this work, we demonstrate that the degree of overfitting of a trained model is correlated with the ability to generate counterfactual examples. The higher the overfitting, the easier it will be to find a valid counterfactual example for a randomly chosen input data point. Therefore, we introduce CF-Reg, a novel regularization term in the training loss that controls overfitting by ensuring enough margin between each instance and its corresponding counterfactual. Experiments conducted across multiple datasets and models show that our counterfactual regularizer generally outperforms existing regularization techniques.
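
To make the idea of "enough margin between each instance and its counterfactual" concrete, here is a minimal sketch for a linear classifier, where the nearest counterfactual under the L2 norm lies on the decision hyperplane and its distance has a closed form. The hinge-shaped penalty, the `target_margin` parameter, and all function names are illustrative assumptions; this is not the paper's exact formulation of CF-Reg. Only the weight `alpha` corresponds to a symbol the paper names.

```python
import math


def counterfactual_distance(w, b, x):
    """L2 distance from x to the hyperplane w.x + b = 0.

    For a linear classifier this is the distance to the nearest point
    that receives the opposite label, i.e. the closest counterfactual.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm


def logistic_loss(w, b, x, y):
    """Numerically stable logistic loss for a label y in {0, 1}."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = score if y == 1 else -score
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def regularized_loss(w, b, data, alpha=0.1, target_margin=1.0):
    """Empirical risk plus an ASSUMED hinge penalty on small counterfactual distances.

    alpha weights the regularizer; target_margin is a hypothetical
    threshold below which an instance is penalized.
    """
    n = len(data)
    risk = sum(logistic_loss(w, b, x, y) for x, y in data) / n
    penalty = sum(
        max(0.0, target_margin - counterfactual_distance(w, b, x))
        for x, _ in data
    ) / n
    return risk + alpha * penalty


if __name__ == "__main__":
    toy_data = [([0.5, 0.0], 1), ([-3.0, 0.0], 0)]
    print(regularized_loss([1.0, 0.0], 0.0, toy_data, alpha=1.0))
```

In this toy setting only the first point, which sits closer to the boundary than `target_margin`, contributes to the penalty. For non-linear models the distance has no closed form, which is where a differentiable counterfactual generator would come in.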

Paper Structure

This paper contains 27 sections, 11 equations, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Distance between an input data point ($\boldsymbol{x}$) and its counterfactual example ($\widetilde{\boldsymbol{x}}$): On average, this may be higher for a well-trained model (a) than for an overfitted model (b).
  • Figure 2: Evolution of the empirical distribution of margin distances for training data points in the Water dataset across different training epochs of a logistic regression model. As the training progresses, the average margin distance decreases.
  • Figure 3: The $\varepsilon$-valid counterfactual probability ($\varepsilon$-VCP) for a sample $\boldsymbol{x} \in \mathbb{R}^2$ can be estimated as the fraction of the area of the circle centered at $\boldsymbol{x}$ with radius $\varepsilon$ that falls behind the decision boundary (in red).
  • Figure 4: The mean $\varepsilon$-VCP ($y$-axis) vs. the model's training accuracy ($x$-axis). Plain$_{\varepsilon\text{-VCP}}$ is the "vanilla" MLP, while Regularized$_{\varepsilon\text{-VCP}}$ is the same MLP but with a dropout rate of $0.5$.
  • Figure 5: The test accuracy of an MLP trained on the Water dataset, evaluated while varying the weight of our counterfactual regularizer ($\alpha$) for different values of $\beta$.
  • ...and 2 more figures
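
The area-ratio estimate described in the Figure 3 caption can be approximated by Monte Carlo sampling. The sketch below draws points uniformly from the disk of radius $\varepsilon$ around a 2D sample and reports the fraction whose predicted label differs from that of the sample. Reading "falls behind the decision boundary" as "receives a different prediction" is an assumption, as are the function names, the sample count, and the toy classifier `linear_predict`; none of these come from the paper.

```python
import math
import random


def estimate_eps_vcp(predict, x, eps, n_samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of the eps-disk around x
    where `predict` returns a label different from predict(x)."""
    rng = random.Random(seed)
    original = predict(x)
    flipped = 0
    for _ in range(n_samples):
        # sqrt on the radius makes the samples uniform over the disk area
        radius = eps * math.sqrt(rng.random())
        angle = 2.0 * math.pi * rng.random()
        candidate = (
            x[0] + radius * math.cos(angle),
            x[1] + radius * math.sin(angle),
        )
        if predict(candidate) != original:
            flipped += 1
    return flipped / n_samples


def linear_predict(point):
    """Hypothetical toy classifier: label 1 on the right of the line x0 = 0."""
    return 1 if point[0] >= 0.0 else 0


if __name__ == "__main__":
    print(estimate_eps_vcp(linear_predict, (0.5, 0.0), eps=1.0))
```

For this toy boundary the exact value is the area of a circular segment divided by the disk area, $(\arccos(d/\varepsilon) - (d/\varepsilon)\sqrt{1 - (d/\varepsilon)^2})/\pi$ with $d$ the distance from the sample to the line, which gives a quick sanity check on the estimator.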