PAC-Bayesian Bounds on Constrained f-Entropic Risk Measures

Hind Atbir, Farah Cherfaoui, Guillaume Metzler, Emilie Morvant, Paul Viallard

TL;DR

The paper tackles generalization under subgroup imbalances by introducing constrained $f$-entropic risk measures, which extend CVaR via $f$-divergences to control distributional shifts. It develops both classical and disintegrated PAC-Bayesian bounds for these risks in two regimes, and shows they can be minimized with a self-bounding algorithm that provides subgroup-level guarantees. The theoretical bounds are instantiated with concrete deviation functions and accelerated by a practical optimization setup, while experiments on imbalanced OpenML datasets illustrate improved subgroup balance and competitive performance. This work advances reliable, subgroup-aware generalization guarantees and offers a practical route to bound-aware model learning.

Abstract

PAC generalization bounds on the risk, when expressed in terms of the expected loss, are often insufficient to capture imbalances between subgroups in the data. To overcome this limitation, we introduce a new family of risk measures, called constrained f-entropic risk measures, which enable finer control over distributional shifts and subgroup imbalances via f-divergences, and include the Conditional Value at Risk (CVaR), a well-known risk measure. We derive both classical and disintegrated PAC-Bayesian generalization bounds for this family of risks, providing the first disintegrated PAC-Bayesian guarantees beyond standard risks. Building on this theory, we design a self-bounding algorithm that minimizes our bounds directly, yielding models with guarantees at the subgroup level. Finally, we empirically demonstrate the usefulness of our approach.
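To make the CVaR special case mentioned above concrete, here is a minimal sketch of the standard CVaR of a discrete loss distribution in its dual form: the largest reweighted average loss over distributions whose density ratio to the reference weights is at most $1/\alpha$. Applying it to per-subgroup average losses is an illustrative assumption, and the function name `cvar` and its arguments are hypothetical; this is not the paper's definition of constrained f-entropic risk measures.

```python
def cvar(losses, weights, alpha):
    """Standard CVaR at level alpha of a discrete loss distribution.

    Solves  max_rho  sum_i rho_i * losses[i]
    subject to 0 <= rho_i <= pi_i / alpha and sum_i rho_i = 1,
    where pi is `weights` normalized to sum to one.
    Hypothetical helper for illustration only.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    total = float(sum(weights))
    remaining = 1.0  # probability mass still to be assigned
    value = 0.0
    # Greedy allocation: give the largest losses as much mass as the
    # density-ratio constraint allows, until all mass is used.
    for loss, w in sorted(zip(losses, weights), reverse=True):
        mass = min(w / total / alpha, remaining)
        value += mass * loss
        remaining -= mass
        if remaining <= 0.0:
            break
    return value


# Assumed example: three subgroups with average losses and proportions.
group_losses = [0.1, 0.5, 0.9]
group_props = [0.7, 0.2, 0.1]
for a in (1.0, 0.3, 0.1):
    print(a, cvar(group_losses, group_props, a))
```

With $\alpha = 1$ the value is the ordinary weighted average loss; as $\alpha$ shrinks, the mass concentrates on the worst-performing subgroups, which is why small $\alpha$ emphasizes subgroup imbalance.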

Paper Structure

This paper contains 40 sections, 17 theorems, 82 equations, 10 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

For any distribution $D$ over $\mathcal{X}{\times}\mathcal{Y}$, for any prior $P \! \in \! \mathcal{M}(\mathcal{H})$, for any loss $\ell : \mathcal{Y} \!\times \mathcal{Y} \!\rightarrow [0,1]$, for any $\alpha \!\in \!(0,1]$, for any $\delta\!\in\!(0,1]$, with probability at least $1 {-} \delta$ ov

Figures (10)

  • Figure 1: Bound values (in color), test risk $\mathcal{R}_{\mathcal{T}}$ (in grey), and F-score on $\mathcal{T}$ (with their standard deviations) for \ref{thm:bound-for-one-example}, \ref{cor:mca-dis}, and \ref{thm:mhammedi}, as a function of $\alpha$ (on the $x$-axis). The $y$-axis shows the values of the bounds and test risks. The highest F-score for each dataset is highlighted with a red frame.
  • Figure 2: Class-wise error rates and their standard deviations on the set $\mathcal{T}$ ($y$-axis) as a function of the parameter $\alpha$ ($x$-axis) with \ref{cor:mca-dis}. Each class is shown with a different marker and color.
  • Figure 3: 2-hidden-layer MLP with CVaR. Bound values (in color), test risk $\mathcal{R}_{\mathcal{T}}$ (in grey), and F-score on $\mathcal{T}$ (with their standard deviations) for \ref{thm:bound-for-one-example}, \ref{cor:mca-dis}, and \ref{thm:mhammedi}, as a function of $\alpha$ (on the $x$-axis). The $y$-axis shows the values of the bounds and test risks. The highest F-score for each dataset is highlighted with a red frame.
  • Figure 4: Perceptron with CVaR. Bound values (in color), test risk $\mathcal{R}_{\mathcal{T}}$ (in grey), and F-score on $\mathcal{T}$ (with their standard deviations) for \ref{thm:bound-for-one-example}, \ref{cor:mca-dis}, and \ref{thm:mhammedi}, as a function of $\alpha$ (on the $x$-axis). The $y$-axis shows the values of the bounds and test risks. The highest F-score for each dataset is highlighted with a red frame.
  • Figure 5: 2-hidden-layer MLP with CVaR. Class-wise error rates and their standard deviations on the set $\mathcal{T}$ ($y$-axis) as a function of the parameter $\alpha$ ($x$-axis) with \ref{cor:mca-dis}. Each class is shown with a different marker and color.
  • ...and 5 more figures

Theorems & Definitions (38)

  • Definition 1
  • Theorem 1: PAC-Bayesian Bound on CVaR [mhammedi2020pac]
  • Definition 2
  • Theorem 2
  • proof
  • Corollary 1
  • proof
  • Theorem 3
  • proof
  • Theorem 3
  • ...and 28 more