Lower Bounds on Adversarial Robustness for Multiclass Classification with General Loss Functions

Camilo Andrés García Trillos, Nicolás García Trillos

TL;DR

This work establishes sharp lower bounds on adversarial risk for multiclass classification under general losses by formulating a learner-adversary minmax problem and deriving dual and generalized-barycenter reformulations. The main contributions are explicit dual representations and optimal robust classifiers for the cross-entropy, $\alpha$-logarithmic, and quadratic losses, plus a unifying barycenter view with KL or Tsallis-entropy penalties that links adversarial robustness to $\alpha$-fair packing. The framework reveals connections to optimal transport and generalized barycenters, which enable computational advantages and flexible relaxations. Experiments on synthetic data and MNIST yield tighter lower bounds than 0-1 baselines and support the practicality and scalability of the dual/barycenter approach for multiclass adversarial robustness.

Abstract

We consider adversarially robust classification in a multiclass setting under arbitrary loss functions and derive dual and barycentric reformulations of the corresponding learner-agnostic robust risk minimization problem. We provide explicit characterizations for important cases such as the cross-entropy loss, loss functions with a power form, and the quadratic loss, thereby extending available results for the 0-1 loss. These reformulations enable efficient computation of sharp lower bounds for adversarial risks and facilitate the design of robust classifiers beyond the 0-1 loss setting. Our paper uncovers connections between adversarial robustness, $\alpha$-fair packing problems, and generalized barycenter problems for arbitrary positive measures where Kullback-Leibler and Tsallis entropies are used as penalties. Our theoretical results are accompanied by illustrative numerical experiments in which we obtain tighter lower bounds for adversarial risks with the cross-entropy loss function.

Paper Structure

This paper contains 23 sections, 17 theorems, 155 equations, 4 figures.

Key Result

Theorem 3

Under Assumption assump:LossFunction on the loss function $\ell$ and Assumption assump:Cost on the cost function $c$, problem eqn:ATGeneralLoss with $\mathcal{F} = \mathcal{F}_{\mathrm{all}}$ has the same value as the problem when $\mathcal{G}$ is taken to be $C_b(\mathcal{X})$, the space of bounded continuous functions on $\mathcal{X}$. Here and in the sequel, we use $\mathrm{spt}(\mu_A)$ to den

Figures (4)

  • Figure 1: Left: Plot of $\log_\alpha$ when $\alpha \in [0,1)$. The function cuts the vertical axis at the value $-\frac{1}{1-\alpha}$ and diverges to $\infty$ as the argument of the function gets larger. Right: Plot of $\log_{\alpha}$ for $\alpha>1$. The function has a horizontal asymptote at $-\frac{1}{1-\alpha}$ and a vertical one at $0$. For both cases, and regardless of the value of $\alpha$, the function $\log_\alpha$ cuts the horizontal axis at the value $1$.
  • Figure 2: Left: Position of masses for initial measures $(\mu_i)_{i=0,1,2}$; Right: Adversarial risk for different $\alpha$. As expected, plots are monotonic with respect to the adversarial budget, and converge to the risk of full confusion between labels. Notice, also, that plots are monotonic in $\alpha$ for a fixed budget.
  • Figure 3: Top left: original data points. Remaining subplots: optimal classifier for each group in the case $\varepsilon =1,\alpha =1$ (i.e. cross-entropy). The value is represented in terms of opaqueness of the interior (higher value, higher opaqueness). The original group is represented by the edge color. Arrows highlight significant differences ($>0.1$) with optimal classifier with same adversarial budget but $\alpha=0$ (0-1 loss). The direction of the arrow indicates the sign of this difference.
  • Figure 4: Adversarial risk as a function of adversarial budget for the MNIST test
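
The qualitative features of $\log_\alpha$ listed in the caption of Figure 1 can be checked numerically. The sketch below assumes the standard $\alpha$-logarithm, $\log_\alpha(x) = \frac{x^{1-\alpha}-1}{1-\alpha}$ for $\alpha \neq 1$ with $\log_1 = \log$. This summary does not state the definition, so the formula is an assumption, chosen because it matches every property in the caption. The function name `log_alpha` is illustrative and not taken from the paper.

```python
import math


def log_alpha(x: float, alpha: float) -> float:
    """Assumed alpha-logarithm: (x**(1 - alpha) - 1) / (1 - alpha).

    Falls back to the natural logarithm at alpha == 1. This definition is an
    assumption consistent with the Figure 1 caption, not a quote from the paper.
    """
    if x < 0:
        raise ValueError("log_alpha is only defined here for x >= 0")
    if x == 0.0 and alpha >= 1.0:
        # Vertical asymptote at 0 when alpha >= 1.
        return float("-inf")
    if alpha == 1.0:
        return math.log(x)
    return (x ** (1.0 - alpha) - 1.0) / (1.0 - alpha)


# Properties stated in the caption, checked for sample values of alpha.
for a in (0.0, 0.5, 2.0, 3.0):
    # The curve crosses the horizontal axis at x = 1 for every alpha.
    assert log_alpha(1.0, a) == 0.0

for a in (0.0, 0.5):
    # For alpha in [0, 1): value at x = 0 is -1 / (1 - alpha).
    assert math.isclose(log_alpha(0.0, a), -1.0 / (1.0 - a))

for a in (2.0, 3.0):
    # For alpha > 1: horizontal asymptote at -1 / (1 - alpha) as x grows.
    assert math.isclose(log_alpha(1e12, a), -1.0 / (1.0 - a), rel_tol=1e-9)
```

Under this assumed definition, $\alpha = 0$ gives $\log_0(x) = x - 1$ and $\alpha = 1$ gives the natural logarithm, which agrees with the caption of Figure 3 pairing $\alpha = 0$ with the 0-1 loss and $\alpha = 1$ with cross-entropy.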

Theorems & Definitions (49)

  • Theorem 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Theorem 7
  • Corollary 8: Form of optimal classifier for the cross-entropy loss
  • Remark 9
  • Corollary 10: Barycenter formulation for the cross-entropy loss
  • Corollary 11: Cross-entropy loss with $0$-$\infty$ cost
  • Remark 12
  • ...and 39 more