Expressive Losses for Verified Robustness via Convex Combinations

Alessandro De Palma; Rudy Bunel; Krishnamurthy Dvijotham; M. Pawan Kumar; Robert Stanforth; Alessio Lomuscio

TL;DR

This work tackles the gap between empirical adversarial robustness and formal verifiability by introducing the concept of loss expressivity: a family of losses parameterized by $\alpha \in [0,1]$ that interpolates between the adversarial loss and a verifiable loss. It shows that simple convex-combination instantiations (CC-IBP, MTL-IBP, Exp-IBP) can achieve state-of-the-art robustness–accuracy trade-offs across multiple vision benchmarks, supporting the claim that expressivity is the key driver of performance. The authors connect expressivity to existing methods like SABR and provide extensive experiments demonstrating the nuanced role of the over-approximation coefficient, including when better worst-case approximations do not guarantee better results. Code and pseudo-code are released to enable reproducibility and further exploration of expressive losses.

Abstract

In order to train networks for verified adversarial robustness, it is common to over-approximate the worst-case loss over perturbation regions, resulting in networks that attain verifiability at the expense of standard performance. As shown in recent work, better trade-offs between accuracy and robustness can be obtained by carefully coupling adversarial training with over-approximations. We hypothesize that the expressivity of a loss function, which we formalize as the ability to span a range of trade-offs between lower and upper bounds to the worst-case loss through a single parameter (the over-approximation coefficient), is key to attaining state-of-the-art performance. To support our hypothesis, we show that trivial expressive losses, obtained via convex combinations between adversarial attacks and IBP bounds, yield state-of-the-art results across a variety of settings in spite of their conceptual simplicity. We provide a detailed analysis of the relationship between the over-approximation coefficient and performance profiles across different expressive losses, showing that, while expressivity is essential, better approximations of the worst-case loss are not necessarily linked to superior robustness-accuracy trade-offs.
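
As a rough illustration of the idea described above, the sketch below interpolates between a lower bound to the worst-case loss (an adversarial-attack loss) and an upper bound (an IBP loss) through a single coefficient `alpha`. The function names and the toy loss values are hypothetical, and the two combinations shown (a plain convex combination and a convex combination in log space) are assumptions made for illustration; this summary does not reproduce the paper's exact definitions of CC-IBP, MTL-IBP and Exp-IBP.

```python
import math


def convex_combination_loss(adv_loss: float, ibp_loss: float, alpha: float) -> float:
    """Plain convex combination of a lower bound (adv_loss) and an upper bound (ibp_loss).

    alpha = 0 recovers the adversarial loss, alpha = 1 recovers the IBP loss.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * adv_loss + alpha * ibp_loss


def log_space_combination_loss(adv_loss: float, ibp_loss: float, alpha: float) -> float:
    """Convex combination taken in log space (a weighted geometric mean).

    Assumes both loss values are strictly positive.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if adv_loss <= 0.0 or ibp_loss <= 0.0:
        raise ValueError("loss values must be strictly positive")
    return math.exp((1.0 - alpha) * math.log(adv_loss) + alpha * math.log(ibp_loss))


# Toy values (hypothetical): the adversarial loss never exceeds the IBP loss.
adv, ibp = 0.5, 2.0
for a in (0.0, 0.25, 0.5, 1.0):
    print(a, convex_combination_loss(adv, ibp, a), log_space_combination_loss(adv, ibp, a))
```

In both variants the combined value stays between the two bounds and moves towards the IBP loss as `alpha` grows, which is the single-parameter trade-off the abstract refers to.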

Paper Structure (51 sections, 11 theorems, 25 equations, 4 figures, 11 tables, 1 algorithm)


Introduction
Background
Adversarial Training
Neural Network Verification
Training for Verified Robustness
Loss Expressivity for Verified Training
Expressivity through Convex Combinations
CC-IBP
MTL-IBP
Exp-IBP
Related Work
Experimental Evaluation
Comparison with Literature Results
Sensitivity to Over-approximation Coefficient
Branch-and-Bound Loss and Approximations
...and 36 more sections

Key Result

Proposition 4.0

If $\mathcal{L}(\cdot, y)$ is continuous with respect to its first argument, the parametrized loss $\mathcal{L}_{\alpha, \text{CC}} (\bm{\theta}, \mathbf{x}, y)$ is expressive according to Definition 3.1.
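
For orientation, the property at stake can be paraphrased from the abstract as follows. This is a sketch rather than the verbatim statement of the definition: the symbols $\mathcal{L}_{\text{adv}}$ (a lower bound to the worst-case loss, obtained from an adversarial attack) and $\mathcal{L}_{\text{ver}}$ (an upper bound, obtained from an over-approximation such as IBP) are notation assumed here.

```latex
% Paraphrase under assumed notation; see the paper for the exact definition.
\begin{align}
  \mathcal{L}_{\text{adv}}(\bm{\theta}, \mathbf{x}, y)
    \;\le\; \mathcal{L}_{\alpha}(\bm{\theta}, \mathbf{x}, y)
    \;\le\; \mathcal{L}_{\text{ver}}(\bm{\theta}, \mathbf{x}, y)
    \quad &\text{for all } \alpha \in [0, 1], \\
  \mathcal{L}_{0}(\bm{\theta}, \mathbf{x}, y) = \mathcal{L}_{\text{adv}}(\bm{\theta}, \mathbf{x}, y),
    \qquad
  \mathcal{L}_{1}(\bm{\theta}, \mathbf{x}, y) &= \mathcal{L}_{\text{ver}}(\bm{\theta}, \mathbf{x}, y),
\end{align}
% with \mathcal{L}_{\alpha} varying continuously and monotonically in \alpha,
% so that every trade-off between the two bounds is reachable.
```

Under this reading, the continuity assumption on $\mathcal{L}(\cdot, y)$ in the proposition is what would let the loss sweep continuously between the two end points as $\alpha$ goes from $0$ to $1$.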

Figures (4)

Figure 1: Sensitivity of CC-IBP, MTL-IBP and Exp-IBP to the convex combination coefficient $\alpha$. We report standard, adversarial and verified robust accuracies (with different verifiers) under $\ell_\infty$ perturbations on the first $1000$ images of the CIFAR-10 test set. The legend in plot \ref{fig:alpha-sensitivity-mtlibp-2} applies to all sub-figures.
Figure 2: Relationship between the over-approximation coefficient $\alpha$ and the standard loss, the branch-and-bound loss, its approximation error and RMSE (relative to the expressive loss employed during training) on a holdout validation set of CIFAR-10. While the standard loss is computed on the entire validation set, branch-and-bound-related statistics are limited to the first $100$ images. The dashed vertical line denotes the $\alpha$ value minimizing the sum of the standard and branch-and-bound losses. The legend in plot \ref{fig:2-255-tuning-mtlibp} applies to all sub-figures.
Figure 3: IBP ReLU over-approximation, with $\hat{x} \in [\hat{l}, \hat{u}]$ for the considered input domain.
Figure 4: Loss values, computed on the full CIFAR-10 test set, for the models from Figure 1. The CC-IBP, MTL-IBP and Exp-IBP losses (see the section "Expressivity through Convex Combinations") are computed using the $\alpha$ value employed for training, denoted on the $x$ axis. The legend in plot \ref{fig:alpha-sensitivity-loss-mtlibp-2} applies to all sub-figures.
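
The interval over-approximation that Figure 3 depicts for a single ReLU can be sketched in a few lines. The code below is a minimal, generic illustration of interval bound propagation (IBP) through one linear layer followed by a ReLU, written in pure Python; the function names, weights and input box are hypothetical and are not taken from the paper's released code.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def linear_bounds(W: Matrix, b: Vector, lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """Propagate the box [lower, upper] through x -> W x + b.

    For each output, positive weights pick the matching input bound and
    negative weights pick the opposite one.
    """
    out_lower, out_upper = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so pre-activation bounds [l, u] map to [max(0, l), max(0, u)]."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Hypothetical example: input x = (1, -1) with an l_inf ball of radius 0.5.
x, eps = [1.0, -1.0], 0.5
x_lower = [v - eps for v in x]
x_upper = [v + eps for v in x]

W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]

pre_lower, pre_upper = linear_bounds(W, b, x_lower, x_upper)
post_lower, post_upper = relu_bounds(pre_lower, pre_upper)
print(pre_lower, pre_upper)    # pre-activation bounds
print(post_lower, post_upper)  # post-activation bounds
```

Repeating these two steps layer by layer yields bounds on the network output; a loss evaluated on the worst case of those output bounds is the kind of upper bound that the expressive losses combine with an adversarial loss.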

Theorems & Definitions (20)

Definition 3.1
Proposition 4.0
Proposition 4.0
Proposition 4.0
Proposition B.0
proof
Proposition B.0
proof
Proposition B.0
proof
...and 10 more
