
Certified Robust Accuracy of Neural Networks Are Bounded due to Bayes Errors

Ruihan Zhang, Jun Sun

TL;DR

The paper asks whether certified robust accuracy has a fundamental limit that empirical optimization cannot overcome. By framing robustness through Bayes error and showing that robust training effectively convolves the data distribution with a vicinity $v$, it proves that Bayes error can only increase ($\beta_{D'} \ge \beta_D$) and derives an irreducible robustness bound $1 - \zeta^\sharp_D$ that depends on the data distribution and the vicinity. The authors validate the theory with experiments on Moons, Chan, FashionMNIST, and CIFAR-10, demonstrating that the theoretical upper bound lies above the certified robust accuracies achieved by current methods and that the bound tightens as the vicinity grows. These results imply a distributional component to the robustness-accuracy trade-off and motivate rethinking robustness objectives and evaluation strategies in practice.

Abstract

Adversarial examples pose a security threat to many critical systems built on neural networks. While certified training improves robustness, it also decreases accuracy noticeably. Despite various proposals for addressing this issue, the significant accuracy drop remains. More importantly, it is not clear whether there is a fundamental limit on achieving robustness whilst maintaining accuracy. In this work, we offer a novel perspective based on Bayes errors. By applying Bayes error to robustness analysis, we investigate the limit of certified robust accuracy, taking into account data distribution uncertainties. We first show that the accuracy inevitably decreases in the pursuit of robustness due to the changed Bayes error in the altered data distribution. Subsequently, we establish an upper bound for certified robust accuracy, considering the distribution of individual classes and their boundaries. Our theoretical results are empirically evaluated on real-world datasets and are shown to be consistent with the limited success of existing certified training results. For example, for CIFAR10, our analysis results in an upper bound (of certified robust accuracy) of 67.49%, while existing approaches have only been able to increase it from 53.89% in 2017 to 62.84% in 2023.

Paper Structure

This paper contains 21 sections, 6 theorems, 42 equations, 8 figures, 1 table, and 2 algorithms.

Key Result

theorem 1

Given a distribution $D$ for classification, optimising for higher certified robustness does not optimise classifiers to fit $D$. Rather, it optimises them towards $D*v$, i.e., the convolution of $D$ with the vicinity function $v(\bm{x})$.
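The effect described in theorem 1 can be illustrated numerically in one dimension. The sketch below is not the authors' code: it assumes two equally likely classes with Gaussian class-conditional densities (means at -1 and +1, unit standard deviation), a rectangular vicinity of half-width `eps` normalised to unit mass, and a uniform grid. All function names and parameter values are hypothetical choices made for illustration. It convolves each class density with the vicinity and compares the Bayes error before and after.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated on the grid x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def bayes_error(joint, dx):
    """Bayes error of joint densities joint[k, i] ~ p(x_i, y=k).

    At each x the best possible classifier keeps the largest joint density,
    so the error is the remaining probability mass, integrated over x.
    """
    return np.sum(joint.sum(axis=0) - joint.max(axis=0)) * dx


def convolve_with_vicinity(joint, eps, dx):
    """Convolve every class density with a rectangular vicinity [-eps, eps].

    The kernel is normalised to sum to one so total probability mass is
    preserved (an assumption of this sketch).
    """
    half = int(round(eps / dx))
    kernel = np.ones(2 * half + 1)
    kernel /= kernel.sum()
    return np.stack([np.convolve(p, kernel, mode="same") for p in joint])


# Assumed toy setup: grid wide enough that the densities vanish at the edges.
x = np.linspace(-8.0, 8.0, 3201)
dx = x[1] - x[0]
priors = [0.5, 0.5]
means = [-1.0, 1.0]
joint = np.stack([w * gaussian_pdf(x, m, 1.0) for w, m in zip(priors, means)])

eps = 0.5  # assumed vicinity half-width
beta_before = bayes_error(joint, dx)
beta_after = bayes_error(convolve_with_vicinity(joint, eps, dx), dx)

print(f"Bayes error on D:     {beta_before:.4f}")
print(f"Bayes error on D * v: {beta_after:.4f}")
```

In this toy setting the convolved distribution has more overlap between the classes, so its Bayes error is larger, which matches the direction of the inequality $\beta_{D'} \ge \beta_D$ stated in the TL;DR. Increasing `eps` widens the overlap further.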

Figures (8)

  • Figure 1: The picture at left may look like a cat. In fact, it can be the back of a dog.
  • Figure 2: 1D visualizations of vicinity function and Bayes error. A vicinity function is a rectangular function that returns a constant value if an input is in the vicinity. We use two PDFs of the truncated normal distribution to visualise the Bayes error.
  • Figure 3: Visualizing the convolution of distributions, the marginal contribution to Bayes error, and the bounds of robustness error and certified robust accuracy.
  • Figure 4: The conditional distribution before and after convolution for (a, b) Moons and (c, d) Chan. For both Moons and Chan, the $L^\infty$ vicinity size is set to $\epsilon=0.15$. We also report the Bayes error to show the change of inherent uncertainty in each distribution.
  • Figure 5: Upper bounds of robustness/accuracy and the state-of-the-art classifier's performance. The $L^\infty$ vicinity size for certified training/certified robust accuracy is $\epsilon = 0.15, 0.15, 0.1, 2/255$ for Moons, Chan, FashionMNIST, and CIFAR-10, respectively.
  • ...and 3 more figures

Theorems & Definitions (16)

  • definition 1: Classifier
  • definition 2: Adversarial example [goodfellow2014explaining, kurakin2016adversarial]
  • definition 3: Classifier robustness against perturbations
  • definition 4: Bayes error
  • theorem 1
  • proof
  • theorem 2
  • proof
  • theorem 3
  • proof
  • ...and 6 more