Certified Robust Accuracy of Neural Networks Are Bounded due to Bayes Errors
Ruihan Zhang, Jun Sun
TL;DR
The paper addresses whether there exists a fundamental limit to certified robust accuracy beyond empirical optimization. By framing robustness through Bayes error and showing that robust training effectively convolves the data distribution with a vicinity $v$, it proves that the Bayes error cannot decrease ($\beta_{D'} \ge \beta_D$) and derives an irreducible robustness bound $1 - \zeta^\sharp_D$ that depends on the data distribution and the vicinity. The authors validate the theory with experiments on Moons, Chan, FashionMNIST, and CIFAR-10, demonstrating that the theoretical upper bound on robustness is higher than current certified robust accuracies and that the bound tightens as the vicinity grows. These results imply a distributional component to the robustness-accuracy trade-off and motivate rethinking robustness objectives and evaluation strategies in practice.
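The inequality $\beta_{D'} \ge \beta_D$ can be illustrated on a toy discrete example. The sketch below is not the paper's construction: the 1-D grid, the probabilities, the uniform window used as the vicinity, and the function names `bayes_error` and `convolve_with_vicinity` are all illustrative assumptions. It computes the Bayes error as one minus the total mass of the most likely class at each point, then smears each class's mass over a window and recomputes it.

```python
import numpy as np

def bayes_error(joint):
    """Bayes error of a discrete joint distribution.

    joint[k, i] = P(y = k, x = i); all entries sum to 1.
    The optimal classifier picks the most likely class at each x,
    so its error is 1 minus the sum of the per-column maxima.
    """
    return 1.0 - joint.max(axis=0).sum()

def convolve_with_vicinity(joint, radius):
    """Spread each class's mass uniformly over 2*radius + 1 neighbouring cells.

    mode="full" keeps all of the mass, so the result still sums to 1.
    """
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    return np.stack([np.convolve(row, kernel, mode="full") for row in joint])

# Two perfectly separated classes on a 6-cell grid (hypothetical numbers).
joint = np.array([
    [0.2, 0.2, 0.1, 0.0, 0.0, 0.0],   # class 0
    [0.0, 0.0, 0.0, 0.1, 0.2, 0.2],   # class 1
])

smoothed = convolve_with_vicinity(joint, radius=1)

beta_original = bayes_error(joint)      # classes do not overlap
beta_smoothed = bayes_error(smoothed)   # classes now overlap near the boundary

print(f"Bayes error before convolution: {beta_original:.4f}")
print(f"Bayes error after convolution:  {beta_smoothed:.4f}")
```

In this toy case the original classes do not overlap, so the Bayes error starts at zero; after convolution the two classes share mass near the boundary and the error becomes strictly positive. The general direction follows because the maximum of an average never exceeds the average of the maxima, so smoothing can only reduce the total mass captured by the best per-point decision.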
Abstract
Adversarial examples pose a security threat to many critical systems built on neural networks. While certified training improves robustness, it also decreases accuracy noticeably. Despite various proposals for addressing this issue, the significant accuracy drop remains. More importantly, it is not clear whether there is a fundamental limit on achieving robustness whilst maintaining accuracy. In this work, we offer a novel perspective based on Bayes errors. By applying Bayes error to robustness analysis, we investigate the limit of certified robust accuracy, taking into account data distribution uncertainties. We first show that accuracy inevitably decreases in the pursuit of robustness because the Bayes error changes in the altered data distribution. Subsequently, we establish an upper bound for certified robust accuracy, considering the distribution of individual classes and their boundaries. Our theoretical results are empirically evaluated on real-world datasets and are shown to be consistent with the limited success of existing certified training results. For example, for CIFAR-10, our analysis yields an upper bound on certified robust accuracy of 67.49\%, whereas existing approaches have only been able to increase it from 53.89\% in 2017 to 62.84\% in 2023.
