Tight Robustness Certification through the Convex Hull of $\ell_0$ Attacks
Yuval Shapira, Dana Drachsler-Cohen
TL;DR
The paper tackles local robustness certification against few-pixel attacks, whose perturbation space is a non-convex $\ell_0$-ball. It shows that the convex hull of the $\ell_0$-ball around an input $\bar{x}$ is the intersection of the bounding box $\mathcal{D}$ with an asymmetrically scaled $\ell_1$-like polytope, and that the volumes of the hull and of this polytope become nearly equal as the input dimension grows. The authors derive a linear bound propagation that computes the exact minimum and maximum of linear functions over the hull, which gives tighter bounds than propagation over the box or over the $\ell_1$-like polytope, and they extend it to multi-channel inputs. The method is integrated into GPUPoly to boost CoVerD, the state-of-the-art complete $\ell_0$-robustness verifier, achieving speedups of $1.24$ to $7.07$ times (geometric mean $3.16$) on its most challenging benchmarks across MNIST, Fashion-MNIST, and CIFAR-10. The result is tighter and more scalable verification of classifiers against sparse adversarial perturbations.
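For concreteness, the hull description can be written out explicitly. This is a sketch under assumptions that the summary above does not state: the domain is $\mathcal{D}=[0,1]^d$, at most $t$ entries of $\bar{x}$ may change, and $0<\bar{x}_i<1$ for every $i$. The symbols $d$, $t$, and $\mathcal{B}_t$ are notation introduced here, and the paper's own statement may take a different form.

$$
\mathcal{B}_t(\bar{x}) = \{\, x \in \mathcal{D} : \|x-\bar{x}\|_0 \le t \,\}, \qquad
\operatorname{conv}\big(\mathcal{B}_t(\bar{x})\big) = \mathcal{D} \cap \Big\{\, x \in \mathbb{R}^d : \sum_{i=1}^{d} \max\Big(\frac{x_i-\bar{x}_i}{1-\bar{x}_i},\ \frac{\bar{x}_i-x_i}{\bar{x}_i}\Big) \le t \,\Big\}
$$

Each summand measures how much of the available range pixel $i$ has used, upward toward $1$ or downward toward $0$. The two directions are scaled differently whenever $\bar{x}_i \neq 1/2$, which is what makes the $\ell_1$-like polytope asymmetric.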
Abstract
Few-pixel attacks mislead a classifier by modifying a few pixels of an image. Their perturbation space is an $\ell_0$-ball, which is not convex, unlike $\ell_p$-balls for $p\geq1$. However, existing local robustness verifiers typically scale by relying on linear bound propagation, which captures convex perturbation spaces. We show that the convex hull of an $\ell_0$-ball is the intersection of its bounding box and an asymmetrically scaled $\ell_1$-like polytope. The volumes of the convex hull and this polytope are nearly equal as the input dimension increases. We then show a linear bound propagation that precisely computes bounds over the convex hull and is significantly tighter than bound propagations over the bounding box or our $\ell_1$-like polytope. This bound propagation scales the state-of-the-art $\ell_0$ verifier on its most challenging robustness benchmarks by 1.24x to 7.07x, with a geometric mean of 3.16x.
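To illustrate why bounds over the convex hull can be computed exactly, here is a minimal sketch for a single linear function $w \cdot x + b$. It relies on the standard fact that a linear function attains its extrema over a convex hull at points of the original set, so it suffices to pick the $t$ pixels whose move to an end of their range helps most. The function names, the single-channel treatment of each input entry, and the default domain $[0,1]^d$ are assumptions made for this example. This is not the GPUPoly or CoVerD implementation.

```python
def l0_hull_linear_bounds(w, b, x_bar, t, lo=0.0, hi=1.0):
    """Exact (min, max) of w.x + b over the convex hull of
    {x in [lo, hi]^d : at most t entries differ from x_bar}.

    Hypothetical helper: assumes lo <= x_bar[i] <= hi and treats
    every input entry as an independent single-channel pixel.
    """
    base = b + sum(wi * xi for wi, xi in zip(w, x_bar))
    gains, losses = [], []
    for wi, xi in zip(w, x_bar):
        to_hi = wi * (hi - xi)  # change if entry i is pushed to hi
        to_lo = wi * (lo - xi)  # change if entry i is pushed to lo
        gains.append(max(to_hi, to_lo))   # best increase, always >= 0
        losses.append(min(to_hi, to_lo))  # best decrease, always <= 0
    t = max(0, min(int(t), len(w)))
    # Spend the budget on the t most helpful entries in each direction.
    upper = base + sum(sorted(gains, reverse=True)[:t])
    lower = base + sum(sorted(losses)[:t])
    return lower, upper


def box_linear_bounds(w, b, lo=0.0, hi=1.0):
    """(min, max) of w.x + b over the bounding box [lo, hi]^d, for comparison."""
    upper = b + sum(max(wi * lo, wi * hi) for wi in w)
    lower = b + sum(min(wi * lo, wi * hi) for wi in w)
    return lower, upper


if __name__ == "__main__":
    w, b, x_bar = [2.0, -1.0, 0.5], 0.0, [0.5, 0.5, 0.5]
    print(l0_hull_linear_bounds(w, b, x_bar, t=1))
    print(box_linear_bounds(w, b))
```

With these toy values and $t=1$, the hull bounds are $(-0.25, 1.75)$ while the box bounds are $(-1.0, 2.5)$. The gap reflects the sparsity budget: the box lets every entry move at once, whereas the hull lets the equivalent of only $t$ entries move.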
