BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack
Viet Quoc Vo, Ehsan Abbasnejad, Damith C. Ranasinghe
TL;DR
This paper addresses the hard problem of crafting sparse adversarial perturbations under score-based black-box queries by reformulating the search into a lower-dimensional discrete space using a fixed synthetic color image $x'$ and a binary mask $u$. A Bayesian framework with a Dirichlet-parameterized search distribution learns pixel-level influence from the query history, guiding iterative sampling biased by a dissimilarity map to efficiently seek an $l_0$-constrained perturbation. The resulting BruSLeAttack achieves state-of-the-art attack success rates and query efficiency on ImageNet across CNNs and ViTs, and demonstrates practical impact by successfully attacking a real-world MLaaS system (Google Cloud Vision) and by testing existing defenses. Overall, the method provides a scalable, principled approach for rapid vulnerability assessment of vision models in black-box settings, with artifacts available on GitHub to facilitate reproducibility and further study.
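To make the mask-based reformulation above concrete, here is a minimal Python sketch. It assumes the perturbed image keeps the source pixel where $u = 0$ and takes the pixel of $x'$ where $u = 1$; this composition rule is an interpretation of the summary, not a quotation of the paper's equations. The function names (`apply_mask`, `l0_pixels`, `sample_mask`) are hypothetical, and `sample_mask` is a plain weighted draw standing in for the learned Dirichlet-parameterized distribution, not the paper's Bayesian update.

```python
import random

def apply_mask(x, x_synth, u):
    """Keep the source pixel where u == 0, take the synthetic pixel where u == 1."""
    return [s if m else p for p, s, m in zip(x, x_synth, u)]

def l0_pixels(x, x_adv):
    """Count the pixels that differ between the source and perturbed images."""
    return sum(1 for p, q in zip(x, x_adv) if p != q)

def sample_mask(weights, k, rng):
    """Draw a binary mask with exactly k ones, favouring larger (positive) weights.

    Hypothetical stand-in for sampling from a learned distribution: weighted
    sampling without replacement using the key r ** (1 / w) per pixel.
    """
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    chosen = {i for _, i in sorted(keys, reverse=True)[:k]}
    return [1 if i in chosen else 0 for i in range(len(weights))]

# Toy example: a 6-pixel "image" of RGB tuples and a fixed synthetic color image.
x = [(10, 10, 10)] * 6
x_synth = [(255, 0, 0)] * 6
u = sample_mask([1.0, 1.0, 5.0, 5.0, 1.0, 1.0], k=2, rng=random.Random(0))
x_adv = apply_mask(x, x_synth, u)
print(u, l0_pixels(x, x_adv))
```

Under this assumed rule, the number of perturbed pixels can never exceed the number of ones in $u$, so an $l_0$ budget becomes a constraint on the mask alone, and the search runs over binary masks rather than over pixel values.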
Abstract
We study the unique, less well understood problem of generating sparse adversarial samples simply by observing the score-based replies to model queries. Sparse attacks aim to discover a minimum number of perturbations to model inputs (that is, $l_0$-bounded perturbations) to craft adversarial examples and misguide model decisions. In contrast to query-based dense attacks against black-box models, constructing sparse adversarial perturbations is non-trivial, even when models return confidence scores to queries in a score-based setting, because such an attack leads to i) an NP-hard problem; and ii) a non-differentiable search space. We develop BruSLeAttack, a new, faster (more query-efficient) Bayesian algorithm for the problem. We conduct extensive attack evaluations, including an attack demonstration against a Machine Learning as a Service (MLaaS) offering, exemplified by Google Cloud Vision, and robustness testing of adversarial training regimes and a recent defense against black-box attacks. The proposed attack scales to achieve state-of-the-art attack success rates and query efficiency on standard computer vision tasks such as ImageNet across different model architectures. Our artefacts and DIY attack samples are available on GitHub. Importantly, our work facilitates faster evaluation of model vulnerabilities and raises vigilance about the safety, security and reliability of deployed systems.
