Accelerating Targeted Hard-Label Adversarial Attacks in Low-Query Black-Box Settings
Arjhun Swaminathan, Mete Akgün
TL;DR
The paper addresses targeted adversarial examples in the hard-label black-box setting under a very small query budget. It introduces the Targeted Edge-informed Attack (TEA), a two-stage method. The first stage uses a Sobel-derived soft edge mask to apply a global, edge-preserving perturbation that moves the target image toward the source image. The second stage refines the result locally with patch-based updates, so the attack relies on local decision-boundary geometry only once it becomes necessary. On ImageNet models, TEA is query-efficient and robust, needing roughly 70% fewer queries than prior state-of-the-art methods to reach meaningful distance reductions, and it benefits further from switching to CGBA-H in higher-query regimes. The results show the value of exploiting intrinsic target-image structure (edges) in hard-label attacks, with practical implications for both attackers and defenders in real-world, low-query scenarios.
Abstract
Deep neural networks for image classification remain vulnerable to adversarial examples: inputs altered by small, imperceptible perturbations that induce misclassifications. In black-box settings, where only the final prediction is accessible, crafting targeted attacks that aim to misclassify into a specific target class is particularly challenging due to narrow decision regions. Current state-of-the-art methods often exploit the geometric properties of the decision boundary separating a source image and a target image rather than incorporating information from the images themselves. In contrast, we propose Targeted Edge-informed Attack (TEA), a novel attack that utilizes edge information from the target image to carefully perturb it, thereby producing an adversarial image that is closer to the source image while still achieving the desired target classification. Our approach consistently outperforms current state-of-the-art methods across different models in low-query settings, using nearly 70% fewer queries, a scenario especially relevant in real-world applications with limited queries and black-box access. Furthermore, by efficiently generating a suitable adversarial example, TEA provides an improved target initialization for established geometry-based attacks.
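
The idea of an edge-informed global perturbation can be illustrated with a minimal sketch. This is not the authors' implementation: the summary above does not give TEA's mask formula, step schedule, or acceptance rule, so the function names (`sobel_soft_mask`, `edge_preserving_step`, `global_stage`), the step-halving rule, the grayscale input, and the `is_target_class` oracle are all assumptions made for illustration. The sketch only shows the general pattern of moving the target image toward the source while changing edge pixels less, and keeping a step only if a hard-label query still returns the target class.

```python
import numpy as np

def sobel_soft_mask(gray):
    """Soft edge mask in [0, 1] from a 2-D grayscale image (assumed form)."""
    p = np.pad(gray.astype(np.float64), 1, mode="edge")
    # Sobel responses computed with shifted slices of the padded image.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else np.zeros_like(mag)

def edge_preserving_step(adv, source, mask, step):
    """Move adv toward source; pixels with a high edge score move less."""
    weight = step * (1.0 - mask)
    return adv + weight * (source - adv)

def global_stage(target, source, is_target_class, step=0.2, max_queries=50):
    """Hypothetical greedy loop driven only by hard-label feedback."""
    mask = sobel_soft_mask(target)
    adv = target.astype(np.float64)
    queries = 0
    while queries < max_queries and step > 1e-3:
        cand = edge_preserving_step(adv, source, mask, step)
        queries += 1  # one hard-label query per candidate
        if is_target_class(cand):
            adv = cand   # still classified as the target: keep the step
        else:
            step /= 2    # crossed the boundary: retry with a smaller step
    return adv, queries
```

In practice `is_target_class` would wrap a single call to the black-box classifier and compare its predicted label with the target label. Any callable that takes an image array and returns a boolean works here, which makes the loop easy to exercise with a toy oracle before spending real queries.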
