L-AutoDA: Leveraging Large Language Models for Automated Decision-based Adversarial Attacks
Ping Guo, Fei Liu, Xi Lin, Qingchuan Zhao, Qingfu Zhang
TL;DR
The paper studies decision-based adversarial attacks, formalized as finding a perturbation $\delta$ with minimal $\|\delta\|_2$ such that $C(x_0+\delta) \neq C(x_0)$, where $C$ is the classifier and $x_0$ is the original input. It introduces L-AutoDA, which uses a Large Language Model within the Algorithm Evolution using Large Language Model (AEL) framework to automatically generate and test decision-based attack algorithms from scratch. On CIFAR-10 with a ResNet-18 backbone, L-AutoDA achieves smaller mean perturbation distances and higher attack success rates at large query budgets than baselines such as Boundary Attack and HopSkipJump. This suggests that LLM-driven automated algorithm design can rapidly produce potent attack strategies, and it points to new avenues for building robust AI defenses.
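To make the formulation concrete, the following minimal sketch shows the label-only query model on a toy problem. It is not the L-AutoDA algorithm: the two-dimensional linear `classify` function, the starting point `x_start`, and the helper `boundary_binary_search` are all assumptions introduced for illustration. The sketch shrinks a known adversarial point toward $x_0$ by binary search, a common building block of decision-based attacks, using only the predicted label.

```python
import math

def classify(x):
    """Toy hard-label classifier (an assumption): class 1 if x[0] + x[1] > 1, else 0."""
    return 1 if x[0] + x[1] > 1.0 else 0

def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))

def boundary_binary_search(x0, x_adv, oracle, steps=30):
    """Move a known adversarial point toward x0 using label-only queries."""
    y0 = oracle(x0)
    assert oracle(x_adv) != y0, "starting point must already be adversarial"
    lo, hi = 0.0, 1.0  # interpolation weight: 0 gives x0, 1 gives x_adv
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        x_mid = [a + mid * (b - a) for a, b in zip(x0, x_adv)]
        if oracle(x_mid) != y0:
            hi = mid  # still adversarial, so move closer to x0
        else:
            lo = mid  # label restored, so back off
    # hi always corresponds to a point the oracle labeled adversarial
    return [a + hi * (b - a) for a, b in zip(x0, x_adv)]

x0 = [0.2, 0.2]        # original input, classified as 0
x_start = [1.0, 1.0]   # known adversarial point, classified as 1
x_best = boundary_binary_search(x0, x_start, classify)
delta = [b - a for a, b in zip(x0, x_best)]
print("label changed:", classify(x_best) != classify(x0))
print("perturbation norm:", l2_norm(delta))
```

Each loop iteration costs one query to the classifier, which is why decision-based attacks are usually compared at a fixed query budget.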
Abstract
In the rapidly evolving field of machine learning, adversarial attacks present a significant challenge to model robustness and security. Decision-based attacks, which require only the model's final decision rather than detailed probabilities or scores, are particularly insidious and difficult to defend against. This work introduces L-AutoDA (Large Language Model-based Automated Decision-based Adversarial Attacks), a novel approach that leverages the generative capabilities of Large Language Models (LLMs) to automate the design of these attacks. By iteratively interacting with LLMs in an evolutionary framework, L-AutoDA efficiently designs competitive attack algorithms with little human effort. We demonstrate the efficacy of L-AutoDA on the CIFAR-10 dataset, showing significant improvements over baseline methods in both success rate and computational efficiency. Our findings underscore the potential of language models as tools for adversarial attack generation and highlight new avenues for the development of robust AI systems.
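The iterative generate, evaluate, and select cycle described in the abstract can be sketched roughly as follows. This is only an illustration under strong assumptions, not the authors' implementation. A candidate "algorithm" is reduced to a single step size for a toy random-walk attack. The hypothetical `propose_variant` function stands in for the LLM call and simply perturbs a parent numerically. The `evaluate` function scores candidates on a toy two-dimensional classifier rather than on CIFAR-10.

```python
import random

def evaluate(step_size, rng_seed=0, queries=200):
    """Fitness (lower is better): final L2 distance reached by a toy random-walk attack."""
    rng = random.Random(rng_seed)  # fixed seed keeps the score deterministic
    x0 = [0.2, 0.2]                # original input (toy assumption)
    x = [1.0, 1.0]                 # known adversarial starting point

    def is_adv(p):
        # Toy hard-label oracle: adversarial if on the far side of x + y = 1
        return p[0] + p[1] > 1.0

    def dist(p):
        return ((p[0] - x0[0]) ** 2 + (p[1] - x0[1]) ** 2) ** 0.5

    for _ in range(queries):
        cand = [c + rng.gauss(0.0, step_size) for c in x]
        # Accept only moves that stay adversarial and get closer to x0
        if is_adv(cand) and dist(cand) < dist(x):
            x = cand
    return dist(x)

def propose_variant(parent, rng):
    """Hypothetical stand-in for the LLM: returns a perturbed copy of the parent."""
    return max(1e-4, parent * rng.lognormvariate(0.0, 0.5))

def evolve(generations=10, pop_size=4, seed=0):
    """Elitist evolutionary loop: propose offspring, score them, keep the best."""
    rng = random.Random(seed)
    population = [rng.uniform(0.01, 1.0) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        offspring = [propose_variant(rng.choice(population), rng)
                     for _ in range(pop_size)]
        pool = sorted(population + offspring, key=evaluate)
        population = pool[:pop_size]  # survivors: the lowest-distance candidates
        history.append(evaluate(population[0]))
    return population[0], history

best_step, history = evolve()
print("best step size:", best_step)
print("best distance per generation:", history)
```

Because survivors are selected from parents and offspring together, the best score cannot get worse from one generation to the next. In the setting the abstract describes, the candidates would be attack programs proposed by an LLM rather than a scalar parameter.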
