L-AutoDA: Leveraging Large Language Models for Automated Decision-based Adversarial Attacks

Ping Guo; Fei Liu; Xi Lin; Qingchuan Zhao; Qingfu Zhang

TL;DR

The paper tackles decision-based adversarial attacks, formalized as finding a perturbation $\delta$ with minimal $\|\delta\|_2$ such that $C(x_0+\delta) \neq C(x_0)$, where $C$ is the target classifier and $x_0$ is the original input. It introduces L-AutoDA, which uses a Large Language Model within the Algorithm Evolution using LLMs (AEL) framework to automatically generate and test decision-based attack algorithms from scratch. On CIFAR-10 with a ResNet-18 backbone, L-AutoDA achieves smaller mean perturbation distances and higher attack success rates at large query budgets than baselines such as Boundary Attack and HopSkipJump. This demonstrates that LLM-driven automated algorithm design can rapidly produce potent attack strategies and highlights new avenues for robust AI defenses.

Abstract

In the rapidly evolving field of machine learning, adversarial attacks present a significant challenge to model robustness and security. Decision-based attacks, which require only feedback on a model's decision rather than detailed probabilities or scores, are particularly insidious and difficult to defend against. This work introduces L-AutoDA (Large Language Model-based Automated Decision-based Adversarial Attacks), a novel approach that leverages the generative capabilities of Large Language Models (LLMs) to automate the design of these attacks. By iteratively interacting with LLMs in an evolutionary framework, L-AutoDA designs competitive attack algorithms efficiently and with little human effort. We demonstrate the efficacy of L-AutoDA on the CIFAR-10 dataset, showing significant improvements over baseline methods in both success rate and computational efficiency. Our findings underscore the potential of language models as tools for adversarial attack generation and highlight new avenues for the development of robust AI systems.
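
To make the decision-based setting concrete, the following is a minimal sketch of a hard-label oracle that returns only whether the predicted label changed, together with a simple bisection that shrinks the $\ell_2$ distance of an already adversarial point. It is not the algorithm produced by L-AutoDA: the names `HardLabelOracle`, `bisect_to_boundary`, and `toy_predict` are hypothetical, and the toy two-dimensional classifier stands in for a real model.

```python
import math


class HardLabelOracle:
    """Exposes only the decision (adversarial or not) and counts queries."""

    def __init__(self, predict_label):
        self.predict_label = predict_label  # hypothetical stand-in for the target model
        self.queries = 0

    def is_adversarial(self, x, original_label):
        self.queries += 1
        return self.predict_label(x) != original_label


def l2_distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def interpolate(x0, x1, t):
    """Point a fraction t of the way from x0 to x1."""
    return [a + t * (b - a) for a, b in zip(x0, x1)]


def bisect_to_boundary(oracle, x0, x_start, original_label, steps=20):
    """Bisect the segment from x0 to an adversarial x_start using label feedback only."""
    lo, hi = 0.0, 1.0  # hi always corresponds to a point known to be adversarial
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if oracle.is_adversarial(interpolate(x0, x_start, mid), original_label):
            hi = mid
        else:
            lo = mid
    return interpolate(x0, x_start, hi)


# Toy classifier (assumption for illustration): label depends on the sign of x[0].
def toy_predict(x):
    return int(x[0] > 0)


x0 = [1.0, 0.0]
x_start = [-1.0, 0.0]  # assumed to be misclassified relative to x0
label0 = toy_predict(x0)
oracle = HardLabelOracle(toy_predict)
x_adv = bisect_to_boundary(oracle, x0, x_start, label0)
distance = l2_distance(x_adv, x0)
```

Each call to `is_adversarial` consumes one query, which is why decision-based attacks are compared at fixed query budgets.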

Paper Structure (20 sections, 4 equations, 4 figures, 4 tables, 3 algorithms)

Introduction
Related Works
Decision-based Adversarial Attacks
Automatically Devising Adversarial Attacks
Preliminaries
Decision-based Adversarial Attacks
Algorithm Evolution using LLMs
L-AutoDA: LLM-based Automated Decision-based Adversarial Attacks
Decision-based Attack Framework
L-AutoDA
Implementation
Experiments
Experimental Setup
Algorithm Generation
Attack Evaluation
...and 5 more sections

Figures (4)

Figure 1: Overview of the L-AutoDA Framework Methodology. This diagram delineates the two core components of our L-AutoDA framework: the algorithm generation and testing phases. In the algorithm generation phase, we adopt the AEL framework, leveraging LLMs to guide an evolutionary search process. In the testing phase, we employ existing decision-based attack testing code, integrating these algorithms into the attack program to validate their efficacy.
Figure 2: Performance Trajectories of L-AutoDA. This graph illustrates the comparative efficiency of our L-AutoDA framework against the human-best gradient-based (HopSkipJump Attack) and gradient-free (Boundary Attack) methods. L-AutoDA's candidates demonstrate a breakthrough in the 13th generation, surpassing the reference performance lines and continuing to enhance efficiency in subsequent generations.
Figure 3: Attack success rate at different numbers of queries for L-AutoDA-20 and other attack algorithms.
Figure 4: Distance between adversarial examples and original images at different numbers of queries for L-AutoDA-20 and other attack algorithms. The lines denote the mean over the test pairs, and the shaded areas represent 0.25 times the standard deviation.
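
The generate-and-test cycle described in the Figure 1 caption can be illustrated with a minimal evolutionary loop. This is a sketch under strong assumptions, not the AEL implementation: `propose` is a hypothetical stand-in for the LLM call that writes a new candidate from existing ones, `evaluate` is a hypothetical stand-in for the testing phase (lower score is better, as with perturbation distance), and the toy candidates are plain numbers rather than attack programs.

```python
import random


def evolve(initial_population, propose, evaluate,
           generations=20, population_size=4, seed=0):
    """Elitist generate-and-test loop: propose a child, score it, keep the best."""
    rng = random.Random(seed)
    population = sorted(initial_population, key=evaluate)[:population_size]
    history = [evaluate(population[0])]  # best score after each generation
    for _ in range(generations):
        parents = rng.sample(population, k=min(2, len(population)))
        child = propose(parents, rng)  # in the paper's setting, an LLM would do this
        population = sorted(population + [child], key=evaluate)[:population_size]
        history.append(evaluate(population[0]))
    return population[0], history


# Toy stand-ins (assumptions for illustration only).
def toy_propose(parents, rng):
    """Blend the parents and add a small random change."""
    mean = sum(parents) / len(parents)
    return mean + rng.gauss(0.0, 0.1)


def toy_evaluate(candidate):
    """Pretend score: distance from an arbitrary target value."""
    return abs(candidate - 0.3)


best, history = evolve([0.9, 0.7, 0.5, 1.2], toy_propose, toy_evaluate)
```

Because the best candidate is never discarded, the recorded best score cannot get worse from one generation to the next; a real run would replace both stand-ins with an LLM prompt and an attack benchmark.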
