Automatic Attack Discovery for Few-Shot Class-Incremental Learning via Large Language Models

Haidong Kang; Wei Wu; Hanling Wang

TL;DR

This work addresses security vulnerabilities in Few-Shot Class-Incremental Learning (FSCIL) by introducing ACraft, an LLM-driven framework that automatically designs adversarial attack algorithms tailored to FSCIL. ACraft uses a PPO-based closed loop to iteratively refine the attack strategies produced by a Large Language Model, optimizing a Multi-Attribute Utility fitness that balances attack impact against computational cost. Empirical results on miniImageNet, CIFAR-100, and CUB-200 show that ACraft degrades state-of-the-art FSCIL methods more than expert-designed attacks do while keeping attack costs low, and that it generalizes across multiple FSCIL frameworks. The approach points to a new direction for robust continual learning: automated discovery of adversarial techniques without human expert intervention.
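To make the idea of a fitness that trades attack impact against cost concrete, here is a minimal sketch in Python. It is not the paper's Multi-Attribute Utility definition, which this summary does not give. The weighted-sum form, the weights `w_impact` and `w_cost`, the `cost_budget` parameter, and the `AttackResult` type are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AttackResult:
    """Hypothetical record of one evaluated attack candidate."""
    clean_acc: float      # accuracy before the attack, in [0, 1]
    attacked_acc: float   # accuracy after the attack, in [0, 1]
    cost_seconds: float   # measured cost of running the attack


def mau_fitness(result, cost_budget=10.0, w_impact=0.7, w_cost=0.3):
    """Weighted-sum utility (assumed form): higher is better.

    impact       = relative accuracy drop caused by the attack, in [0, 1]
    cost_utility = 1 at zero cost, falling linearly to 0 at cost_budget
    """
    drop = max(0.0, result.clean_acc - result.attacked_acc)
    impact = drop / max(result.clean_acc, 1e-12)
    cost_utility = max(0.0, 1.0 - result.cost_seconds / cost_budget)
    return w_impact * impact + w_cost * cost_utility


# Usage: an attack that halves accuracy using half of the cost budget.
candidate = AttackResult(clean_acc=0.8, attacked_acc=0.4, cost_seconds=5.0)
score = mau_fitness(candidate)
```

In a closed loop of the kind described above, a score like this could serve as the reward signal for the next round of generation. How ACraft actually wires the fitness into PPO is not specified in this summary.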

Abstract

Few-shot class-incremental learning (FSCIL) is a realistic and challenging continual-learning paradigm in which a model must incrementally learn unseen classes from only a few training examples while avoiding catastrophic forgetting of the base classes. Previous efforts have centered primarily on developing more effective FSCIL approaches. By contrast, little attention has been devoted to the security issues of FSCIL. This paper aims to provide a holistic study of the impact of attacks on FSCIL. We first derive insights by systematically exploring how human expert-designed attack methods (i.e., PGD, FGSM) affect FSCIL. We find that those methods either fail to attack base classes or incur high labor costs because they rely on extensive expert knowledge. This highlights the need for an attack method specialized for FSCIL. Grounded in these insights, we propose ACraft, a simple yet effective method that leverages Large Language Models (LLMs) to automatically steer and discover optimal attack methods targeted at FSCIL without human experts. Moreover, to improve the reasoning between LLMs and FSCIL, we introduce a novel Proximal Policy Optimization (PPO) based reinforcement learning scheme that establishes positive feedback, so that the LLM generates better attack methods in each subsequent generation. Experiments on mainstream benchmarks show that ACraft significantly degrades the performance of state-of-the-art FSCIL methods and substantially outperforms human expert-designed attack methods while maintaining the lowest attack cost.
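For readers unfamiliar with the expert-designed baselines named above, the following is a minimal sketch of the standard FGSM step, x_adv = x + eps * sign(grad_x loss), applied to a toy logistic-regression classifier. The toy model, its weights, and the input are assumptions chosen for illustration. This is not the paper's experimental setup, and clipping to a valid input range is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)


def predict(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm_logistic(x, y, w, b, eps):
    """One FGSM step against binary cross-entropy loss.

    For logistic regression, d(loss)/dx = (p - y) * w, so the gradient
    is available in closed form and no autodiff library is needed.
    """
    p = predict(x, w, b)
    grad_x = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]


# Toy example (all values are illustrative assumptions).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1.0           # clean input, true label 1
x_adv = fgsm_logistic(x, y, w, b, eps=0.5)

clean_prob = predict(x, w, b)    # above 0.5: classified as 1
adv_prob = predict(x_adv, w, b)  # below 0.5: prediction flips to 0
```

PGD can be read as repeating a step of this kind several times with a smaller step size, projecting back into the eps-ball around the original input after each step.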
