ABounD: Adversarial Boundary-Driven Few-Shot Learning for Multi-Class Anomaly Detection

Runzhi Deng, Yundi Hu, Xinshuang Zhang, Zhao Wang, Xixi Liu, Wang-Zhou Dai, Caifeng Shan, Fang Zhao

TL;DR

ABounD tackles the challenge of few-shot, multi-class industrial anomaly detection by unifying semantic concept learning with boundary shaping. It introduces Dynamic Concept Fusion to generate class-adaptive prompts and Adversarial Boundary Forging to forge a boundary-friendly feature space via PGD-based perturbations, all optimized with a single Concept-Boundary Loss. The method achieves state-of-the-art results on MVTec-AD and VisA across 1-, 2-, and 4-shot settings, with strong localization and robustness across backbones. This boundary-driven, cross-modal framework offers practical advantages for scalable industrial anomaly detection with limited labeled data.

Abstract

Few-shot multi-class industrial anomaly detection remains a challenging task. Vision-language models need to be both category-adaptive and sharply discriminative, yet data scarcity often blurs the boundary between normal and abnormal states. This ambiguity leads to missed subtle defects and the rejection of atypical normal samples. We propose ABounD (Adversarial Boundary-Driven few-shot learning for multi-class anomaly detection), a unified learning framework that integrates semantic concept learning with decision boundary shaping. The Dynamic Concept Fusion (DCF) module produces class-adaptive prompts by fusing generalizable priors with class-specific cues, conditioned on image features. Meanwhile, Adversarial Boundary Forging (ABF) sculpts a more precise decision margin by generating boundary-level fence features via PGD-style perturbations. Training is conducted in a single stage under a Concept-Boundary Loss, where ABF provides the main supervisory signal and semantic-spatial regularizers stabilize the optimization. This synergy yields a decision boundary that closely follows normal data while preserving flexibility and robust semantic alignment. Experiments on the MVTec-AD and VisA datasets demonstrate state-of-the-art performance in few-shot multi-class anomaly detection.
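To make the idea of PGD-style fence features concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical linear normality margin (the feature's dot product with a "normal" prompt embedding minus its dot product with an "abnormal" one) and an L-infinity perturbation budget. The names `margin`, `pgd_fence_feature`, `eps`, `step`, and `n_steps` are illustrative assumptions, and any further objectives used by ABF are omitted.

```python
def margin(feat, t_normal, t_abnormal):
    """Hypothetical normality margin: positive means closer to the 'normal' prompt."""
    return sum(f * (n - a) for f, n, a in zip(feat, t_normal, t_abnormal))


def pgd_fence_feature(feat, t_normal, t_abnormal, eps=0.1, step=0.03, n_steps=5):
    """Push a normal feature toward the decision boundary with signed-gradient
    steps, projecting back into an L-infinity ball of radius eps after each step."""
    # For this assumed linear margin, the gradient w.r.t. the feature is constant.
    grad = [n - a for n, a in zip(t_normal, t_abnormal)]
    sign = [(g > 0) - (g < 0) for g in grad]
    adv = list(feat)
    for _ in range(n_steps):
        # Descend the margin: move the feature toward the abnormal side.
        adv = [x - step * s for x, s in zip(adv, sign)]
        # Projection: clip each coordinate to [feat - eps, feat + eps].
        adv = [min(max(x, f0 - eps), f0 + eps) for x, f0 in zip(adv, feat)]
    return adv


# Toy 3-dimensional embeddings (made-up values, for illustration only).
feat = [0.8, 0.1, 0.3]
t_normal = [1.0, 0.0, 0.2]
t_abnormal = [0.2, 0.9, 0.2]
fence = pgd_fence_feature(feat, t_normal, t_abnormal)
```

In this toy setting the perturbed feature stays within `eps` of the original but has a smaller margin, so it sits closer to the boundary than the clean normal sample. A real model would replace the constant gradient with one obtained by automatic differentiation through the scoring function.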

Paper Structure

This paper contains 48 sections, 20 equations, 9 figures, 16 tables.

Figures (9)

  • Figure 1: Decision boundary intuition for few-shot multi-class anomaly detection. (a) Original CLIP [radford2021learning] forms a loose boundary that overlaps anomalies, causing false negatives. (b) Existing few-shot methods [li2024promptad, zhu2024toward, zhou2023anomalyclip] often yield overly tight or vague boundaries: atypical but normal samples are rejected, while subtle defects are overlooked. (c) ABounD jointly learns Dynamic Concept Fusion (DCF) and Adversarial Boundary Forging (ABF). DCF produces class-adaptive prompts; ABF forges fence features via PGD [madry2017towards] to shape a sharp, normal-hugging decision boundary. The whole framework is optimized in a unified single stage under the Concept-Boundary Loss (CBL).
  • Figure 2: Overview of the ABounD framework, which introduces two core components: Dynamic Concept Fusion (DCF) and Adversarial Boundary Forging (ABF). DCF dynamically generates class-adaptive prompts by fusing generalizable Mixture-of-Experts (MoE) semantics with learnable class-specific cues, conditioned on the global visual representation. ABF guides feature learning by performing PGD-based adversarial perturbations near the decision boundary, with balance and dispersion objectives ensuring well-distributed hard examples. The entire framework is optimized in a unified single-stage manner under the Concept-Boundary Loss (CBL), enabling robust few-shot multi-class anomaly detection.
  • Figure 3: t-SNE visualization of the learned feature space for representative classes under the 1-shot setting: "bottle" and "metal_nut" from MVTec-AD, and "fryum" and "pcb1" from VisA.
  • Figure 4: Qualitative visualization of anomaly localization results on MVTec-AD and VisA under the 1-shot setting. Compared methods include PromptAD, IIPAD, and our ABounD. Our method achieves clearer and more precise localization boundaries, particularly on small or fine-grained anomaly regions.
  • Figure 5: Image/pixel AUROC on MVTec-AD under the 1-shot setting for different values of $\lambda_{\mathrm{psg}}$, $\lambda_{\mathrm{seg}}$, the dispersion $\beta$, and the number of adversarial generation steps.
  • ...and 4 more figures