
Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

Chaoqin Huang, Haoyan Guan, Aofan Jiang, Ya Zhang, Michael Spratling, Xinchao Wang, Yanfeng Wang

TL;DR

This article proposes a novel few-shot anomaly detection (FSAD) framework that is, to the best of the authors' knowledge, the first FSAD method requiring no model fine-tuning for novel categories, enabling a single model to be applied to all categories.

Abstract

Most existing anomaly detection (AD) methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, and therefore fails to meet the requirements of real-world applications. Inspired by how humans detect anomalies, namely by comparing a query image to known normal ones, this article proposes a novel few-shot AD (FSAD) framework. Using a training set of normal images from various categories, registration, which aims to align normal images of the same category, is leveraged as the proxy task for self-supervised category-agnostic representation learning. At test time, an image and its corresponding support set, consisting of a few normal images from the same category, are supplied, and anomalies are identified by comparing the registered features of the test image to the features of its corresponding support images. Such a setup enables the model to generalize to novel test categories. It is, to the best of our knowledge, the first FSAD method that requires no model fine-tuning for novel categories, enabling a single model to be applied to all categories. Extensive experiments demonstrate the effectiveness of the proposed method. In particular, it improves the current state-of-the-art (SOTA) for FSAD by 11.3% and 8.3% on the MVTec and MPDD benchmarks, respectively. The source code is available at https://github.com/Haoyan-Guan/CAReg.
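The test-time comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes features have already been extracted and registered, represents each image as a list of per-position feature vectors, and scores each position by its Euclidean distance to the mean support feature at that position. The paper pairs its features with statistics-based estimators rather than this simple mean; the function names `anomaly_map` and `image_score` are hypothetical.

```python
import math


def anomaly_map(test_feats, support_feats):
    """Per-position anomaly scores (illustrative simplification).

    test_feats:    list of P feature vectors, one per spatial position.
    support_feats: list of K such lists, one per normal support image.
    Each score is the Euclidean distance between the test feature and
    the mean of the support features at the same position.
    """
    k = len(support_feats)
    scores = []
    for p, feat in enumerate(test_feats):
        dim = len(feat)
        # Mean support feature at position p (assumed "normal" prototype).
        mean = [sum(s[p][d] for s in support_feats) / k for d in range(dim)]
        scores.append(math.sqrt(sum((feat[d] - mean[d]) ** 2 for d in range(dim))))
    return scores


def image_score(test_feats, support_feats):
    """Image-level score: the largest per-position score."""
    return max(anomaly_map(test_feats, support_feats))


# Toy example: K = 2 support images, P = 3 positions, 2-dimensional features.
support = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
]
normal_test = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
anomalous_test = [[1.0, 0.0], [3.0, 5.0], [1.0, 1.0]]

print(anomaly_map(normal_test, support))     # all positions match the support set
print(anomaly_map(anomalous_test, support))  # position 1 deviates
```

Because the score depends only on the support set supplied at test time, the same function can be applied to any category without retraining, which is the property the abstract emphasizes.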

Paper Structure (27 sections, 8 equations, 7 figures, 12 tables)


Figures (7)

  • Figure 1: (a) One-model-per-category paradigm for vanilla AD and FSAD. (b) One-model-all-category paradigm for the proposed category-agnostic FSAD.
  • Figure 2: (Left) An overview of the architecture of the proposed category-agnostic registration (CAReg) network. Given a training image and a set of support images, features are extracted by three convolutional residual blocks ($C_1$, $C_2$, and $C_3$), each followed by a spatial transformation module ($S_1$, $S_2$, and $S_3$). A feature registration module is leveraged and supervised by a registration loss (Fig. 3). (Right) A spatial transformation module, containing a localization network and a differentiable sampler, is used to learn the mappings, enabling the model to transform features to facilitate feature registration.
  • Figure 3: The model architecture of the feature registration module. Given paired registered features, the parameter-shared encoder and predictor are leveraged and supervised by a registration loss.
  • Figure 4: AD methods and their corresponding memory cost, distribution modeling complexity, and inference complexity. Each method combines registration-trained features extracted by CAReg with one of three statistics-based normal distribution estimators: (a) PaDiM [defard2021padim], (b) OrthoAD [orthoad], and (c) PatchCore [patchcore]. Symbols used in the complexity equations: $D$ is the sum of the channel dimensions of the three STN outputs, $K$ is the size of the given support set, $D'$ is a constant with $D' \ll D$, and $\gamma$ denotes the proportion of the original memory bank that has been sampled. Other symbols are explained in the text.
  • Figure 5: Visualization of Wasserstein distances and augmentation selection results. A darker color corresponds to a larger Wasserstein distance. Augmentations marked with $\times$ were removed from the augmentation set for that category.
  • ...and 2 more figures
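
The Wasserstein-distance-based augmentation selection mentioned in the Figure 5 caption can be sketched as follows. The caption does not state how the distances are computed or what rule removes an augmentation, so everything here is an assumption for illustration: features are reduced to 1D samples of equal size, the distance is the empirical 1-Wasserstein distance (mean absolute difference of sorted samples), and an augmentation is kept only if its distance to the original sample is at most a hypothetical `threshold`. The names `wasserstein_1d` and `select_augmentations` are likewise hypothetical.

```python
def wasserstein_1d(a, b):
    """Empirical 1-Wasserstein distance between two equal-size 1D samples.

    For equal-size samples this equals the mean absolute difference
    between the sorted values.
    """
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def select_augmentations(original, augmented, threshold):
    """Keep augmentations whose sample stays close to the original.

    original:  1D sample from un-augmented images (assumed representation).
    augmented: dict mapping augmentation name to a 1D sample.
    threshold: hypothetical cut-off; larger distances are treated as
               distribution-shifting and the augmentation is dropped.
    """
    return [
        name
        for name, sample in augmented.items()
        if wasserstein_1d(original, sample) <= threshold
    ]


# Toy example with made-up numbers.
original = [0.0, 1.0, 2.0, 3.0]
augmented = {
    "flip": [3.0, 2.0, 1.0, 0.0],        # same values, different order
    "rotate": [10.0, 11.0, 12.0, 13.0],  # shifted distribution
}
kept = select_augmentations(original, augmented, threshold=1.0)
print(kept)
```

In this toy case the reordered sample has distance 0 and is kept, while the shifted sample has distance 10 and is dropped.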