AdapTS: Lightweight Teacher-Student Approach for Multi-Class and Continual Visual Anomaly Detection

Manuel Barusco, Davide Dalle Pezze, Francesco Borsatti, Gian Antonio Susto

Abstract

Visual Anomaly Detection (VAD) is crucial for industrial inspection, yet most existing methods are limited to single-category scenarios and fail to address the multi-class and continual learning demands of real-world environments. While Teacher-Student (TS) architectures are efficient, they remain unexplored in the continual setting. To bridge this gap, we propose AdapTS, a unified TS framework designed for multi-class and continual settings and optimized for edge deployment. AdapTS eliminates the need for two different architectures by using a single shared frozen backbone and injecting lightweight trainable adapters into the student pathway. Training is enhanced via a segmentation-guided objective and synthetic Perlin noise, while a prototype-based task identification mechanism dynamically selects adapters at inference with 99% accuracy. Experiments on MVTec AD and VisA demonstrate that AdapTS matches the performance of existing TS methods across multi-class and continual learning scenarios while drastically reducing memory overhead. Our lightest variant, AdapTS-S, requires only 8 MB of additional memory, 13x less than STFPM (95 MB), 48x less than RD4AD (360 MB), and 149x less than DeSTSeg (1120 MB), making it a highly scalable solution for edge deployment in complex industrial environments.
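
The prototype-based task identification mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how prototypes are built or compared, so everything below is an assumption made for illustration: each task is represented by the mean of its training embeddings, the query is matched by cosine similarity, and the names `PrototypeTaskSelector`, `register_task`, and `identify` are hypothetical rather than taken from the paper.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class PrototypeTaskSelector:
    """Hypothetical selector: one mean-embedding prototype per task (assumption)."""

    def __init__(self):
        self.prototypes = {}

    def register_task(self, task_id, embeddings):
        # Assumption: the prototype is the mean of backbone embeddings of
        # normal training images for this task.
        self.prototypes[task_id] = mean_vector(embeddings)

    def identify(self, embedding):
        # Return the task whose prototype is most similar to the query.
        if not self.prototypes:
            raise ValueError("no tasks registered")
        return max(
            self.prototypes,
            key=lambda task_id: cosine_similarity(embedding, self.prototypes[task_id]),
        )


# Toy 3-dimensional embeddings standing in for frozen-backbone features.
selector = PrototypeTaskSelector()
selector.register_task("bottle", [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]])
selector.register_task("cable", [[0.0, 1.0, 0.2], [0.1, 0.9, 0.0]])
predicted_task = selector.identify([0.8, 0.1, 0.0])
```

In a setup like this, the predicted task identifier would be used to pick which set of student adapters to load at inference, so only one small prototype per task needs to be stored alongside the adapters.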

Paper Structure (21 sections, 7 equations, 3 figures, 4 tables)

This paper contains 21 sections, 7 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Comparison of the methods in terms of Image ROC on the MVTec dataset in the Multi-Class scenario, Model Memory (MB) required by each VAD method, and Inference VRAM peak represented by the size of the circles.
  • Figure 2: Overview of AdapTS architecture. (a) Main Block of the method with frozen weights (blue blocks), trainable weights (yellow blocks), and difference feature maps output (blue arrow). (b) During training, synthetic anomalies are introduced alongside a segmentation network to guide learning and separability. (c) In the inference phase, the segmentation network is discarded for efficient anomaly detection.
  • Figure 3: AdapTS anomaly detection examples on the MVTec Dataset. The first row contains an anomalous image per category, the second row the ground-truth anomaly mask and the third row the anomaly heatmap produced by AdapTS.
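
The captions for Figures 2 and 3 describe difference feature maps between the frozen and adapted pathways and a resulting anomaly heatmap. As a rough sketch of that idea, the code below computes a per-location cosine distance between teacher and student feature maps and takes the maximum as an image-level score. This scoring rule is an assumption borrowed from common teacher-student VAD practice; the captions do not state the exact difference measure or aggregation AdapTS uses, and the function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; treated as 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def anomaly_map(teacher_features, student_features):
    """Per-location distance between two H x W grids of channel vectors."""
    return [
        [cosine_distance(t, s) for t, s in zip(teacher_row, student_row)]
        for teacher_row, student_row in zip(teacher_features, student_features)
    ]


def image_score(heatmap):
    """Assumed image-level score: the largest per-location distance."""
    return max(max(row) for row in heatmap)


# Toy 2 x 2 feature maps with 2 channels per location.
teacher = [
    [[3.0, 4.0], [1.0, 0.0]],
    [[0.0, 2.0], [0.0, 1.0]],
]
# The student matches the teacher everywhere except the bottom-right location.
student = [
    [[3.0, 4.0], [1.0, 0.0]],
    [[0.0, 2.0], [1.0, 0.0]],
]
heatmap = anomaly_map(teacher, student)
score = image_score(heatmap)
```

Locations where the student reproduces the teacher's features get a distance near zero, while the mismatched location stands out, which is the behavior the heatmaps in Figure 3 visualize. In practice the map would also be upsampled to the input resolution, a step omitted here.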