PaSTe: Improving the Efficiency of Visual Anomaly Detection at the Edge

Manuel Barusco, Francesco Borsatti, Davide Dalle Pezze, Francesco Paissan, Elisabetta Farella, Gian Antonio Susto

TL;DR

This work introduces Partially Shared Teacher-student (PaSTe), a novel algorithm that addresses the high resource demands of the existing Student Teacher Feature Pyramid Matching (STFPM) approach. PaSTe reduces memory and computation requirements, enabling Visual Anomaly Detection (VAD) deployment on resource-constrained edge devices.

Abstract

Visual Anomaly Detection (VAD) has gained significant research attention for its ability to identify anomalous images and pinpoint the specific areas responsible for the anomaly. A key advantage of VAD is its unsupervised nature, which eliminates the need for costly and time-consuming labeled data collection. However, despite its potential for real-world applications, the literature has given limited focus to resource-efficient VAD, particularly for deployment on edge devices. This work addresses this gap by leveraging lightweight neural networks to reduce memory and computation requirements, enabling VAD deployment on resource-constrained edge devices. We benchmark the major VAD algorithms within this framework and demonstrate the feasibility of edge-based VAD using the well-known MVTec dataset. Furthermore, we introduce a novel algorithm, Partially Shared Teacher-student (PaSTe), designed to address the high resource demands of the existing Student Teacher Feature Pyramid Matching (STFPM) approach. Our results show that PaSTe decreases the inference time by 25%, while reducing the training time by 33% and peak RAM usage during training by 76%. These improvements make the VAD process significantly more efficient, laying a solid foundation for real-world deployment on edge devices.

Paper Structure

This paper contains 18 sections, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Inference time (x-axis) versus performance (y-axis). Each color represents a different AD method, while each symbol represents a different tiny backbone. The marker size represents the total memory required.
  • Figure 2: Image examples from the MVTec AD dataset. Each object is shown as a normal sample (in green) and an anomalous sample (in red).
  • Figure 3: (a) Scheme of a feature-based approach. Each method uses a feature extractor, and an AD algorithm then operates on the resulting representation. The AD algorithm can be any feature-based method, such as PatchCore, PaDiM, CFA, or STFPM. (b) In the edge version, the AD algorithm remains the same, but the feature extractor is replaced with a less expensive one, significantly reducing the memory and computation needed.
  • Figure 4: Comparison between the existing STFPM approach and ours (PaSTe). STFPM requires storing two architectures and performing the backward pass over the entire architecture. PaSTe reduces the required memory and computation to a minimum.
  • Figure 5: Overall plot of our benchmark. For every backbone, the four layer groups (L: low, M: middle, H: high, and E: equivalent; see the layer-groups table) are considered for every category, and each is represented by a single bar. The height of the bar represents the average pixel-level F1 score obtained by the different AD models using that layer group. Since every AD model is different, the color of the bar represents the variance of the pixel-level F1 score. The number reported above every histogram is the F1 score of the maximum bar. The final category, named "all", represents the average performance of the different layer groups across all categories.
  • ...and 1 more figure