Unified Anomaly Detection methods on Edge Device using Knowledge Distillation and Quantization

Sushovan Jena, Arya Pulkit, Kajal Singh, Anoushka Banerjee, Sharad Joshi, Ananth Ganesh, Dinesh Singh, Arnav Bhavsar

TL;DR

This work tackles the practical problem of anomaly detection in manufacturing by evaluating unified multi-class models suitable for edge deployment. It compares three lightweight knowledge-distillation (KD) based methods (Uninformed Students, Reverse Distillation (RD), and STFPM) in FP32 and quantized regimes, using post-training quantization (PTQ) and quantization-aware training (QAT) on a CPU and an NVIDIA Jetson Xavier NX. Key findings show that RD and STFPM generalize well across the 15 classes, achieving AUROC close to their one-class counterparts while offering lower latency, especially STFPM. Quantization, particularly TensorRT-based PTQ and QAT, substantially reduces model size and inference time. Random-normal calibration provides notable gains for unsupervised tasks, and QAT recovers much of the accuracy lost to PTQ, bringing INT8 models near FP32 levels for the stronger methods.

Abstract

With the rapid advances in deep learning and smart manufacturing in Industry 4.0, there is an imperative for high-throughput, high-performance, and fully integrated visual inspection systems. Most anomaly detection approaches evaluated on defect detection datasets, such as MVTec AD, employ one-class models, which require fitting a separate model for each class. In contrast, unified models eliminate the need for per-class models and significantly reduce cost and memory requirements. In this work, we therefore study a unified multi-class setup. Our experiments show that multi-class models perform on par with one-class models on the standard MVTec AD dataset. This indicates that separate object/class-wise models may not be needed when the object classes differ significantly from each other, as is the case for the dataset considered. Furthermore, we deploy three different unified lightweight architectures on a CPU and an edge device (NVIDIA Jetson Xavier NX). We analyze the quantized multi-class anomaly detection models in terms of latency and memory requirements for deployment on the edge device, and compare quantization-aware training (QAT) and post-training quantization (PTQ) in terms of performance at different precision widths. In addition, we explore two different calibration methods required in the post-training scenario and show that one of them performs notably better, highlighting the importance of calibration for unsupervised tasks. The performance drop caused by PTQ is largely compensated by QAT, which yields performance on par with the original 32-bit floating-point models for two of the models considered.
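To make the role of calibration in PTQ concrete, the following is a minimal sketch of symmetric INT8 quantization in plain Python. It is not the TensorRT pipeline used in the paper. The max-abs calibration rule, the function names (`calibrate_scale`, `quantize_int8`, `dequantize`), and the sample sizes are all assumptions chosen for illustration. The calibration values are drawn from a standard normal distribution only to echo the random-normal calibration mentioned in the summary.

```python
import random


def calibrate_scale(calib_values):
    """Max-abs calibration (an assumed rule): map the largest observed magnitude to 127."""
    return max(abs(v) for v in calib_values) / 127.0


def quantize_int8(x, scale):
    """Round to the nearest quantization step and clamp to the symmetric INT8 range."""
    return max(-127, min(127, round(x / scale)))


def dequantize(q, scale):
    """Map an INT8 code back to an approximate floating-point value."""
    return q * scale


rng = random.Random(0)

# Hypothetical calibration set: random-normal samples standing in for activations.
calib = [rng.gauss(0.0, 1.0) for _ in range(1024)]
scale = calibrate_scale(calib)

# Quantize and dequantize fresh values to see the round-trip error.
activations = [rng.gauss(0.0, 1.0) for _ in range(256)]
codes = [quantize_int8(a, scale) for a in activations]
recovered = [dequantize(q, scale) for q in codes]
max_error = max(abs(a - r) for a, r in zip(activations, recovered))

print(f"scale={scale:.5f}, max round-trip error={max_error:.5f}")
```

Values that fall inside the calibrated range are reproduced to within half a quantization step, while values outside it are clipped. This is why the choice of calibration data affects the accuracy of a PTQ model.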

Paper Structure (24 sections, 2 figures, 7 tables)


Figures (2)

  • Figure 3: Graphical comparison of class-wise AUROC of FP-32, FP-16, and INT-8 models on the NVIDIA Jetson NX.
  • Figure 4: Anomaly map visualization of TRT FP-16 (first row) vs. INT-8 (second row) results of STFPM on the NVIDIA Jetson NX. The first column is the original image, the second column is the corresponding ground-truth mask, and the third column is the anomaly map superimposed on the original image.