Pyramid-based Mamba Multi-class Unsupervised Anomaly Detection

Nasar Iqbal, Niki Martinel

TL;DR

This work addresses multi-class unsupervised anomaly detection and localization, focusing on accurately localizing small defects while maintaining computational efficiency. It introduces Pyramid-based Mamba, which combines a pre-trained encoder, a synthetic anomaly generator, and a Context-aware State Space (CSS) module that fuses global long-range dependencies from state-space models with local spatial cues via Locality-Enhanced Convolutional blocks. The core innovations are the Pyramidal Scanning Strategy (PSS), enabling multi-scale, pyramid-aware processing, and the CSS module that integrates PSS with multi-kernel convolutions to capture both global and local context. Extensive experiments on the MVTec-AD dataset show competitive or superior performance across image- and pixel-level metrics, with notable improvements in AP and AU-PRO and strong localization accuracy across diverse industrial categories. The approach offers a scalable, robust solution for industrial anomaly detection and localization, enabling precise defect localization while preserving efficiency.

Abstract

Recent advances in convolutional neural networks (CNNs) and transformer-based methods have improved anomaly detection and localization, but challenges persist in precisely localizing small anomalies. While CNNs face limitations in capturing long-range dependencies, transformer architectures often suffer from substantial computational overhead. We introduce a state space model (SSM)-based Pyramidal Scanning Strategy (PSS) for multi-class anomaly detection and localization, a novel approach designed to address the challenge of small anomaly localization. Our method captures fine-grained details at multiple scales by integrating the PSS with a pre-trained encoder for multi-scale feature extraction and a feature-level synthetic anomaly generator. An improvement of $+1\%$ in AP for multi-class anomaly localization and of $+1\%$ in AU-PRO on the MVTec benchmark demonstrates our method's superiority in precise anomaly localization across diverse industrial scenarios. The code is available at https://github.com/iqbalmlpuniud/Pyramid Mamba.

Paper Structure

This paper contains 15 sections, 8 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Overview of the proposed Pyramid-Mamba approach, which uses a teacher-student network to reconstruct multi-scale synthetic anomalous features. Each Context-aware State Space (CSS) module consists of a Pyramidal Scanning Strategy (PSS), which captures local and global interactions, and parallel multi-kernel convolution operations, which capture local information. The anomaly map is the sum of the multi-scale reconstruction errors.
  • Figure 2: The proposed pyramid-based scanning applies a Selective Scan Module (SSM) to the entire image, then divides the image into four primary patches, each processed separately by the SSM. Each primary patch is further subdivided into four sub-patches, each again processed separately by the SSM.
  • Figure 3: Qualitative visualization of pixel-level anomaly segmentation on the MVTec dataset.
  • Figure 4: Ablation study on multi-class anomaly localization for different pyramid levels. Results are on the MVTec dataset.
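The pyramid-based scanning described in the Figure 2 caption (whole image, then four primary patches, then four sub-patches per primary patch) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_quadrants`, `flatten_row_major`, and `pyramid_sequences` are hypothetical, the row-major scan order inside each patch is an assumption, and the selective scan (SSM) that would consume each sequence is deliberately left out. The sketch only shows how one 2D map could be turned into per-level groups of 1D sequences.

```python
from typing import List

Grid = List[List[float]]


def split_quadrants(grid: Grid) -> List[Grid]:
    """Split a 2D grid into four patches: top-left, top-right,
    bottom-left, bottom-right. Odd sizes put the extra row/column
    in the bottom/right patches."""
    h, w = len(grid), len(grid[0])
    if h < 2 or w < 2:
        raise ValueError("patch is too small to split further")
    hm, wm = h // 2, w // 2
    return [
        [row[:wm] for row in grid[:hm]],
        [row[wm:] for row in grid[:hm]],
        [row[:wm] for row in grid[hm:]],
        [row[wm:] for row in grid[hm:]],
    ]


def flatten_row_major(grid: Grid) -> List[float]:
    """Flatten a patch into a 1D sequence (assumed row-major scan order)."""
    return [value for row in grid for value in row]


def pyramid_sequences(grid: Grid, levels: int = 3) -> List[List[List[float]]]:
    """Return, for each pyramid level, the list of 1D sequences to scan.

    Level 0 holds the whole map, level 1 its four primary patches,
    level 2 the four sub-patches of every primary patch, and so on.
    """
    current = [grid]
    out = []
    for level in range(levels):
        out.append([flatten_row_major(patch) for patch in current])
        if level < levels - 1:
            # Subdivide every patch of this level for the next, finer level.
            current = [q for patch in current for q in split_quadrants(patch)]
    return out


if __name__ == "__main__":
    # Toy 4x4 "feature map" with values 0..15, one channel only.
    toy = [[r * 4 + c for c in range(4)] for r in range(4)]
    seqs = pyramid_sequences(toy, levels=3)
    print([len(level) for level in seqs])  # [1, 4, 16]
    print(seqs[1][0])                      # [0, 1, 4, 5]
```

In this sketch each sequence at a given level would be handed to a sequence model independently, so coarse levels see long-range context while fine levels restrict the scan to a small neighbourhood. How the per-level outputs are merged back into a single feature map is not specified in the captions above and is therefore not modelled here.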