Commonality in Few: Few-Shot Multimodal Anomaly Detection via Hypergraph-Enhanced Memory

Yuxuan Lin, Hanjing Yan, Xuan Tong, Yang Chang, Huanzhen Wang, Ziheng Zhou, Shuyong Gao, Yan Wang, Wenqiang Zhang

TL;DR

This work tackles few-shot multimodal industrial anomaly detection by introducing CIF, a hypergraph-based framework that extracts intra-class structural commonality from limited normal examples. CIF constructs semantic-aware hypergraphs to guide memory-bank formation (SGMS) and employs a training-free bidirectional hypergraph message passing (Bi-TF-MP) to align test features with memory features, followed by hyperedge-guided memory search (HGMS) to reduce false positives. Across MVTec 3D-AD and Eyecandies, CIF achieves state-of-the-art I-AUROC in 1-, 2-, and 4-shot settings, demonstrating strong transferability and robustness with limited data. The approach emphasizes structured, higher-order relationships to improve few-shot detection and localization in industrial contexts, with publicly available code for replication and extension.

Abstract

Few-shot multimodal industrial anomaly detection is a critical yet underexplored task, offering the ability to quickly adapt to complex industrial scenarios. In few-shot settings, the small number of training samples often fails to cover the diverse patterns present in test samples. This challenge can be mitigated by extracting structural commonality from the few training samples that are available. In this paper, we propose a novel few-shot unsupervised multimodal industrial anomaly detection method based on structural commonality, CIF (Commonality In Few). To extract intra-class structural information, we employ hypergraphs, which are capable of modeling higher-order correlations, to capture the structural commonality within training samples, and use a memory bank to store this intra-class structural prior. First, we design a semantic-aware hypergraph construction module tailored for single-semantic industrial images, from which we extract common structures to guide the construction of the memory bank. Second, we use a training-free hypergraph message passing module to update the visual features of test samples, reducing the distribution gap between test features and features in the memory bank. Third, we propose a hyperedge-guided memory search module, which utilizes structural information to assist the memory search process and reduce the false positive rate. Experimental results on the MVTec 3D-AD dataset and the Eyecandies dataset show that our method outperforms the state-of-the-art (SOTA) methods in few-shot settings. Code is available at https://github.com/Sunny5250/CIF.
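
To make the idea of training-free hypergraph message passing more concrete, the following is a minimal sketch and not the authors' module. It assumes a generic two-step mean aggregation (nodes to hyperedges, then hyperedges back to nodes) with no learned weights. The function name `message_pass`, the mixing weight `alpha`, and the handling of nodes that belong to no hyperedge are all illustrative assumptions.

```python
import numpy as np

def message_pass(X, H, alpha=0.5):
    """Generic weight-free hypergraph smoothing (illustrative assumption).

    X: (N, D) node features, H: (N, E) binary incidence matrix.
    """
    X = np.asarray(X, dtype=float)
    H = np.asarray(H, dtype=float)
    edge_deg = H.sum(axis=0)  # number of nodes in each hyperedge
    node_deg = H.sum(axis=1)  # number of hyperedges each node belongs to

    # Step 1 (nodes -> hyperedges): each hyperedge takes the mean of its members.
    E = (H.T @ X) / np.maximum(edge_deg, 1.0)[:, None]

    # Step 2 (hyperedges -> nodes): each node takes the mean of its hyperedges.
    agg = (H @ E) / np.maximum(node_deg, 1.0)[:, None]

    # Nodes outside every hyperedge receive no message and keep their features.
    agg = np.where(node_deg[:, None] > 0, agg, X)

    # Blend the original feature with the aggregated context.
    return (1.0 - alpha) * X + alpha * agg
```

With `alpha = 0` the features are unchanged, and larger values pull each node toward the mean of the hyperedges it belongs to, which is one simple way to inject structured context without any training.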

Paper Structure

This paper contains 31 sections, 7 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: The Main Idea of CIF. Different from (a) PatchCore and (b) GraphCore, our HyperAD (c) extracts higher-order correlations among patch features through a hypergraph and uses training-free message passing to obtain patch features enriched with structured contextual information.
  • Figure 2: The pipeline of CIF. Our CIF contains four important parts: (1) Semantic-Aware Hypergraph Construction (SAHC), which constructs hypergraphs for single-semantic samples based on clustering. (2) Structure-Guided Memory Sampling (SGMS), which uses structural commonality in the hypergraph to guide the construction and compression of the memory bank. (3) Bidirectional Training-Free Hypergraph Message Passing (Bi-TF-MP), which performs bidirectional message passing between test samples and the memory bank to reduce their distribution gap. (4) Hyperedge-Guided Memory Search (HGMS), which uses a hyperedge-guided two-stage search to reduce the false positive rate of detection.
  • Figure 3: The pipeline of hypergraph construction. We use a clustering algorithm to compute the cluster centers among node features, which serve as hyperedge centers. We then calculate the similarity between each hyperedge center and all foreground node features, and determine the hyperedge membership of each node using a predefined threshold, resulting in the hypergraph incidence matrix.
  • Figure 4: Comparison: Greedy Coreset Sampling (Top) vs. Our Sampling (Bottom) at different sampling rates (SR). Red points represent features before sampling, blue points represent features after sampling, and black crosses represent hyperedge features.
  • Figure 5: Visualization of anomaly scores for each category of MVTec 3D-AD and Eyecandies (in multimodal, 1-shot setting).
  • ...and 5 more figures
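
The construction described in the Figure 3 caption (cluster centers as hyperedge centers, similarity to every foreground node, thresholding into an incidence matrix) can be sketched as follows. This is a hedged illustration under stated assumptions, not the released implementation: the caption does not name the clustering algorithm or the similarity measure, so a plain spherical k-means and cosine similarity are substituted here. The function name `build_incidence` and the parameters `num_hyperedges`, `threshold`, `iters`, and `seed` are hypothetical, and the input is assumed to contain only non-zero foreground features.

```python
import numpy as np

def build_incidence(features, num_hyperedges=3, threshold=0.8, iters=20, seed=0):
    """Return (H, centers); H[i, j] = 1 if node i belongs to hyperedge j."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)

    # L2-normalise so that dot products equal cosine similarities.
    x = features / np.linalg.norm(features, axis=1, keepdims=True)

    # Assumed clustering step: spherical k-means initialised from random nodes.
    centers = x[rng.choice(len(x), size=num_hyperedges, replace=False)]
    for _ in range(iters):
        assign = np.argmax(x @ centers.T, axis=1)
        for j in range(num_hyperedges):
            members = x[assign == j]
            if len(members) > 0:
                mean = members.mean(axis=0)
                norm = np.linalg.norm(mean)
                if norm > 0:
                    centers[j] = mean / norm

    # Similarity of every node to every hyperedge center, then thresholding.
    sim = x @ centers.T
    H = (sim >= threshold).astype(np.float32)
    return H, centers
```

Because membership is decided by a threshold rather than by the nearest center, a node may belong to several hyperedges or to none. That overlap is what distinguishes the resulting hypergraph from a hard clustering.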