MVREC: A General Few-shot Defect Classification Model Using Multi-View Region-Context
Shuai Lyu, Rongchen Zhang, Zeqi Ma, Fangjian Liao, Dongmei Mo, Waikeung Wong
TL;DR
This work tackles the poor generalization of few-shot defect multi-classification (FSDMC) across industrial datasets by introducing MVREC, which combines region-context-aware features from AlphaCLIP with multi-view augmentation to preserve contextual information. The framework supports both a training-free Zip-Adapter classifier and a fine-tuned Zip-Adapter-F classifier, and is evaluated under 1-, 3-, and 5-shot settings on MVTec-FS, a new benchmark derived from MVTec AD with 46 defect types. Experiments on MVTec-FS and four other datasets show that MVREC achieves state-of-the-art FSDMC performance, highlighting the value of region-context and multi-view cues for robust defect classification. For industrial quality control, the approach enables generalizable, context-aware defect recognition from few labeled samples, and MVTec-FS provides a benchmark for future FSDMC research.
Abstract
Few-shot defect multi-classification (FSDMC) is an emerging trend in quality control within industrial manufacturing. However, current FSDMC research often lacks generalizability because it focuses on specific datasets. Additionally, defect classification relies heavily on contextual information within images, and existing methods fall short of extracting this information effectively. To address these challenges, we propose a general FSDMC framework called MVREC, which offers two primary advantages: (1) it extracts general features for defect instances by incorporating the pre-trained AlphaCLIP model, and (2) it uses a region-context framework to enhance defect features by leveraging mask region input and multi-view context augmentation. Furthermore, we introduce few-shot Zip-Adapter(-F) classifiers within the model, which cache the visual features of the support set and perform few-shot classification. We also introduce MVTec-FS, a new FSDMC benchmark based on MVTec AD, which includes 1228 defect images with instance-level mask annotations and 46 defect types. Extensive experiments on MVTec-FS and four additional datasets demonstrate MVREC's effectiveness in general defect classification and its ability to incorporate contextual information to improve classification performance. Code: https://github.com/ShuaiLYU/MVREC
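The abstract states only that the Zip-Adapter(-F) classifiers cache the visual features of the support set and classify from that cache; it does not give their internals. As a rough illustration of the general idea, the sketch below shows a generic training-free key-value cache classifier over multi-view features. It is not the authors' implementation: the function names, the view-averaging step, the exponential affinity, and the `beta` sharpness parameter are all assumptions made for this example, and random arrays stand in for AlphaCLIP features.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def build_cache(support_feats, support_labels, num_classes):
    """Build a key-value cache from a labeled support set.

    support_feats:  (N, V, D) array, V augmented views per support sample
                    (hypothetical stand-in for region-context features).
    support_labels: (N,) integer class ids.
    Returns keys (N, D) and one-hot values (N, num_classes).
    """
    # Assumption: views are fused by averaging their unit-norm features.
    keys = l2_normalize(l2_normalize(support_feats).mean(axis=1))
    values = np.eye(num_classes)[support_labels]
    return keys, values


def cache_logits(query_feats, keys, values, beta=5.5):
    """Score queries against the cache without any training.

    query_feats: (Q, V, D) array of multi-view query features.
    beta:        hypothetical sharpness of the similarity kernel.
    Returns (Q, num_classes) class scores.
    """
    queries = l2_normalize(l2_normalize(query_feats).mean(axis=1))
    similarity = queries @ keys.T                     # cosine similarity, (Q, N)
    affinity = np.exp(-beta * (1.0 - similarity))     # near 1 for close matches
    return affinity @ values                          # sum affinities per class
```

Calling `cache_logits(...).argmax(axis=1)` yields the predicted class per query. A fine-tuned variant could, under the same assumptions, treat `keys` as learnable parameters initialized from the cache, which is one plausible reading of the training-free versus fine-tuned distinction drawn above.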
