Small Object Few-shot Segmentation for Vision-based Industrial Inspection

Zilong Zhang, Chang Niu, Zhibin Zhao, Xingwu Zhang, Xuefeng Chen

TL;DR

This work addresses the challenge of locating ultra-small defects in vision-based industrial inspection (VII), where traditional closed-set segmentation and generic anomaly detection fall short. It introduces SOFS, a small object–focused few-shot segmentation framework that preserves target semantics through a non-resizing training strategy and prototype intensity downsampling, and curbs background false positives with an abnormal prior and a mixed normal Dice loss. The approach achieves state-of-the-art performance on VII benchmarks for both few-shot semantic segmentation and few-shot anomaly detection, and demonstrates robustness under domain shifts. In practice, the method enables rapid, label-efficient localization of targeted defects without retraining, reducing the need for extensive defect collections.

Abstract

Vision-based industrial inspection (VII) aims to locate defects quickly and accurately. Supervised learning under a closed-set setting and industrial anomaly detection, the two common paradigms in VII, each face a problem in practical applications: the former requires varied and sufficient defect samples, which are difficult to obtain, while the latter cannot locate specific defects. To solve these problems, in this paper, we focus on few-shot semantic segmentation (FSS), which can locate unseen defects conditioned on a few annotations without retraining. Compared to common objects in natural images, the defects in VII are small. This brings two problems to current FSS methods: (1) distortion of target semantics and (2) many false positives for backgrounds. To alleviate these problems, we propose a small object few-shot segmentation (SOFS) model. The key idea for alleviating (1) is to avoid resizing the original image and to correctly indicate the intensity of target semantics. SOFS realizes this idea via a non-resizing procedure and prototype intensity downsampling of support annotations. To alleviate (2), we design an abnormal prior map in SOFS to guide the model to reduce false positives and propose a mixed normal Dice loss to preferentially prevent the model from predicting false positives. SOFS can perform either FSS or few-shot anomaly detection, as determined by the support masks. Diverse experiments substantiate the superior performance of SOFS. Code is available at https://github.com/zhangzilongc/SOFS.
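
To make the first problem concrete, the following is a minimal sketch of why the way a small support mask is downsampled matters. It is not the SOFS implementation: it assumes that "prototype intensity downsampling" can be approximated by area averaging, so that each feature-map cell stores the fraction of defect pixels it covers. The function names (`area_downsample`, `nearest_downsample`, `masked_average_prototype`), the 64×64 image, and the 16-pixel patch size are all hypothetical choices for illustration.

```python
import numpy as np


def area_downsample(mask: np.ndarray, patch: int) -> np.ndarray:
    """Average each non-overlapping patch x patch block of a binary mask.

    Assumption: stands in for an intensity-preserving downsampling; each
    output cell holds the fraction of defect pixels inside its block.
    """
    h, w = mask.shape
    if h % patch or w % patch:
        raise ValueError("mask size must be a multiple of the patch size")
    blocks = mask.reshape(h // patch, patch, w // patch, patch)
    return blocks.mean(axis=(1, 3))


def nearest_downsample(mask: np.ndarray, patch: int) -> np.ndarray:
    """Keep only the top-left pixel of each block (one nearest-neighbour convention)."""
    return mask[::patch, ::patch].astype(float)


def masked_average_prototype(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Weighted average of features (C, h, w) using a soft mask (h, w)."""
    total = weights.sum()
    if total == 0:
        # The mask vanished during downsampling: no target signal is left.
        return np.zeros(features.shape[0])
    return (features * weights).sum(axis=(1, 2)) / total


# A 3x3-pixel defect inside a 64x64 support image (hypothetical sizes).
mask = np.zeros((64, 64))
mask[17:20, 33:36] = 1.0
patch = 16

soft = area_downsample(mask, patch)     # defect survives as a small fraction
hard = nearest_downsample(mask, patch)  # defect is lost entirely

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))   # stand-in for backbone features
prototype = masked_average_prototype(features, soft)
lost_prototype = masked_average_prototype(features, hard)
```

In this toy case the nearest-neighbour mask is all zeros, so no prototype can be formed, while the area-averaged mask keeps a weight of 9/256 in the single cell that contains the defect.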

Paper Structure (25 sections, 6 equations, 17 figures, 10 tables)

Figures (17)

  • Figure 1: Small-defect predictions of different models conditioned on the same support image. Red solid boxes indicate ground-truth labels, and yellow dashed boxes indicate predictions. Because the defects are small, please enlarge the electronic version for easier viewing.
  • Figure 2: The proposed small object few-shot segmentation model. $\hat{\textbf{M}}^{\rm{s}}$ denotes the downsampled support mask. $\textbf{M}^{\rm{s}}_{\rm{a}}$ and $\textbf{M}^{\rm{s}}_{\rm{s}}$ denote the abnormal prior map and the semantic prior map, respectively, and $\textbf{p}$ denotes a prototype feature. Best viewed in the electronic version.
  • Figure 3: Different downsampling schemes for the small object mask. Each patch on the patch-wise original mask represents a region that is highly correlated with the corresponding feature on the feature map.
  • Figure 4: Results for different values of $\eta$ and $\alpha$ in the mixed normal Dice loss.
  • Figure 5: Results as the test input size varies in the sliding window mechanism.
  • ...and 12 more figures
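
The sliding window mechanism named in the Figure 5 caption is not detailed in this overview. As a rough illustration of how an image can be processed at its native resolution without resizing, the sketch below tiles the image with overlapping windows and averages the per-window predictions. The helper names (`window_starts`, `stitch_predictions`), the window and stride values, and the averaging rule are assumptions for illustration, not the authors' procedure.

```python
import numpy as np


def window_starts(length: int, window: int, stride: int) -> list[int]:
    """Start offsets that cover [0, length) with windows of the given size."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        # Add a final window flush with the border so no pixels are skipped.
        starts.append(length - window)
    return starts


def stitch_predictions(image: np.ndarray, window: int, stride: int, predict) -> np.ndarray:
    """Run `predict` on each crop and average overlapping outputs.

    `predict` is a hypothetical callable mapping a crop of shape (h, w, ...)
    to a score map of shape (h, w); the image itself is never resized.
    """
    height, width = image.shape[:2]
    total = np.zeros((height, width))
    count = np.zeros((height, width))
    for top in window_starts(height, window, stride):
        for left in window_starts(width, window, stride):
            bottom = min(top + window, height)
            right = min(left + window, width)
            total[top:bottom, left:right] += predict(image[top:bottom, left:right])
            count[top:bottom, left:right] += 1
    return total / count


# Toy check with an identity "model" on a 700x900 single-channel image.
image = np.arange(700 * 900, dtype=float).reshape(700, 900)
stitched = stitch_predictions(image, window=512, stride=256, predict=lambda crop: crop)
```

Averaging is only one way to merge overlapping windows; taking the per-pixel maximum is a common alternative when missed defects are costlier than false positives.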