Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation

Bin-Bin Gao, Xiaochen Chen, Zhongyi Huang, Congchong Nie, Jun Liu, Jinxiang Lai, Guannan Jiang, Xi Wang, Chengjie Wang

TL;DR

This work addresses the missing-label issue in instance-level few-shot detection and segmentation, which induces biased foreground–background classification. It introduces a simple decoupled classifier with two parallel heads, forming $L_{CLS} = L_{CLS}^{fg} + L_{CLS}^{bg}$, where the negative head uses a constrained logits construction $\bar{x}_i = m_i x_i$ and probabilities $\bar{p}_i$ to mitigate mislabel-driven bias. By applying this approach to FSOD/FSIS and generalized FSOD/FSIS on PASCAL VOC and MS-COCO, the method yields significant performance gains without adding parameters or computation, outperforming strong baselines such as DeFRCN. The results demonstrate the method’s robustness across low-shot regimes and its applicability to both detection and instance segmentation, with code released for reproducibility. Overall, the paper advances a practical, theory-grounded strategy for coping with incomplete labels in few-shot vision tasks.

Abstract

This paper focuses on few-shot object detection (FSOD) and instance segmentation (FSIS), which require a model to quickly adapt to novel classes with a few labeled instances. Existing methods suffer severely from biased classification because of the missing-label issue, which arises naturally in instance-level few-shot scenarios and which we are the first to formally describe. Our analysis suggests that the standard classification head of most FSOD or FSIS models needs to be decoupled to mitigate the biased classification. Therefore, we propose an embarrassingly simple but effective method that decouples the standard classifier into two heads. These two individual heads independently handle clear positive samples and noisy negative samples, the latter caused by the missing labels. In this way, the model can effectively learn novel classes while mitigating the effects of noisy negative samples. Without bells and whistles, our model, which adds no computation cost or parameters, consistently outperforms its baseline and the state of the art by a large margin on PASCAL VOC and MS-COCO benchmarks for FSOD and FSIS tasks. The code is available at https://csgaobb.github.io/Projects/DCFS.
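
The decoupled loss summarized in the TL;DR, $L_{CLS} = L_{CLS}^{fg} + L_{CLS}^{bg}$, can be illustrated with a minimal pure-Python sketch. It is not the authors' released implementation. The function names, the per-sample `image_classes` argument, and the final averaging are hypothetical choices made for illustration. The sketch also assumes that the mask $m_i$ is 1 for the background class and for classes labeled in the current image, and that $\bar{p}_i$ is a softmax restricted to those entries; the text above only names $\bar{x}_i = m_i x_i$ and $\bar{p}_i$ without giving the full formula.

```python
import math

def log_softmax(logits, mask=None):
    """Log-softmax over the entries whose mask is 1 (all entries if mask is None)."""
    if mask is None:
        mask = [1] * len(logits)
    kept = [x for x, m in zip(logits, mask) if m]
    peak = max(kept)  # subtract the max for numerical stability
    log_z = peak + math.log(sum(math.exp(x - peak) for x in kept))
    return [x - log_z if m else float("-inf") for x, m in zip(logits, mask)]

def foreground_loss(logits, label):
    """Positive head: ordinary cross-entropy over all classes."""
    return -log_softmax(logits)[label]

def background_loss(logits, image_classes, bg_index):
    """Negative head (assumed form): cross-entropy toward background,
    computed only over the background and the classes labeled in this image."""
    mask = [1 if (i == bg_index or i in image_classes) else 0
            for i in range(len(logits))]
    return -log_softmax(logits, mask)[bg_index]

def decoupled_loss(samples, bg_index):
    """samples: list of (logits, label, image_classes) tuples, one per proposal.
    Averaging over all samples is an assumption of this sketch."""
    total = 0.0
    for logits, label, image_classes in samples:
        if label == bg_index:
            total += background_loss(logits, image_classes, bg_index)
        else:
            total += foreground_loss(logits, label)
    return total / len(samples)
```

Under these assumptions, a proposal labeled as background contributes no gradient to classes that are not annotated in its image, so an unlabeled instance of such a class is not pushed toward background. For example, `background_loss([2.0, 0.5, 1.0, 0.0], {1}, 3)` depends only on the logits at indices 1 and 3.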

Paper Structure

This paper contains 16 sections, 13 equations, 8 figures, 11 tables.

Figures (8)

  • Figure 1: The proportion of missing instances in the training set for FSOD and gFSOD on the (a) PASCAL VOC and (b) MS-COCO datasets. The missing rate is high at every shot, especially for gFSOD. In (c), two "person" instances are present in the one-shot labeled image, but they are mislabeled.
  • Figure 2: Illustration of the gradient of the decoupling classifier, where the blue arrow represents the gradient direction. (a) illustrates the gradient propagation on the positive head, and (b) shows that the gradient propagation is constrained to the few-shot labeled class (e.g., dog) and the background, so the biased classification is mitigated. Best viewed in color and zoomed in.
  • Figure 3: Comparison of $\mathrm{mRecall}$ and $\mathrm{Recall}$ for the proposed decoupling classifier (DC) and the standard classification head (CE) under the FSIS and gFSIS settings. The mean and standard deviation are computed over all 10 seeds for each shot. Best viewed in color and zoomed in.
  • Figure 4: Visualization results of our method and the strong baseline (Mask-DeFRCN) on MS-COCO validation images. Best viewed in color and zoomed in.
  • Figure 5: Comparison of the proportion of missing labeled instances for FSOD/FSIS and gFSOD/gFSIS on the MS-COCO dataset. The proportions are high on almost all shots across the different seeds, except for seed0, and the gFSOD/gFSIS setting generally has higher missing rates than FSOD/FSIS.
  • ...and 3 more figures