Organizing Background to Explore Latent Classes for Incremental Few-shot Semantic Segmentation

Lianlei Shan, Wenzhang Zhou, Wei Li, Xingyu Ding

TL;DR

This work tackles incremental Few-shot Semantic Segmentation (iFSS) by introducing OINet, which explicitly reserves embedding-space capacity for novel classes during base training by means of multiple background prototypes. Novel classes then inherit the embedding space of selected prototypes through KM-based prototype matching and a dynamic weight imprinting scheme, which preserves the old distribution while enabling rapid learning from few examples. The approach is guided by dispersion and compactness objectives for the background prototypes and includes an inheritance loss that aligns novel-class heads with the prototype space. Experiments on Pascal-VOC and COCO demonstrate state-of-the-art performance across 1/2/5-shot settings, highlighting the practical impact of background-space organization for robust, memory-efficient iFSS.

Abstract

The goal of incremental Few-shot Semantic Segmentation (iFSS) is to extend pre-trained segmentation models to new classes using a few annotated images, without access to the old training data. When novel classes are learned incrementally, the data distribution of the old classes is destroyed, leading to catastrophic forgetting. Meanwhile, the novel classes have only a few samples, making it impossible for models to learn satisfactory representations of them. For the iFSS problem, we propose a network called OINet, i.e., the background embedding space Organization and prototype Inherit Network. Specifically, when training on base classes, OINet uses multiple classification heads for the background and sets multiple sub-class prototypes to reserve embedding space for latent novel classes. When learning novel classes incrementally, we propose a strategy that selects the sub-class prototypes best matching the novel classes currently being learned and lets the novel classes inherit the selected prototypes' embedding space. This operation allows the novel classes to be registered in the embedding space using few samples without affecting the distribution of the base classes. Results on Pascal-VOC and COCO show that OINet achieves a new state of the art.
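
The selection step described in the abstract (picking the background sub-class prototypes that best match the novel classes) can be illustrated with a small sketch. This is not the authors' implementation: the cosine-similarity score, the one-to-one constraint, and the names `l2_normalize` and `match_prototypes` are assumptions made for illustration, and an exhaustive search over assignments stands in for whatever matching algorithm the paper actually uses. It is only practical for a handful of novel classes and prototypes.

```python
import itertools

import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def match_prototypes(novel_means, bg_prototypes):
    """Assign each novel class a distinct background prototype.

    novel_means:   (N, D) mean feature per novel class (hypothetical input).
    bg_prototypes: (K, D) background sub-class prototypes, with K >= N.
    Returns a list of N prototype indices maximizing total cosine similarity.
    """
    sim = l2_normalize(novel_means) @ l2_normalize(bg_prototypes).T  # (N, K)
    n, k = sim.shape
    if n > k:
        raise ValueError("need at least as many prototypes as novel classes")
    best_score, best_assignment = -np.inf, None
    # Exhaustive search over one-to-one assignments (assumed stand-in solver).
    for perm in itertools.permutations(range(k), n):
        score = sum(sim[i, j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_assignment = score, perm
    return list(best_assignment)
```

The one-to-one constraint matters when two novel classes both lie closest to the same prototype: the search then trades off the two similarities instead of letting both classes claim the same region of the embedding space.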

Paper Structure (18 sections, 7 equations, 2 figures, 6 tables)

Figures (2)

  • Figure 1: Illustration of the proposed method. (a) denotes the learning of base classes. Multiple sub-class prototypes are set for the background. Through embedding space organization, the sub-class prototypes and base classes are dispersed from each other. (b) represents the incremental learning of novel classes. First, the prototypes that best match the novel classes are selected from the sub-class prototypes of the background, and then novel classes are made to inherit the embedding spaces of the selected sub-class prototypes. Therefore, the whole process not only ensures that the novel classes can be learned (i.e., registered to the embedding space) but also makes the learning of novel classes not affect the distribution of base classes.
  • Figure 2: The overall structure of the proposed method. The process is divided into two stages: learning the base classes, shown in (a), and incrementally learning the novel classes, shown in (b). The colors are consistent with those in Figure 1, i.e., orange and yellow represent the base classes, gray represents the background, and green represents the novel class currently being learned. In base training, the difference from a standard setup is that the background has multiple classification heads ($K$ of them), and the weights of these heads are obtained by clustering the background features into $K$ clusters. In incremental novel training, the weights of the segmentation heads for the novel classes are computed from the novel-class training data and the sub-class prototypes obtained during base training. DWI denotes dynamic weight imprinting.
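
The two stages in the Figure 2 caption can be sketched roughly as follows. The sketch assumes plain k-means (Lloyd's algorithm) as the clustering step and a fixed blend of the novel-class mean feature with the selected prototype as the imprinting rule. Neither choice is confirmed by the text here: the function names, the `alpha` parameter, and the blend itself are hypothetical, and the paper's dynamic weight imprinting (DWI) may compute the weights differently.

```python
import numpy as np


def unit(v, eps=1e-12):
    """Return v scaled to unit L2 norm."""
    return v / (np.linalg.norm(v) + eps)


def kmeans_prototypes(bg_feats, k, iters=20, seed=0):
    """Cluster background features (M, D) into k sub-class prototypes (k, D).

    Plain Lloyd iterations; assumed here as the clustering step.
    """
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct background feature vectors.
    centers = bg_feats[rng.choice(len(bg_feats), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance of every feature to every center: (M, k).
        dists = ((bg_feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = bg_feats[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def imprint_weight(novel_feats, prototype, alpha=0.5):
    """Hypothetical imprinting rule: blend the normalized mean of the
    few-shot novel features (S, D) with the selected prototype (D,)."""
    mean_feat = novel_feats.mean(axis=0)
    return unit(alpha * unit(mean_feat) + (1.0 - alpha) * unit(prototype))
```

With `alpha=1.0` the head weight reduces to the normalized mean of the few-shot features, while smaller values keep the weight closer to the region of the embedding space that the prototype reserved during base training.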