Class Similarity Transition: Decoupling Class Similarities and Imbalance from Generalized Few-shot Segmentation

Shihong Wang, Ruixun Liu, Kaiyu Li, Jiawei Jiang, Xiangyong Cao

TL;DR

This paper proposes a similarity transition matrix to guide the learning of novel classes with base class knowledge, and extends the probability transition matrix to address class imbalance as well as overfitting to the support set.

Abstract

In Generalized Few-shot Segmentation (GFSS), a model is trained on a large corpus of base-class samples and then adapted using a limited number of novel-class samples. This paper focuses on the relevance between base and novel classes and improves GFSS in two aspects: 1) mining the similarity between base and novel classes to promote the learning of novel classes, and 2) mitigating the class imbalance caused by the difference in volume between the support set and the training set. Specifically, we first propose a similarity transition matrix to guide the learning of novel classes with base class knowledge. Then, we apply the Label-Distribution-Aware Margin (LDAM) loss and Transductive Inference to the GFSS task to address class imbalance and overfitting to the support set. In addition, by extending the probability transition matrix, the proposed method mitigates the catastrophic forgetting of base classes when learning novel classes. With a simple training phase, our method can be applied to any segmentation network trained on base classes. We validate our method on the adapted version of OpenEarthMap. It outperforms all existing GFSS baselines by 3% to 7% and ranked second in the OpenEarthMap Land Cover Mapping Few-Shot Challenge at the time this paper was completed. Code: https://github.com/earth-insights/ClassTrans
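To make the LDAM idea mentioned in the abstract concrete, here is a minimal sketch of the standard LDAM formulation, in which each class receives a margin proportional to n_j^(-1/4) (n_j being the class sample count) and the margin is subtracted from the true-class logit before the cross-entropy. The function names, the `max_margin` rescaling, and the `scale` parameter are illustrative assumptions, not taken from the paper or its repository, which may apply the loss differently (for example, per pixel on segmentation logits).

```python
import math

def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4).

    Rarer classes get larger margins. The margins are rescaled so the
    largest equals max_margin (a hypothetical hyperparameter here).
    """
    raw = [1.0 / (n ** 0.25) for n in class_counts]
    factor = max_margin / max(raw)
    return [r * factor for r in raw]

def ldam_loss(logits, target, margins, scale=1.0):
    """Cross-entropy computed after lowering the target logit by its margin."""
    adjusted = list(logits)
    adjusted[target] -= margins[target]
    scaled = [scale * z for z in adjusted]
    # Numerically stable log-sum-exp.
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return log_sum_exp - scaled[target]

# Example: a frequent class (10000 samples) and a rare class (16 samples).
margins = ldam_margins([10000, 16])
loss_with_margin = ldam_loss([2.0, 1.0], target=1, margins=margins)
loss_plain = ldam_loss([2.0, 1.0], target=1, margins=[0.0, 0.0])
```

Because the rare class carries the larger margin, the same logits yield a higher loss for it than plain cross-entropy would, which pushes the model toward a wider decision margin on under-represented classes.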

Paper Structure (24 sections, 14 equations, 5 figures, 3 tables)


Figures (5)

  • Figure 1: The motivation of our proposed method and the experimental results. A) The top-left figure shows that real-world datasets exhibit a long-tailed distribution, i.e., severe class imbalance. The dataset shown here is from [datasetsoem], an adapted version of which is used in the challenge. B) The top-right figure shows class relevance by counting the novel-class pixels from the support set that the model misclassifies after the training phase. The wider and darker the band, the more pixels are wrongly classified as base classes. C) The bottom figure shows the effectiveness of our proposed components in tackling the class imbalance and class similarity problems.
  • Figure 2: Pipeline of our proposed framework. $\otimes$ denotes the matrix product and $\oplus$ denotes element-wise addition. The pipeline consists of two phases: the training phase and the few-shot learning phase. A) First, the training phase learns a shared feature extractor $f_\theta$ and classification weights $W^b_t$ for the base classes, as shown in the figure. Novel classes are treated as background during this phase. B) Then, the few-shot learning phase learns an MLP that predicts the transition matrix, along with classification weights $W^b_f$ and $W^n_f$ for the base and novel classes, respectively. C) With $W^b_t$, $W^b_f$, $W^n_f$, and the MLP learned, the Transition Branch and the Classification Branch produce the transition logits and the classification logits, respectively. By merging the two, the model gives the final predictions for both base and novel classes on a given image.
  • Figure 3: Verifying the existence of catastrophic forgetting. G.Truth refers to the ground-truth segmentation mask of a given image. "w/o Distillation" represents DIaM [diam] trained without Knowledge Distillation. "DIaM" represents our adaptation of DIaM [diam] to our dataset.
  • Figure 4: The quantitative result of $\mathcal{L}_\pi$. $\lambda$ represents the weight of $\mathcal{L}_\pi$ in the final objective function. a) $\lambda=0$ indicates that $\mathcal{L}_\pi$ is removed. b, c) $\lambda=1$ and $\lambda=4$ indicate that the weight of $\mathcal{L}_\pi$ is set to $1$ and $4$, respectively.
  • Figure 5: The heat map of the Transition Matrix after the few-shot learning phase.
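
The merging step described in the Figure 2 caption can be sketched roughly as follows. This is only one plausible reading of the caption, under stated assumptions: that the transition logits are the base-class logits multiplied ($\otimes$) by a transition matrix of shape (base classes) × (base + novel classes), and that the result is added element-wise ($\oplus$) to the classification logits. The function names, shapes, and toy values are hypothetical; the paper's actual implementation may differ.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def merge_logits(base_logits, transition, class_logits):
    """Combine the two branches (assumed shapes, for illustration only).

    base_logits:  (pixels, n_base)            logits from the base classifier
    transition:   (n_base, n_base + n_novel)  transition matrix
    class_logits: (pixels, n_base + n_novel)  logits from the classification branch
    """
    transition_logits = matmul(base_logits, transition)   # matrix product
    return [[t + c for t, c in zip(t_row, c_row)]         # element-wise addition
            for t_row, c_row in zip(transition_logits, class_logits)]

# Toy example: one pixel, two base classes, one novel class.
base_logits = [[2.0, 0.0]]
transition = [[1.0, 0.0, 0.5],   # base class 0 partly transitions to the novel class
              [0.0, 1.0, 0.0]]
class_logits = [[0.0, 0.0, 1.0]]
merged = merge_logits(base_logits, transition, class_logits)
```

In this toy case the novel class receives evidence both from its own classification logit and from the base class it resembles, which is the intuition behind using base class knowledge to guide novel-class learning.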