Controllable Relation Disentanglement for Few-Shot Class-Incremental Learning

Yuan Zhou, Richang Hong, Yanrong Guo, Lin Liu, Shijie Hao, Hanwang Zhang

TL;DR

This paper proposes a simple yet effective approach, dubbed ConTrollable Relation-disentangLed Few-Shot Class-Incremental Learning (CTRL-FSCIL), to address FSCIL from a new perspective: enhancing FSCIL via disentangling spurious relations between categories.

Abstract

In this paper, we propose to tackle Few-Shot Class-Incremental Learning (FSCIL) from a new perspective, i.e., relation disentanglement, which means enhancing FSCIL by disentangling spurious relations between categories. The challenge of disentangling spurious correlations lies in the poor controllability of FSCIL. On the one hand, an FSCIL model must be trained incrementally, so it is very hard to directly control relationships between categories of different sessions. On the other hand, only a few training samples are available per novel category, which makes spurious relations even harder to alleviate. To overcome this challenge, we propose a new simple-yet-effective method, called ConTrollable Relation-disentangLed Few-Shot Class-Incremental Learning (CTRL-FSCIL). Specifically, during the base session, we anchor base category embeddings in feature space and construct disentanglement proxies to bridge the gap between the learning of category representations in different sessions, thereby making category relations controllable. During incremental learning, the parameters of the backbone network are frozen to relieve the negative impact of data scarcity. Moreover, a disentanglement loss is designed to guide a relation disentanglement controller to disentangle spurious correlations between the embeddings encoded by the backbone. In this way, the spurious correlation issue in FSCIL can be suppressed. Extensive experiments on the CIFAR-100, mini-ImageNet, and CUB-200 datasets demonstrate the effectiveness of our CTRL-FSCIL method.
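
The anchoring idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the anchoring loss, so the cosine-based `anchoring_loss`, the QR-based `make_orthogonal_proxies` helper, and all parameter names below are assumptions chosen only to show what "anchoring base category embeddings" to pre-defined, mutually orthogonal directions could look like.

```python
import numpy as np


def make_orthogonal_proxies(num_classes, dim, seed=0):
    """Hypothetical helper: return `num_classes` orthonormal vectors of size `dim`."""
    if num_classes > dim:
        raise ValueError("cannot fit more orthogonal proxies than dimensions")
    rng = np.random.default_rng(seed)
    # Reduced QR of a random (dim, num_classes) matrix gives orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    return q.T  # shape: (num_classes, dim), one proxy per row


def anchoring_loss(embeddings, labels, proxies):
    """Assumed loss: mean (1 - cosine similarity) between each embedding and its class proxy."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    targets = proxies[labels]  # proxies are already unit-norm
    return float(np.mean(1.0 - np.sum(unit * targets, axis=1)))


if __name__ == "__main__":
    proxies = make_orthogonal_proxies(num_classes=5, dim=16)
    labels = np.array([0, 1, 2, 3, 4, 0])
    # Embeddings that point exactly along their proxies incur (near) zero loss.
    aligned = 3.0 * proxies[labels]
    print(anchoring_loss(aligned, labels, proxies))
```

Because the proxies are fixed before training, the pairwise relations between base categories are determined by construction rather than learned, which is one way to read the abstract's claim that anchoring makes category relations controllable.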

Paper Structure (19 sections, 9 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Typical examples of spurious relation issues in few-shot class-incremental learning, obtained from the last incremental session on the CIFAR-100 dataset. The values in the matrices indicate correlations between the learned global embeddings of categories. Also, "base$\leftrightharpoons$novel" represents spurious relation issues between base and novel categories, while "novel$\leftrightharpoons$novel" indicates spurious relation issues between novel categories.
  • Figure 2: An illustration of our ConTrollable Relation-disentangLed Few-Shot Class-Incremental Learning (CTRL-FSCIL) method, exemplified in an incremental session.
  • Figure 3: The influence of the Orthogonal Proxy Anchoring (OPA) strategy on the model. In the above figures, "w/ OPA" or "w/o OPA" indicates whether or not the OPA strategy is utilized in the first phase. "Base Acc", "Novel Acc", and "Acc" represent the accuracy on base categories, novel categories, and both of them, respectively.
  • Figure 4: The influence of the Disentanglement Proxy Discriminability Boosting (DPDB) strategy on our method. In the above figures, "w/ DPDB" indicates that the full DPDB strategy is utilized in the first phase. "w/o NN" indicates that the discriminability between disentanglement proxies is not ensured (i.e., $\mathcal{L}_{novel-novel}$ is removed from Eq. 6). "w/o NN+BN" indicates that the discriminability of disentanglement proxies with respect to base categories is additionally not considered (i.e., $\mathcal{L}_{novel-novel}$ and $\mathcal{L}_{base-novel}$ are both removed).
  • Figure 5: The influence of directly applying the DPDB strategy to the embeddings of base categories. In "w/ DPDB-Direct", DPDB is applied to the representations of base categories directly, i.e., $\mathcal{L}_{base-novel}$ minimizes the similarity between base class embeddings and disentanglement proxies. In "w/ DPDB", $\mathcal{L}_{base-novel}$ is utilized to suppress the correlations between disentanglement proxies and the pre-defined orthogonal proxies, since base class embeddings are anchored by orthogonal proxies in feature space. In the t-SNE visualization, each colored marker denotes the learned mean embedding of a base category.
  • ...and 3 more figures
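
The relation matrices described in the Figure 1 caption can be approximated with a short sketch. The caption only says the values are correlations between learned global category embeddings; the code below assumes this means cosine similarity between per-class mean embeddings, and the function names `class_mean_embeddings` and `relation_matrix` are hypothetical.

```python
import numpy as np


def class_mean_embeddings(embeddings, labels, num_classes):
    """Average the embeddings of each class (assumes every class has at least one sample)."""
    return np.stack(
        [embeddings[labels == c].mean(axis=0) for c in range(num_classes)]
    )


def relation_matrix(means):
    """Assumed relation measure: pairwise cosine similarity between class means."""
    unit = means / np.linalg.norm(means, axis=1, keepdims=True)
    return unit @ unit.T


if __name__ == "__main__":
    # Toy data: two classes whose embeddings lie on orthogonal axes.
    embeddings = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]])
    labels = np.array([0, 0, 1, 1])
    means = class_mean_embeddings(embeddings, labels, num_classes=2)
    print(relation_matrix(means))
```

Under this reading, large off-diagonal entries between unrelated classes would correspond to the "base$\leftrightharpoons$novel" and "novel$\leftrightharpoons$novel" spurious relations the caption refers to.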