DebGCD: Debiased Learning with Distribution Guidance for Generalized Category Discovery

Yuanpei Liu, Kai Han

TL;DR

DebGCD is a debiased learning framework with distribution guidance for Generalized Category Discovery. It co-trains an auxiliary debiased classifier in the same feature space as the GCD classifier, progressively enhancing the GCD features.

Abstract

In this paper, we tackle the problem of Generalized Category Discovery (GCD). Given a dataset containing both labelled and unlabelled images, the objective is to categorize all images in the unlabelled subset, irrespective of whether they are from known or unknown classes. In GCD, an inherent label bias exists between known and unknown classes due to the lack of ground-truth labels for the latter. State-of-the-art methods in GCD leverage parametric classifiers trained through self-distillation with soft labels, leaving the bias issue unattended. Besides, they treat all unlabelled samples uniformly, neglecting variations in certainty levels and resulting in suboptimal learning. Moreover, the explicit identification of semantic distribution shifts between known and unknown classes, a vital aspect for effective GCD, has been neglected. To address these challenges, we introduce DebGCD, a Debiased learning with distribution guidance framework for GCD. First, DebGCD co-trains an auxiliary debiased classifier in the same feature space as the GCD classifier, progressively enhancing the GCD features. Moreover, we introduce a semantic distribution detector in a separate feature space to implicitly boost the learning efficacy of GCD. Additionally, we employ a curriculum learning strategy based on semantic distribution certainty to steer the debiased learning at an optimized pace. Thorough evaluations on GCD benchmarks demonstrate the consistent state-of-the-art performance of our framework, highlighting its superiority. Project page: https://visual-ai.github.io/debgcd/

Paper Structure

This paper contains 28 sections, 15 equations, 8 figures, 18 tables, 1 algorithm.

Figures (8)

  • Figure 1: (a) The parametric GCD classifier [wen2023parametric] is trained on labelled and unlabelled images using ground-truth hard labels and soft labels, respectively. (b) The auxiliary debiased learning: training another classifier using debiased labels. (c) The process of label debiasing: keep the hard labels unchanged and transform soft labels to one-hot hard labels; samples that do not meet the threshold are removed. (d) The illustration of distribution guidance: if a sample receives a high in-distribution/out-of-distribution score, its weight in GCD training will be increased accordingly.
  • Figure 2: Overview of the DebGCD framework. In the upper branch, raw features are transformed using an MLP and then normalized. These normalized features are used for semantic distribution learning with a one-vs-all classifier. In the lower branch, a GCD classifier is trained on the normalized raw features. The predictions from both branches are combined to train the debiased classifier. Because the representation learning in DebGCD follows prior work, it is not explicitly depicted here.
  • Figure 3: $t$-SNE visualization of 20 classes randomly sampled from the CIFAR-100 [krizhevsky2009learning] dataset.
  • Figure 4: Unlabelled data utilization ratios for 'Old' and 'New' classes during training on FGVC-Aircraft [maji2013fine] (left) and Stanford Cars [krause20133d] (right) datasets.
  • Figure 5: ACC evolution of the GCD classifier and the debiased classifier on both the 'Old' and 'New' classes during training on the Stanford Cars dataset [krause20133d]. The top two figures depict ACC on the unlabelled training set, while the bottom two illustrate ACC on the validation set.
  • ...and 3 more figures
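
The label debiasing step described in the Figure 1(c) caption can be illustrated with a minimal Python sketch. The function name `debias_labels`, the 0.7 threshold, and the list-based data layout are assumptions made for illustration only; they are not taken from the paper, and the authors' implementation may differ.

```python
from typing import List, Optional, Tuple


def debias_labels(
    soft_labels: List[List[float]],
    hard_labels: List[Optional[int]],
    threshold: float = 0.7,  # hypothetical value, not from the paper
) -> List[Tuple[int, int]]:
    """Return (sample_index, class_index) pairs kept for debiased training.

    hard_labels[i] is the ground-truth class of a labelled sample,
    or None when the sample is unlabelled.
    """
    kept: List[Tuple[int, int]] = []
    for i, (probs, hard) in enumerate(zip(soft_labels, hard_labels)):
        if hard is not None:
            # Labelled sample: the hard label is kept unchanged.
            kept.append((i, hard))
            continue
        # Unlabelled sample: convert the soft label to a one-hot class
        # by taking the most probable class.
        best = max(range(len(probs)), key=probs.__getitem__)
        # Samples whose top probability is below the threshold are removed.
        if probs[best] >= threshold:
            kept.append((i, best))
    return kept


soft = [
    [0.10, 0.80, 0.10],  # labelled sample; its soft label is ignored
    [0.40, 0.35, 0.25],  # unlabelled and uncertain: removed
    [0.05, 0.05, 0.90],  # unlabelled and confident: becomes class 2
]
hard = [0, None, None]
result = debias_labels(soft, hard)
print(result)
```

In this toy input, the first sample keeps its ground-truth label, the second is dropped because no class reaches the threshold, and the third receives the one-hot label of its most probable class.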