ProtoGCD: Unified and Unbiased Prototype Learning for Generalized Category Discovery

Shijie Ma, Fei Zhu, Xu-Yao Zhang, Cheng-Lin Liu

TL;DR

ProtoGCD tackles generalized category discovery by unifying old and new classes under a shared prototypical classifier and end-to-end learning. It introduces dual-level adaptive pseudo-labeling (DAPL), entropy-based regularization, and a prototype separation term to learn unbiased, discriminative representations while avoiding confirmation bias. A practical Prototype Score criterion estimates the number of novel classes, and the framework extends to unseen outlier detection for open-world applicability. Across generic and fine-grained datasets, ProtoGCD achieves state-of-the-art performance with balanced old/new accuracy and strong OOD detection capabilities, underscoring the value of unified prototype learning for open-world clustering and classification.

Abstract

Generalized category discovery (GCD) is a pragmatic but underexplored problem, which requires models to automatically cluster and discover novel categories by leveraging the labeled samples from old classes. The challenge is that unlabeled data contain both old and new classes. Early works leveraging pseudo-labeling with parametric classifiers handle old and new classes separately, which brings about imbalanced accuracy between them. Recent methods employing contrastive learning neglect potential positives and are decoupled from the clustering objective, leading to biased representations and sub-optimal results. To address these issues, we introduce a unified and unbiased prototype learning framework, namely ProtoGCD, wherein old and new classes are modeled with joint prototypes and unified learning objectives, enabling unified modeling between old and new classes. Specifically, we propose a dual-level adaptive pseudo-labeling mechanism to mitigate confirmation bias, together with two regularization terms to collectively help learn more suitable representations for GCD. Moreover, for practical considerations, we devise a criterion to estimate the number of new classes. Furthermore, we extend ProtoGCD to detect unseen outliers, achieving task-level unification. Comprehensive experiments show that ProtoGCD achieves state-of-the-art performance on both generic and fine-grained datasets. The code is available at https://github.com/mashijie1028/ProtoGCD.

Paper Structure

This paper contains 52 sections, 5 theorems, 28 equations, 16 figures, 19 tables, 1 algorithm.

Key Result

Proposition 1

Under the cluster assumption, entropy minimization on unlabeled data helps ensure that classes are well-separated.
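
To make the quantity in Proposition 1 concrete, the following minimal sketch computes the Shannon entropy of softmax predictions and averages it over a batch, which is the kind of term an entropy-minimization objective would drive down. It is an illustration only and not the authors' implementation: the function names, the temperature parameter, and the toy logits are all assumptions introduced here.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability vector (numerically stable)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(batch_logits, temperature=1.0):
    """Average per-sample prediction entropy over a batch of logits.

    Minimizing this value pushes each prediction toward a single class,
    i.e. away from the decision boundaries between classes.
    """
    values = [entropy(softmax(row, temperature)) for row in batch_logits]
    return sum(values) / len(values)


# Hypothetical toy logits: one confident prediction, one fully ambiguous.
confident = [8.0, 0.0, 0.0]
ambiguous = [1.0, 1.0, 1.0]
```

A prediction that is uniform over three classes has entropy log(3), the maximum for three classes, while the confident prediction has entropy close to zero, so a sample near a class boundary contributes more to the averaged term than one far from it.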

Figures (16)

  • Figure 1: Generalized category discovery. Given a dataset with labeled data from old classes and unlabeled data from both old and novel categories, the objective is to classify old classes and cluster new categories in the unlabeled data.
  • Figure 2: The unified and unbiased characteristics of ProtoGCD, which contribute to addressing the issues of prior methods.
  • Figure 3: Unified prototype learning framework. (a) Previous GCD methods [Han2020Automatically, Fini_2021_ICCV, cao2022openworld] with parametric classifiers employ distinct classification heads or training objectives for old and new classes, while (b) ProtoGCD models old and new classes in a shared feature space with a unified set of prototypes (i.e., classifier) and adopts unified learning objectives across old and new classes. (c) During inference, ProtoGCD can classify both the old and the newly discovered classes. Moreover, it can be extended to reject unseen outliers, which makes ProtoGCD a general-purpose open-world classifier.
  • Figure 4: The proposed method ProtoGCD. Left: Overview of ProtoGCD. The blue, purple and orange backgrounds indicate the projection, feature and probability space, respectively. The yellow font represents learning objectives. Right: Dual-Level Adaptive Pseudo-Labeling (DAPL). Hard pseudo-labels are assigned to the top $r\%$ of samples by confidence and soft ones to the rest, and the ratio $r\%$ adaptively ramps up (blue font). ProtoGCD can be trained end-to-end.
  • Figure 5: Results on different scores for class number estimation on CIFAR10 (a) and ImageNet-100 (b). The ground-truth class numbers $\widetilde{K}$ are 10 and 100, respectively.
  • ...and 11 more figures
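
The dual-level pseudo-labeling idea described in the Figure 4 caption (hard labels for the most confident fraction of samples, soft labels for the rest, with the fraction growing during training) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: confidence is taken to be the maximum predicted probability (the paper defines its own Prototype Confidence in Definition 1, which is not reproduced on this page), the ramp-up is assumed to be linear, and the names `ramp_ratio`, `dual_level_pseudo_labels`, `r_start` and `r_end` are hypothetical.

```python
def ramp_ratio(epoch, total_epochs, r_start=0.0, r_end=0.5):
    """Fraction of samples that receive hard labels at a given epoch.

    Assumption: a linear ramp from r_start to r_end; the actual schedule
    used by ProtoGCD may differ.
    """
    if total_epochs <= 1:
        return r_end
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return r_start + progress * (r_end - r_start)


def dual_level_pseudo_labels(probs, ratio):
    """Build training targets from predicted class probabilities.

    probs: list of per-sample probability vectors.
    ratio: fraction in [0, 1] of samples to treat as confident.
    The floor(ratio * n) most confident samples get one-hot (hard) targets;
    every other sample keeps its own distribution as a soft target.
    """
    n = len(probs)
    n_hard = int(ratio * n)
    # Assumption: confidence of a sample = its largest class probability.
    by_confidence = sorted(range(n), key=lambda i: max(probs[i]), reverse=True)
    hard_indices = set(by_confidence[:n_hard])

    targets = []
    for i, p in enumerate(probs):
        if i in hard_indices:
            top = max(range(len(p)), key=lambda j: p[j])
            targets.append([1.0 if j == top else 0.0 for j in range(len(p))])
        else:
            targets.append(list(p))
    return targets


# Hypothetical predictions for four unlabeled samples over three classes.
example_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
    [0.34, 0.33, 0.33],
]
example_targets = dual_level_pseudo_labels(example_probs, ratio=0.5)
```

With a ratio of 0.5, the two most confident samples (the first and third) receive one-hot targets, while the two uncertain ones keep their soft distributions. Keeping uncertain samples soft is what limits the reinforcement of early mistakes, which is the confirmation-bias concern the abstract raises.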

Theorems & Definitions (7)

  • Definition 1: Prototype Confidence of Each Sample
  • Proposition 1: Entropy Minimization [grandvalet2004semi] in SSL
  • Proposition 2: Pseudo-labeling in SSL
  • Theorem 1: Performance Gap of Pseudo-labeling Methods [xie2023classdistributionaware]
  • Theorem 2
  • Theorem 2
  • proof