CiPR: An Efficient Framework with Cross-instance Positive Relations for Generalized Category Discovery

Shaozhe Hao, Kai Han, Kwan-Yee K. Wong

TL;DR

CiPR tackles generalized category discovery in open-world settings where the number of unseen classes is unknown, by introducing cross-instance positive relations into contrastive learning. The framework hinges on SNC, a semi-supervised hierarchical clustering method that generates reliable pseudo-labels linking labelled and unlabelled data and enables efficient one-shot class-number estimation via a joint reference score. With SNC-driven pseudo-labels, CiPR jointly optimizes two contrastive losses to learn balanced representations across seen and unseen classes, attaining state-of-the-art results on both generic and fine-grained datasets. The approach improves both accuracy and efficiency: it estimates the unknown class number with a single SNC run and without a predefined candidate set, which makes it practical for open-world discovery in real applications.

Abstract

We tackle the issue of generalized category discovery (GCD). GCD considers the open-world problem of automatically clustering a partially labelled dataset, in which the unlabelled data may contain instances from both novel categories and labelled classes. In this paper, we address the GCD problem with an unknown category number for the unlabelled data. We propose a framework, named CiPR, to bootstrap the representation by exploiting Cross-instance Positive Relations in the partially labelled data for contrastive learning, which have been neglected in existing methods. To obtain reliable cross-instance relations to facilitate representation learning, we introduce a semi-supervised hierarchical clustering algorithm, named selective neighbor clustering (SNC), which can produce a clustering hierarchy directly from the connected components of a graph constructed from selective neighbors. We further present a method to estimate the unknown class number using SNC with a joint reference score that considers clustering indexes of both labelled and unlabelled data, and extend SNC to allow label assignment for the unlabelled instances with a given class number. We thoroughly evaluate our framework on public generic image recognition datasets and challenging fine-grained datasets, and establish a new state-of-the-art. Code: https://github.com/haoosz/CiPR
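To make the idea of reading clusters off the connected components of a neighbor graph concrete, here is a minimal sketch. It is not the authors' SNC: it links every point to its plain nearest neighbor and ignores the selective neighbor rules and the labelled/unlabelled distinction. The function name `first_neighbor_partition` and the use of squared Euclidean distance are assumptions made for illustration.

```python
import numpy as np


def first_neighbor_partition(features):
    """Link each point to its nearest neighbor; return connected-component labels.

    Simplified illustration only: SNC additionally constrains which neighbors
    may be linked, which this sketch does not attempt to reproduce.
    """
    x = np.asarray(features, dtype=float)
    n = len(x)

    # Pairwise squared Euclidean distances; mask the diagonal so that
    # a point can never select itself as its nearest neighbor.
    d = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)
    nearest = d.argmin(axis=1)

    # Union-find over the edges (i, nearest[i]).
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in enumerate(nearest):
        ri, rj = find(i), find(int(j))
        if ri != rj:
            parent[rj] = ri

    # Relabel component roots as consecutive ids in order of first appearance.
    ids = {}
    return np.array([ids.setdefault(find(i), len(ids)) for i in range(n)])


points = [[0, 0], [0, 1], [10, 10], [10, 11]]
labels = first_neighbor_partition(points)
```

Because every point is linked to at least one other point, each component holds at least two points, so one pass at least halves the number of groups. Repeating the same step on the per-cluster mean features would yield progressively coarser partitions, which is one way a hierarchy can emerge from such a graph.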

Paper Structure (38 sections, 10 equations, 10 figures, 17 tables, 2 algorithms)

This paper contains 38 sections, 10 equations, 10 figures, 17 tables, 2 algorithms.

Figures (10)

  • Figure 1: Generalized category discovery: given an image dataset with seen classes from a labelled subset, categorize the unlabelled images, which may come from seen and unseen classes.
  • Figure 2: Overview of our CiPR framework. We first initialize ViT with pretrained DINO (Caron et al., 2021) to obtain a good representation space. We then finetune ViT by conducting joint contrastive learning with both true and pseudo positive relations in a supervised manner. True positive relations come from labelled data while pseudo positive relations of all data are generated by our proposed SNC algorithm. Specifically, SNC generates a hierarchical clustering structure. Pseudo positive relations are granted to all instances in the same cluster at one level of partition, further exploited in joint contrastive learning. With representations well learned, we estimate class number and assign labels to all unlabelled data using SNC with a one-to-one merging strategy.
  • Figure 3: Selective neighbor rules. Left: the labelled instances constitute a chain (rule 1) with the length of $\lambda=4$ (rule 2) and the nearest neighbors of unlabelled instances are labelled ones (rule 3). Right: the nearest neighbors of unlabelled instances are unlabelled ones (rule 3).
  • Figure 4: Visualization on CIFAR-10. We conduct t-SNE projection on features extracted by raw DINO, the GCD method of Vaze et al. (2022), and our CiPR. We randomly sample 1000 images of each class from CIFAR-10 to visualize. Unseen categories are marked with *.
  • Figure 5: Purity curve. We plot the purity of all compared clustering methods throughout training.
  • ...and 5 more figures
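
The Figure 2 caption describes contrastive learning in which instances sharing a true label, or sharing a cluster at one level of the SNC partition, are treated as positives. The sketch below shows one common way to turn such group ids into a loss, in the style of supervised contrastive learning. It is an illustration under assumptions, not the paper's exact objective: the function name `positive_relation_contrastive_loss`, the temperature of 0.1, and the averaging scheme are all hypothetical choices, and the input is assumed to hold at least two embeddings.

```python
import numpy as np


def positive_relation_contrastive_loss(embeddings, group_ids, temperature=0.1):
    """Contrastive loss where instances with the same group id are positives.

    group_ids may be true labels or pseudo-labels (e.g. cluster ids).
    Assumes at least two embeddings; anchors without any positive are skipped.
    """
    z = np.asarray(embeddings, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    g = np.asarray(group_ids)
    n = len(z)

    not_self = ~np.eye(n, dtype=bool)
    # Exclude each anchor from its own softmax denominator.
    logits = np.where(not_self, z @ z.T / temperature, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    positives = (g[:, None] == g[None, :]) & not_self
    counts = positives.sum(axis=1)
    has_pos = counts > 0
    if not has_pos.any():
        return 0.0

    # Mean negative log-probability of the positives, per anchor.
    summed = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -summed[has_pos] / counts[has_pos]
    return float(per_anchor.mean())


z = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned = positive_relation_contrastive_loss(z, [0, 0, 1, 1])
mismatched = positive_relation_contrastive_loss(z, [0, 1, 0, 1])
```

When the group ids agree with the geometry of the embeddings, as in `aligned`, the loss is close to zero; when they contradict it, as in `mismatched`, the loss is large. A joint objective could apply the same function once with true labels on the labelled subset and once with pseudo-labels on all data.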