AdaptGCD: Multi-Expert Adapter Tuning for Generalized Category Discovery

Yuxun Qu, Yongqiang Tang, Chenyang Zhang, Wensheng Zhang

TL;DR

This work tackles Generalized Category Discovery (GCD), where the unlabeled data include categories that are absent from the labeled set. It introduces AdaptGCD, an adapter-tuning framework for ViT backbones that preserves pretrained knowledge while enabling task-specific adaptation. The framework is built from a Multi-Expert Adapter (MEA) and a route assignment constraint that separates old-class from new-class data. The backbone is kept frozen; the trainable parts are bottleneck adapters and a gate over $T$ experts, guided by a balanced routing loss $\mathcal{L}_{ba}$ and a category-aware routing loss $\mathcal{L}_{cba}$. These are combined with the SimGCD losses for representation learning and classification. Results on 7 datasets (generic, fine-grained, and long-tailed) show consistent improvements over strong baselines, and the method also works as a plug-in to SPTNet, supporting adapter tuning as a robust approach to GCD. The findings point to the practical value of preserving pretrained representations while adapting to open-world categorization; future directions include integration with RepAdapter or GLoRA.
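
The gated multi-expert adapter described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `MultiExpertAdapter`, the ReLU bottleneck, the softmax gate, the residual connection, and the zero initialisation of the up-projections are all assumptions made for illustration, and plain Python lists stand in for tensors. The frozen ViT backbone is not modelled; `token` stands for one feature vector that a backbone block would produce.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.02):
    """Small Gaussian-initialised matrix as a list of rows."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class MultiExpertAdapter:
    """T bottleneck experts mixed by a softmax gate (hypothetical sketch)."""

    def __init__(self, dim, bottleneck, num_experts, seed=0):
        rng = random.Random(seed)
        self.gate = random_matrix(num_experts, dim, rng)
        self.down = [random_matrix(bottleneck, dim, rng) for _ in range(num_experts)]
        # Assumed init: zero up-projections make the adapter a no-op at step 0,
        # so the pretrained features pass through unchanged before training.
        self.up = [[[0.0] * bottleneck for _ in range(dim)] for _ in range(num_experts)]

    def route(self, token):
        """Routing distribution over the experts for one token."""
        return softmax(matvec(self.gate, token))

    def forward(self, token):
        """Return (adapted token, routing weights)."""
        weights = self.route(token)
        out = list(token)  # residual path carrying the backbone feature
        for w, down, up in zip(weights, self.down, self.up):
            hidden = [max(0.0, h) for h in matvec(down, token)]  # ReLU bottleneck
            delta = matvec(up, hidden)
            out = [o + w * d for o, d in zip(out, delta)]
        return out, weights
```

Calling `MultiExpertAdapter(dim=8, bottleneck=2, num_experts=4).forward(token)` returns the adapted feature together with the routing weights; the weights are what a routing loss such as $\mathcal{L}_{ba}$ or $\mathcal{L}_{cba}$ would act on.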

Abstract

Unlike the traditional semi-supervised learning paradigm, which is constrained by the closed-world assumption, Generalized Category Discovery (GCD) presumes that the unlabeled dataset contains new categories that do not appear in the labeled set, and aims not only to classify old categories but also to discover new categories in the unlabeled data. Existing studies on GCD typically focus on transferring general knowledge from a self-supervised pretrained model to the target GCD task via fine-tuning strategies such as partial tuning and prompt learning. Nevertheless, these fine-tuning methods fail to strike a sound balance between the generalization capacity of the pretrained backbone and adaptability to the GCD task. To fill this gap, we propose a novel adapter-tuning-based method named AdaptGCD, which is the first work to introduce adapter tuning into the GCD task and provides key insights intended to inform future research. Furthermore, considering the discrepancy in supervision information between the old and new classes, we devise a multi-expert adapter structure equipped with a route assignment constraint, such that data from old and new classes are separated into different expert groups. Extensive experiments are conducted on 7 widely used datasets. The remarkable improvements in performance highlight the effectiveness of our proposals.
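
One way to picture a route assignment constraint is sketched below. The exact forms of $\mathcal{L}_{ba}$ and $\mathcal{L}_{cba}$ are not given in this section, so the penalties here are stand-ins chosen for illustration: a KL divergence between the batch-averaged routing distribution and a uniform target, and a per-group variant that steers old-class and new-class samples toward disjoint expert groups. The function names, the `is_old` flags (which in practice would have to come from labels or pseudo-labels), and the `old_experts` index list are all hypothetical.

```python
import math


def mean_route(routes):
    """Average per-sample routing distributions over a batch."""
    num_experts = len(routes[0])
    return [sum(r[t] for r in routes) / len(routes) for t in range(num_experts)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def balanced_assignment_penalty(routes):
    """KL from the batch-mean route to uniform; near 0 when experts are used evenly."""
    avg = mean_route(routes)
    uniform = [1.0 / len(avg)] * len(avg)
    return kl_divergence(avg, uniform)


def category_aware_penalty(routes, is_old, old_experts):
    """Penalise old/new samples whose mean route leaves their assumed expert group."""
    num_experts = len(routes[0])
    old_set = set(old_experts)
    groups = {
        True: [t for t in range(num_experts) if t in old_set],
        False: [t for t in range(num_experts) if t not in old_set],
    }
    penalty = 0.0
    for flag, experts in groups.items():
        subset = [r for r, old in zip(routes, is_old) if old == flag]
        if not subset or not experts:
            continue  # nothing to constrain for this group in this batch
        # Target: uniform over the group's experts, zero elsewhere.
        target = [
            1.0 / len(experts) if t in experts else 0.0 for t in range(num_experts)
        ]
        penalty += kl_divergence(target, mean_route(subset))
    return penalty
```

Under these assumptions, a batch that spreads its routing mass evenly incurs almost no balanced penalty, while a batch that collapses onto a single expert is penalised; the category-aware variant is small only when old-class and new-class samples land in their own expert groups.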

Paper Structure

This paper contains 20 sections, 14 equations, 7 figures, 9 tables.

Figures (7)

  • Figure 1: The description of the generalized category discovery (GCD) task. In this context, the unlabeled dataset contains new classes that are not present in the labeled set.
  • Figure 2: The framework of our proposed AdaptGCD. It contains two critical modules: the multi-expert adapter (MEA) structure and the route assignment constraint. The MEA introduces multiple adapter experts and the route assignment constraint, including the balanced assignment loss $\mathcal{L}_{ba}$ and category-aware balanced assignment loss $\mathcal{L}_{cba}$, guides the allocation of these experts.
  • Figure 3: Sensitivity analysis on three critical hyperparameters. (a) bottleneck dimension $\hat{d}$, (b) the number of adapted blocks $P$, (c) expert count $T$.
  • Figure 4: Attention visualization for the 12 heads in the last blocks of the backbone on CUB-200. "Pretrained (DINO)" shows results from the pretrained backbone without any additional training, while "SimGCD" and "AdaptGCD" show results from models trained with the corresponding methods.
  • Figure 5: The confusion maps for the old and new classes for the models trained with and without the route assignment losses $\mathcal{L}_{ra}$ on the semantic shift benchmark.
  • ...and 2 more figures