When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach
Vaibhav Rathore, Shubhranil B, Saikat Dutta, Sarthak Mehrotra, Zsolt Kira, Biplab Banerjee
TL;DR
DG-GCD addresses the clustering of known and novel classes under domain shift when no target-domain data is available during training. The paper introduces DG2CD-Net, which combines episodic training on source and synthetic domains with an adaptive task-vector aggregation to build a domain-independent embedding space suitable for generalized category discovery. A margin-based open-set domain-adaptation objective, coupled with supervised and unsupervised contrastive losses, helps separate known from novel classes, while a validation-driven weighting scheme gives the most generalizable episode models the largest influence on the global model update. Synthetic domains generated by InstructPix2Pix, using prompts obtained from ChatGPT, diversify training and improve generalization without leaking target-domain information. Empirical results on PACS, Office-Home, and DomainNet show gains over GCD baselines adapted to the DG-GCD setting, and ablation analyses highlight the importance of synthetic data, episodic updates, and the adaptive task-vector mechanism for robust domain generalization and fine-grained novel category discovery.
Abstract
Generalized Category Discovery (GCD) clusters base and novel classes in a target domain using supervision from a source domain that contains only base classes. Current methods often falter under distribution shift and typically require access to target data during training, which can be impractical. To address this issue, we introduce the novel paradigm of Domain Generalization in GCD (DG-GCD), where only source data is available for training, while the target domain, with a distinct data distribution, remains unseen until inference. Our solution, DG2CD-Net, aims to construct a domain-independent, discriminative embedding space for GCD. The core innovation is an episodic training strategy that enhances cross-domain generalization by adapting a base model on tasks derived from source domains and from synthetic domains generated by a foundation model. Each episode focuses on a cross-domain GCD task; the task setup is varied across episodes, and open-set domain adaptation with a novel margin loss is combined with representation learning to optimize the feature space progressively. To capture the effects of fine-tuning on the base model, we extend task arithmetic by adaptively weighting the local task vectors of the fine-tuned models according to their GCD performance on a validation distribution. This episodic update mechanism boosts the adaptability of the base model to unseen targets. Experiments on three datasets (PACS, Office-Home, and DomainNet) confirm that DG2CD-Net outperforms existing GCD methods customized for DG-GCD.
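The adaptive task-arithmetic update described in the abstract can be illustrated with a minimal sketch. It assumes that each episode yields a fine-tuned copy of the base parameters, that a task vector is the difference between fine-tuned and base parameters, and that validation scores are turned into weights with a softmax. The softmax, the temperature parameter, and the function names (task_vector, aggregate) are illustrative assumptions, not details confirmed by the abstract.

```python
import math

def task_vector(base, finetuned):
    """Task vector: fine-tuned parameters minus base parameters, per tensor."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}

def softmax(scores, temperature=1.0):
    """Numerically stable softmax (assumed weighting rule, not from the paper)."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate(base, finetuned_models, val_scores, temperature=1.0):
    """Add validation-weighted task vectors to the base parameters."""
    weights = softmax(val_scores, temperature)
    merged = {k: list(v) for k, v in base.items()}
    for w, model in zip(weights, finetuned_models):
        tau = task_vector(base, model)
        for k in merged:
            merged[k] = [m + w * t for m, t in zip(merged[k], tau[k])]
    return merged, weights

# Toy example: one parameter tensor, two episode models, equal validation scores.
base = {"layer": [0.0, 1.0]}
episode_models = [{"layer": [1.0, 1.0]}, {"layer": [0.0, 3.0]}]
merged, weights = aggregate(base, episode_models, val_scores=[0.5, 0.5])

# A model with a higher validation score receives a larger weight.
_, skewed = aggregate(base, episode_models, val_scores=[0.9, 0.1])
```

With equal validation scores the update reduces to a plain average of the task vectors; unequal scores shift the update toward the episode models that generalize better on the validation distribution.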
