Revisiting Mutual Information Maximization for Generalized Category Discovery
Zhaorui Tan, Chengrui Zhang, Xi Yang, Jie Sun, Kaizhu Huang
TL;DR
This work tackles generalized category discovery (GCD), where unlabeled data contain both known and unknown classes. It introduces Regularized Parametric InfoMax (RPIM), an InfoMax-based framework that enforces an independence constraint between known and unknown predictions while assuming a uniform class prior on unconfident outputs, and leverages a semantic-bias transformation to refine latent features without costly fine-tuning. RPIM uses Sinkhorn to generate soft pseudo-labels, selects confident pseudo-labels via a threshold, and adds a regularization term $L_R$ to promote reliable pseudo-labels and strengthen class separation; it also includes a regularization $L_S$ to align pseudo-labels with labeled anchors. Theoretical analysis and experiments on six datasets show RPIM achieves state-of-the-art performance, notably improving unknown-class accuracy by an average of about $3.5\%$ and delivering substantial gains on several benchmark datasets, while maintaining efficiency through latent-feature refinement rather than full fine-tuning. This approach enables more scalable open-world recognition by robustly discovering unknown categories with improved separation from known classes.
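The pseudo-labeling step described above (Sinkhorn soft labels followed by confidence thresholding) can be sketched roughly as follows. This is a minimal NumPy illustration of generic Sinkhorn-Knopp balancing, not the authors' implementation: the function names, the temperature `epsilon`, the iteration count `n_iters`, and the `threshold` value are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_pseudo_labels(logits, n_iters=3, epsilon=0.05):
    """Return balanced soft pseudo-labels of shape (N, K).

    Hypothetical helper: alternately normalizes over classes and samples so
    that class usage is pushed toward a uniform prior.
    """
    # Subtract the global max for numerical stability before exponentiating.
    q = np.exp((logits - logits.max()) / epsilon).T  # shape (K, N)
    q /= q.sum()
    k, n = q.shape
    for _ in range(n_iters):
        q /= q.sum(axis=1, keepdims=True)  # each class row sums to 1
        q /= k                             # ... then to 1/K (uniform class mass)
        q /= q.sum(axis=0, keepdims=True)  # each sample column sums to 1
        q /= n                             # ... then to 1/N
    return (q * n).T  # rescale so every sample's label vector sums to 1

def select_confident(soft_labels, threshold=0.9):
    """Return hard labels and a boolean mask of samples above the threshold."""
    confidence = soft_labels.max(axis=1)
    return soft_labels.argmax(axis=1), confidence >= threshold

# Usage with random logits (assumed data: 8 samples, 3 classes).
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 3))
soft = sinkhorn_pseudo_labels(logits)
hard, mask = select_confident(soft)
```

Only the samples where `mask` is true would be used as pseudo-label supervision; the regularizers $L_R$ and $L_S$ mentioned above are not modeled in this sketch.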
Abstract
Generalized category discovery (GCD) poses a challenge in a realistic scenario: the model must generalize well enough to recognize unlabeled samples from both known and unknown categories. This paper revisits GCD through the lens of information maximization (InfoMax) with a probabilistic parametric classifier. Our findings reveal that ensuring independence between known and unknown classes, while concurrently assuming a uniform probability distribution across all classes, yields an enlarged margin between known and unknown classes that improves the model's performance. To achieve this independence, we propose a novel InfoMax-based method, Regularized Parametric InfoMax (RPIM), which uses pseudo labels to supervise unlabeled samples during InfoMax and introduces a regularization to ensure the quality of the pseudo labels. Additionally, we introduce a novel semantic-bias transformation that refines the features from the pre-trained model instead of directly fine-tuning it, reducing computational costs. Extensive experiments on six benchmark datasets validate the effectiveness of our method. RPIM significantly improves performance on unknown classes, surpassing the state-of-the-art method by an average margin of 3.5%.
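For readers unfamiliar with InfoMax for a probabilistic classifier, the quantity being maximized is commonly estimated as the entropy of the batch-averaged prediction minus the average per-sample entropy, $I(X;Y) \approx H(\bar{p}) - \frac{1}{N}\sum_i H(p_i)$. The sketch below illustrates that standard estimator only; it is an assumption that RPIM builds on this form, the function names are hypothetical, and the paper's regularization terms are not included.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats; eps guards against log(0)."""
    return -(p * np.log(p + eps)).sum(axis=axis)

def infomax_estimate(probs):
    """Empirical mutual information between inputs and predicted labels.

    probs: array of shape (N, K), each row a predicted class distribution.
    """
    marginal = probs.mean(axis=0)           # batch-averaged class distribution
    marginal_entropy = entropy(marginal)    # high when classes are used evenly
    conditional_entropy = entropy(probs).mean()  # low when predictions are confident
    return marginal_entropy - conditional_entropy

# Confident, balanced predictions reach the maximum log(K);
# uniform predictions carry no information about the input.
confident = np.eye(3)
uniform = np.full((3, 3), 1.0 / 3.0)
mi_confident = infomax_estimate(confident)
mi_uniform = infomax_estimate(uniform)
```

The first term rewards a uniform class distribution across the batch, which matches the uniform-prior assumption in the abstract, while the second term rewards confident per-sample predictions.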
