Fine-Grained Representation Learning via Multi-Level Contrastive Learning without Class Priors
Houwang Jiang, Zhuxian Liu, Guodong Liu, Xiaolong Liu, Shihua Zhan
TL;DR
Many unsupervised representation learning methods assume the number of classes is known in advance. This paper proposes Contrastive Disentangling (CD), a class-prior-free framework that combines instance-level and feature-level contrastive learning with a normalized entropy loss. CD adds feature prediction heads to capture fine-grained attributes and treats each head as an independent learning objective, which promotes diversity and prevents feature collapse. In experiments on CIFAR-10, CIFAR-100-20, STL-10, and ImageNet-10, CD outperforms traditional unsupervised methods and remains competitive with class-informed approaches; ablations confirm the contributions of the feature-level heads, the entropy loss, and dual-view augmentation. The resulting representations are semantically richer and more interpretable, and they support flexible feature granularity and improved clustering without requiring class priors.
Abstract
Recent advances in unsupervised representation learning often rely on knowing the number of classes to improve feature extraction and clustering. However, this assumption raises an important question: is knowing the number of classes always necessary, and do class labels fully capture the fine-grained features within the data? In this paper, we propose Contrastive Disentangling (CD), a framework designed to learn representations without relying on class priors. CD leverages a multi-level contrastive learning strategy, integrating instance-level and feature-level contrastive losses with a normalized entropy loss to capture semantically rich and fine-grained representations. Specifically, (1) the instance-level contrastive loss separates feature representations across samples; (2) the feature-level contrastive loss promotes independence among feature heads; and (3) the normalized entropy loss ensures feature diversity and prevents feature collapse. Extensive experiments on CIFAR-10, CIFAR-100, STL-10, and ImageNet-10 demonstrate that CD outperforms existing methods in scenarios where class information is unavailable or ambiguous. The code is available at https://github.com/Hoper-J/Contrastive-Disentangling.
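
The three loss terms described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a standard NT-Xent contrastive loss, applied to rows (samples) for the instance level and to columns (features) for the feature level, and a binary entropy of each feature's batch-average activation, scaled to [0, 1], for the normalized entropy term. The function names, the temperature of 0.5, and the unweighted combination in `cd_loss` are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (small epsilon avoids 0/0)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent loss: view_a[i] and view_b[i] form a positive pair,
    every other vector in the two views acts as a negative."""
    vectors = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i, anchor in enumerate(vectors):
        positive = vectors[(i + n) % (2 * n)]
        logits = [cosine(anchor, other) / temperature
                  for j, other in enumerate(vectors) if j != i]
        # Numerically stable log-sum-exp over all non-self similarities.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - cosine(anchor, positive) / temperature
    return total / (2 * n)


def transpose(rows):
    """Turn a batch of per-sample rows into per-feature columns."""
    return [list(column) for column in zip(*rows)]


def normalized_entropy(activations):
    """Mean binary entropy of each feature's batch-average activation.
    Activations are assumed to lie in [0, 1]; the result lies in [0, 1]."""
    eps = 1e-12
    columns = transpose(activations)
    total = 0.0
    for column in columns:
        p = min(max(sum(column) / len(column), eps), 1.0 - eps)
        total += -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p)) / math.log(2)
    return total / len(columns)


def cd_loss(z_a, z_b, y_a, y_b, temperature=0.5):
    """Hypothetical unweighted combination of the three terms.
    z_*: instance embeddings of two augmented views (one row per sample).
    y_*: feature-head outputs in [0, 1] for the same two views."""
    instance_term = nt_xent(z_a, z_b, temperature)
    feature_term = nt_xent(transpose(y_a), transpose(y_b), temperature)
    entropy_term = 0.5 * (normalized_entropy(y_a) + normalized_entropy(y_b))
    # Assumption: entropy is maximized, so it enters the loss with a minus sign.
    return instance_term + feature_term - entropy_term
```

In this sketch, a feature that is active for every sample (or for none) contributes zero entropy, so subtracting the entropy term penalizes the collapsed case in which all heads output a constant. None of the terms takes a class count as input, which is the property the abstract emphasizes.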
