GCC: Generative Calibration Clustering
Haifeng Xia, Hai Huang, Zhengming Ding
TL;DR
GCC tackles unsupervised image clustering by integrating conditional diffusion-based generative augmentation with a discriminative, self-supervised clustering framework. It uses a two-branch architecture to align real and generated samples, both through discriminative feature matching in a reproducing kernel Hilbert space (RKHS) and through a reliable self-supervised metric, which mitigates distribution shift and noisy supervision. The approach consists of Stage I (contrastive representation learning, pseudo-label clustering, and conditional diffusion generation) and Stage II (calibrated clustering with the losses $L_d$, $L_{cwm}$, and $L_{ml}$), and it iteratively refines both the clustering and the generator. Across three benchmarks, GCC achieves state-of-the-art results and demonstrates robustness to imbalanced data, highlighting the practical value of generative-calibrated clustering for unsupervised representation learning.
Abstract
Deep clustering, an important branch of unsupervised representation learning, focuses on embedding semantically similar samples close together in a shared feature space. This core demand has inspired the exploration of contrastive learning and subspace clustering. However, these solutions rely on the basic assumption that sufficient and category-balanced samples are available for learning valid high-level representations. This assumption is too strict to be satisfied in real-world applications. To overcome this challenge, a natural strategy is to use generative models to augment the data with a considerable number of additional instances. How to use these novel samples to effectively improve clustering performance, however, remains difficult and under-explored. In this paper, we propose a novel Generative Calibration Clustering (GCC) method that carefully incorporates feature learning and augmentation into the clustering procedure. First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship between real and generated samples. Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments, which in turn boost the conditional diffusion generation. Extensive experimental results on three benchmarks validate the effectiveness and advantage of our proposed method over state-of-the-art methods.
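
To make the idea of aligning real and generated samples in a kernel feature space concrete, the following is a minimal sketch of one standard way to measure such alignment: a class-wise maximum mean discrepancy (MMD) with an RBF kernel. This is an illustration under stated assumptions, not the exact loss used in GCC. The choice of MMD, the RBF kernel, the bandwidth parameter `gamma`, and the function names `rbf_kernel`, `mmd2`, and `classwise_mmd2` are all hypothetical and do not come from the text above. Real samples are assumed to carry pseudo-labels from clustering, and generated samples are assumed to carry the class condition they were generated with.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq_dist = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared MMD between sample sets x and y."""
    return (
        rbf_kernel(x, x, gamma).mean()
        + rbf_kernel(y, y, gamma).mean()
        - 2.0 * rbf_kernel(x, y, gamma).mean()
    )


def classwise_mmd2(real, real_labels, gen, gen_labels, gamma=1.0):
    """Average squared MMD over the classes present in both sets.

    real_labels: pseudo-labels of real features (assumed to come from clustering).
    gen_labels:  class conditions of generated features (assumed).
    """
    classes = np.intersect1d(real_labels, gen_labels)
    vals = [
        mmd2(real[real_labels == c], gen[gen_labels == c], gamma)
        for c in classes
    ]
    return float(np.mean(vals)) if vals else 0.0
```

In a training loop, a quantity of this kind would be minimized with respect to the feature extractor so that real and generated features of the same (pseudo-)class move closer together; how GCC weights or combines such a term with its other losses is not specified in the abstract.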
