Camera-aware Label Refinement for Unsupervised Person Re-identification
Pengna Li, Kangyi Wu, Wenli Huang, Sanping Zhou, Jinjun Wang
TL;DR
This work tackles unsupervised person re-identification under cross-camera distribution shifts and pseudo-label noise. It introduces Camera-Aware Label Refinement (CALR), combining intra-camera clustering to obtain reliable local pseudo labels, a pivot-based, self-paced inter-camera label refinement, and a camera-domain alignment module via a gradient reversal layer to reduce camera-induced feature distribution gaps. The method leverages two-stage training with cluster memories and a refined inter-camera contrastive objective, yielding substantial gains over both purely unsupervised and UDA baselines across Market-1501, DukeMTMC-ReID, MSMT17, VeRi-776, and a self-collected dataset. The results demonstrate CALR’s effectiveness in producing accurate, camera-invariant representations that improve cross-camera Re-ID performance in realistic settings.
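As a rough illustration of the gradient reversal layer mentioned above, the following framework-free sketch shows the general mechanism: features pass through unchanged in the forward pass, and the gradient is negated and scaled in the backward pass, so that a camera classifier placed after the layer pushes the feature extractor toward camera-invariant features. The class name `GradientReversal` and the `scale` parameter are assumptions made for this example, not the authors' code; a real model would typically express this as a custom autograd function in a deep learning framework.

```python
class GradientReversal:
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""

    def __init__(self, scale=1.0):
        # Hypothetical reversal strength; not a value taken from the paper.
        self.scale = scale

    def forward(self, features):
        # Features are passed through unchanged.
        return list(features)

    def backward(self, grad_output):
        # The incoming gradient is flipped in sign and scaled.
        return [-self.scale * g for g in grad_output]


grl = GradientReversal(scale=0.5)
passed = grl.forward([0.2, -1.0, 3.0])
reversed_grad = grl.backward([1.0, -2.0, 4.0])
```

Here `passed` equals the input features, while `reversed_grad` is the upstream gradient multiplied by `-0.5`.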
Abstract
Unsupervised person re-identification (Re-ID) aims to retrieve images of a specified person without identity labels. Many recent unsupervised Re-ID approaches adopt clustering-based methods that measure cross-camera feature similarity to roughly divide images into clusters. They ignore the feature distribution discrepancy induced by the camera domain gap, which inevitably degrades performance. Camera information is usually available, and the feature distribution within a single camera usually focuses more on the appearance of the individual and has less intra-identity variance. Inspired by this observation, we introduce a Camera-Aware Label Refinement (CALR) framework that reduces camera discrepancy by clustering on intra-camera similarity. Specifically, we employ intra-camera training to obtain reliable local pseudo labels within each camera, then refine the global labels generated by inter-camera clustering and train the discriminative model with the more reliable global pseudo labels in a self-paced manner. Meanwhile, we develop a camera-alignment module to align feature distributions under different cameras, which further helps deal with camera variance. Extensive experiments validate the superiority of our proposed method over state-of-the-art approaches. The code is accessible at https://github.com/leeBooMla/CALR.
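The intra-camera step described above can be pictured with a minimal sketch: samples are grouped by camera ID and clustered separately within each camera, so each pseudo label is local to one camera. Everything here is an assumption for illustration, including the function names, the cosine-distance threshold, and the greedy clustering rule, which stands in for whatever clustering algorithm the released code actually uses.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    # Assumes both feature vectors are non-zero.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def intra_camera_pseudo_labels(features, camera_ids, threshold=0.3):
    """Assign each sample a (camera, local_cluster) label, one camera at a time."""
    by_camera = defaultdict(list)
    for index, cam in enumerate(camera_ids):
        by_camera[cam].append(index)

    labels = [None] * len(features)
    for cam, indices in by_camera.items():
        representatives = []  # index of the first member of each local cluster
        for index in indices:
            for local_label, rep in enumerate(representatives):
                if cosine_distance(features[index], features[rep]) < threshold:
                    labels[index] = (cam, local_label)
                    break
            else:
                # No close representative in this camera: open a new local cluster.
                labels[index] = (cam, len(representatives))
                representatives.append(index)
    return labels


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.05]]
camera_ids = [0, 0, 0, 1]
local_labels = intra_camera_pseudo_labels(features, camera_ids)
```

In this toy input the first two samples share a local cluster in camera 0, while the last sample, although similar in appearance, receives a separate label because it comes from camera 1. Linking such clusters across cameras is the role of the inter-camera refinement stage, which is not sketched here.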
