Rethinking Sampling Strategies for Unsupervised Person Re-identification

Xumeng Han, Xuehui Yu, Guorong Li, Jian Zhao, Gang Pan, Qixiang Ye, Jianbin Jiao, Zhenjun Han

TL;DR

This paper identifies sampling strategy as a crucial factor in unsupervised person re-ID, introducing deteriorated over-fitting and statistical stability to explain why random sampling collapses learning while structured sampling can succeed. It proposes group sampling, which groups samples from the same class to emphasize class-wide trends and suppress the influence of individual samples, thereby improving pseudo-label quality and representation learning. Through extensive experiments on Market-1501, DukeMTMC-reID, and MSMT17, group sampling achieves competitive or superior performance to state-of-the-art fully unsupervised methods without additional parameters or computation, particularly under camera-agnostic settings. The findings highlight a practical, low-cost approach to enhance unsupervised re-ID by focusing on sampling design to sustain within-class coherence and inter-class separation during contrastive learning.

Abstract

Unsupervised person re-identification (re-ID) remains a challenging task. While extensive research has focused on framework design and loss functions, this paper shows that the sampling strategy plays an equally important role. We analyze the reasons for the performance differences between various sampling strategies under the same framework and loss function. We suggest that deteriorated over-fitting is an important factor causing poor performance, and that enhancing statistical stability can rectify this problem. Inspired by this, a simple yet effective approach is proposed, termed group sampling, which gathers samples from the same class into groups. The model is thereby trained using normalized group samples, which helps alleviate the negative impact of individual samples. Group sampling updates the pipeline of pseudo-label generation by guaranteeing that samples are more efficiently classified into the correct classes. It regulates the representation learning process, enhancing statistical stability for feature representation in a progressive fashion. Extensive experiments on Market-1501, DukeMTMC-reID and MSMT17 show that group sampling achieves performance comparable to state-of-the-art methods and outperforms the current techniques under purely camera-agnostic settings. Code is available at https://github.com/ucas-vg/GroupSampling.

Paper Structure

This paper contains 46 sections, 8 equations, 5 figures, 8 tables, 2 algorithms.

Figures (5)

  • Figure 1: (Top): Individual samples mislead the optimization trend of the overall class features, strengthening undesirable similarity structure and destroying identity structure. As a result, the feature representation tends toward deteriorated over-fitting. (Bottom): Grouping the samples belonging to the same class follows the overall class trend and weakens the influence of individual samples, thereby forming statistical stability within the class. (Best viewed in color.)
  • Figure 2: The degree of purity and chaos for different sampling strategies on Market-1501. It intuitively reflects the deteriorated over-fitting phenomenon caused by random sampling, whereas triplet sampling and group sampling can suppress this phenomenon. (Best viewed in color.)
  • Figure 3: (Left): The framework of the contrastive baseline. The features of each sample are dynamically stored in the memory bank and clustered to generate pseudo-labels. After sampling, the samples are fed into the model for feature extraction and optimized using pseudo-labels. Samples with the same color belong to the same cluster, and the black ones represent outliers. (Right): Illustration of group sampling with group size $N=2$, which groups samples belonging to the same class for training. More details are described in Sec. \ref{sec:group-sampling} and Alg. \ref{alg:gather-sampling}. (Best viewed in color.)
  • Figure 4: Comparison and analysis of random sampling, triplet sampling and group sampling. It is verified that random sampling leads to low-quality pseudo-labels and the deterioration of class features. In contrast, group sampling maintains statistical stability within classes, resulting in purer classes and higher-quality pseudo-labels. (Best viewed in color.)
  • Figure 5: t-SNE visualization of the distribution of samples in the feature space during training. Samples of the same color belong to the same identity. Dotted circles indicate clusters, and samples not classified into clusters are outliers. It intuitively shows that group sampling gathers samples with the same identity more successfully than random sampling, indicating that it facilitates enhanced feature representation. (Best viewed in color.)
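
The grouping step illustrated in Figure 3 (right) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_groups` and `group_sampling_order`, the use of `-1` as the outlier label, and the choice to treat each outlier as its own single-sample group are all hypothetical. The sketch only shows the core idea from the captions: samples sharing a pseudo-label are packed into groups of at most $N$, and the groups (rather than individual samples) are shuffled to form the training order.

```python
import random
from collections import defaultdict


def make_groups(pseudo_labels, group_size=2, outlier_label=-1, seed=None):
    """Pack sample indices into same-class groups of at most `group_size`.

    Assumption (not from the paper): outliers carry `outlier_label` and
    each one forms a group of its own.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    rng = random.Random(seed)

    by_class = defaultdict(list)
    groups = []
    for idx, label in enumerate(pseudo_labels):
        if label == outlier_label:
            groups.append([idx])  # singleton group for an outlier
        else:
            by_class[label].append(idx)

    for members in by_class.values():
        rng.shuffle(members)  # randomize which samples share a group
        for start in range(0, len(members), group_size):
            groups.append(members[start:start + group_size])

    rng.shuffle(groups)  # shuffle whole groups, keeping each group intact
    return groups


def group_sampling_order(pseudo_labels, group_size=2, outlier_label=-1, seed=None):
    """Flatten the shuffled groups into one epoch's sample order."""
    groups = make_groups(pseudo_labels, group_size, outlier_label, seed)
    return [idx for group in groups for idx in group]


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, -1, 2, 2, 2, 2]  # toy pseudo-labels
    print(group_sampling_order(labels, group_size=2, seed=0))
```

With `group_size=1` the sketch reduces to an ordinary random permutation of the samples, which is one way to see group size as the knob that moves between random sampling and class-grouped sampling.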