DynaMix: Generalizable Person Re-identification via Dynamic Relabeling and Mixed Data Sampling

Timur Mamedov, Anton Konushin, Vadim Konushin

TL;DR

DynaMix tackles generalizable person Re-ID by integrating manually labeled multi-camera data with large-scale pseudo-labeled single-camera data. It introduces three tightly coupled modules—Relabeling, Efficient Centroids, and Data Sampling—that dynamically refine labels, maintain scalable identity representations, and form balanced mixed data batches. The method employs a ViT-based encoder with a momentum encoder and a multi-loss objective to learn robust, cross-domain representations, demonstrating state-of-the-art performance on cross-dataset and multi-source benchmarks. Its scalable design and strong generalization have practical implications for real-world surveillance deployments across diverse environments.

Abstract

Generalizable person re-identification (Re-ID) aims to recognize individuals across unseen cameras and environments. While existing methods rely heavily on limited labeled multi-camera data, we propose DynaMix, a novel method that effectively combines manually labeled multi-camera and large-scale pseudo-labeled single-camera data. Unlike prior works, DynaMix dynamically adapts to the structure and noise of the training data through three core components: (1) a Relabeling Module that refines pseudo-labels of single-camera identities on-the-fly; (2) an Efficient Centroids Module that maintains robust identity representations under a large identity space; and (3) a Data Sampling Module that carefully composes mixed data mini-batches to balance learning complexity and intra-batch diversity. All components are specifically designed to operate efficiently at scale, enabling effective training on millions of images and hundreds of thousands of identities. Extensive experiments demonstrate that DynaMix consistently outperforms state-of-the-art methods in generalizable person Re-ID.

Paper Structure

This paper contains 35 sections, 3 equations, 4 figures, 12 tables.

Figures (4)

  • Figure 1: $Rank_1$ scores of DynaMix and other state-of-the-art methods in the multi-source cross-dataset scenario (D+C3+MS $\rightarrow$ M configuration in Tab. \ref{tab:comparison-multi}).
  • Figure 2: Overview of DynaMix. Our method leverages both labeled multi-camera and pseudo-labeled single-camera data during training. It consists of three mutually dependent components: the Relabeling Module dynamically refines noisy pseudo-labels; the Efficient Centroids Module enables scalable identity representation updates; and the queue-based Data Sampling Module composes mixed data mini-batches to balance learning complexity and intra-batch diversity.
  • Figure 3: Illustration of the Relabeling Module stages. (a) Filtering & Relabeling: the module assesses the similarity between feature vectors and centroids, removing images with low similarity and updating pseudo-labels if a better match is found. (b) PIDs Merging: constructs a graph based on centroid similarity where centroids with strong connections are merged, consolidating PIDs that likely belong to the same individual. In this illustration, circles represent centroids, and squares denote feature vectors for images. Matching colors indicate similarity in PIDs.
  • Figure 4: Examples of mini-batches composed by the Data Sampling Module. Single-camera samples are selected to visually match the multi-camera data within a mini-batch, while stylistic diversity is maintained across batches to balance learning complexity and data variety.
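
The two Relabeling Module stages described in the Figure 3 caption can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the cosine-similarity measure, the thresholds `keep_thresh` and `merge_thresh`, and the use of connected components for merging are all assumptions made for illustration, since the caption only states that low-similarity images are removed, pseudo-labels are updated when a better match exists, and strongly connected centroids are merged.

```python
import numpy as np


def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def filter_and_relabel(features, centroids, keep_thresh=0.5):
    """Stage (a), sketched: compare each image feature with every centroid.

    Assumption: an image is removed when even its best centroid match is
    below keep_thresh; otherwise it takes the label of the best match.
    Removed images receive the label -1.
    """
    sims = l2_normalize(features) @ l2_normalize(centroids).T  # shape (N, K)
    best = sims.argmax(axis=1)
    keep = sims.max(axis=1) >= keep_thresh
    new_labels = np.where(keep, best, -1)
    return new_labels, keep


def merge_pids(centroids, merge_thresh=0.8):
    """Stage (b), sketched: link centroids whose similarity exceeds
    merge_thresh and collapse each connected component to one PID.

    Assumption: the merged PID is the smallest index in its component.
    """
    c = l2_normalize(centroids)
    sims = c @ c.T
    k = len(c)
    parent = list(range(k))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(k):
        for j in range(i + 1, k):
            if sims[i, j] >= merge_thresh:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Attach the larger root to the smaller one.
                    parent[max(ri, rj)] = min(ri, rj)
    return np.array([find(i) for i in range(k)])


# Toy data: centroids 0 and 1 are near-duplicates, centroid 2 is distinct.
centroids = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]])
features = np.array([[1.0, 0.0], [0.1, 1.0], [-1.0, -1.0]])

new_labels, keep = filter_and_relabel(features, centroids)
pid_map = merge_pids(centroids)
```

In this toy run the third feature points away from every centroid and is filtered out, while centroids 0 and 1 collapse into a single PID. The pairwise loop is quadratic in the number of centroids, so a setting with hundreds of thousands of identities would need a more scalable neighbor search; the sketch ignores that concern.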