Enhancing person re-identification via Uncertainty Feature Fusion Method and Auto-weighted Measure Combination

Quang-Huy Che, Le-Chuong Nguyen, Duc-Tuan Luu, Vinh-Tiep Nguyen

TL;DR

This work tackles view bias in cross-camera person re-identification by introducing Uncertainty Feature Fusion Method (UFFM), which creates multi-view features from single-view embeddings through weighted nearest-neighbor fusion, and Auto-weighted Measure Combination (AMC), which learns optimal weights to fuse multiple similarity measures including cross-camera cues. Both methods operate at inference time, requiring no retraining of base models, and are integrated to compute a robust final similarity $S^* = \alpha S(q,g_j) + \beta S(q,URF_j) + \gamma CCE(q,g_j)$. Empirical results on Market-1501, DukeMTMC-ReID, MSMT17, and Occluded-DukeMTMC show substantial improvements in Rank@1 and mAP, with particularly large gains on MSMT17 and Occluded-DukeMTMC, validating the approach's effectiveness and generality across backbones. The findings suggest a practical pathway to enhance Re-ID systems in real-world multi-camera setups by leveraging unsupervised multi-view fusion and data-driven measure blending during inference.
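The final similarity in the summary above is a weighted sum of three measures, which can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the example scores, and the weights $\alpha = 0.5$, $\beta = 0.4$, $\gamma = 0.1$ are all hypothetical. AMC learns its weights from data, and the summary does not specify how CCE is computed beyond its use of camera information, so the CCE values here are placeholders.

```python
def combine_measures(s_single, s_urf, cce, alpha, beta, gamma):
    """Compute S* = alpha*S(q, g_j) + beta*S(q, URF_j) + gamma*CCE(q, g_j) per gallery image j."""
    if not (len(s_single) == len(s_urf) == len(cce)):
        raise ValueError("each measure needs exactly one entry per gallery image")
    return [alpha * s + beta * u + gamma * c
            for s, u, c in zip(s_single, s_urf, cce)]


def rank_gallery(scores):
    """Return gallery indices ordered from most to least similar."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Hypothetical scores for one query against three gallery images.
s_single = [0.80, 0.75, 0.40]   # S(q, g_j): single-view similarity
s_urf = [0.70, 0.85, 0.35]      # S(q, URF_j): similarity to the fused multi-view feature
cce = [0.0, 1.0, 1.0]           # CCE(q, g_j): placeholder cross-camera term (assumed values)

# Hand-picked weights for illustration only; AMC would estimate these.
final = combine_measures(s_single, s_urf, cce, alpha=0.5, beta=0.4, gamma=0.1)
ranking = rank_gallery(final)
```

With these illustrative numbers, gallery image 1 overtakes image 0 even though its single-view similarity is lower, because the multi-view and cross-camera terms both favor it.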

Abstract

Person re-identification (Re-ID) is a challenging task that involves identifying the same person across different camera views in surveillance systems. Current methods usually rely on features from single-camera views, which can be limiting when dealing with multiple cameras and challenges such as changing viewpoints and occlusions. In this paper, we introduce a new approach that enhances the capability of Re-ID models through the Uncertainty Feature Fusion Method (UFFM) and Auto-weighted Measure Combination (AMC). UFFM generates multi-view features from features extracted independently from multiple images to mitigate view bias. However, relying only on similarity based on multi-view features is limiting because these features ignore the details represented in single-view features. Therefore, we propose the AMC method to generate a more robust similarity measure by combining several measures. Our method significantly improves Rank@1 accuracy and mean Average Precision (mAP) when evaluated on person re-identification datasets. Combined with the BoT Baseline on challenging datasets, it achieves a 7.9% improvement in Rank@1 and a 12.1% improvement in mAP on the MSMT17 dataset. On the Occluded-DukeMTMC dataset, our method increases Rank@1 by 22.0% and mAP by 18.4%. Code is available at https://github.com/chequanghuy/Enhancing-Person-Re-Identification-via-UFFM-and-AMC

Paper Structure (23 sections, 5 equations, 7 figures, 6 tables, 1 algorithm)


Figures (7)

  • Figure 1: t-SNE visualization: Each cluster comprises a query and its top 20 results, separated from the other clusters. Within each boundary, the pentagram marks the query feature, dots of the same color as the pentagram are correct results, and gray dots are incorrect results. The left figure shows the result of the Baseline, while the right figure shows the result of the Baseline combined with the proposed UFFM. With UFFM, the clusters tend to be more clearly separated, and true positives outnumber false positives in each cluster.
  • Figure 2: Example of pedestrian images. The Market-1501 and DukeMTMC-ReID datasets contain images of the same individuals from different camera views. In the examples from the Market-1501 dataset, the red rectangles highlight information that appears in one frame but not in others. In contrast, in the DukeMTMC-ReID dataset, the red rectangles mark occluded regions where information about the person needs to be extracted.
  • Figure 3: Overview of our proposed pipeline: To determine the similarity between $q$ and $g_j$, we need to compute $\textbf{CCE}{(q, g_j)}$, $\textbf{S}{(q, g_j)}$, $\textbf{S}{(q, URF_j)}$ to generate the final similarity $\textbf{S}^*$, where $q$ represents the query image and $g_j \in \mathcal{G}$ represents the $j^{th}$ image in the gallery. With the camera information for both images, we can easily compute $\textbf{CCE}{(q, g_j)}$. The shared backbone processes $q$ and $g_j$ for feature extraction. We directly compute $\textbf{S}{(q, g_j)}$ as the similarity between $q$ and the single-view feature of $g_j$. Furthermore, we compute the similarity between $g_j$ and the images in the gallery to find the $K$ nearest neighbors of $g_j$. Using weighted fusion, we obtain $f^{URF}_j$, from which we can easily compute $\textbf{S}{(q, URF_j)}$. Finally, the AMC method combines the different measures and produces the robust similarity between $q$ and $g_j$ as $\textbf{S}^*$.
  • Figure 4: An example of finding K nearest neighbors based on features, with K=7. The black box represents the query object, the green boxes indicate correct results and the red boxes denote incorrect results.
  • Figure 5: The impact of $K$ on Market-1501 when using our proposed methods
  • ...and 2 more figures
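
The neighbor search and weighted fusion described in the Figure 3 caption can be sketched as follows. This is a rough illustration under stated assumptions rather than the paper's method: cosine similarity, softmax weights over the neighbor similarities, and counting $g_j$ among its own $K$ nearest neighbors are all choices made here for the example, and the helper names (`cosine`, `k_nearest`, `fuse_feature`) and the toy gallery are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def k_nearest(gallery, j, k):
    """Indices and similarities of the k gallery features closest to gallery[j].

    Assumption: gallery[j] itself is included, since it is its own nearest neighbor.
    """
    sims = [cosine(gallery[j], g) for g in gallery]
    order = sorted(range(len(gallery)), key=lambda i: sims[i], reverse=True)
    top = order[:k]
    return top, [sims[i] for i in top]


def fuse_feature(gallery, j, k):
    """Weighted average of the k nearest features, standing in for f^URF_j.

    Assumption: weights are a softmax over similarities; the source only says
    "weighted fusion" and does not give the weighting rule.
    """
    idx, sims = k_nearest(gallery, j, k)
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(gallery[j])
    return [sum(w * gallery[i][d] for w, i in zip(weights, idx)) for d in range(dim)]


# Toy 2-D gallery: images 0 and 1 look alike, image 2 is unrelated.
gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
neighbors, _ = k_nearest(gallery, 0, 2)
fused = fuse_feature(gallery, 0, 2)
```

In this toy gallery, the fused feature for image 0 moves toward its neighbor image 1, which is the intended effect of building a multi-view feature from single-view ones.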