
CA-Jaccard: Camera-aware Jaccard Distance for Person Re-identification

Yiyu Chen, Zheyi Fan, Zhaoru Chen, Yixuan Zhu

TL;DR

This work addresses the reduced reliability of the Jaccard distance in unsupervised person re-ID under camera variation. It introduces CA-Jaccard, a camera-aware distance that replaces k-reciprocal nearest neighbors (KRNNs) with camera-aware KRNNs (CKRNNs) and local query expansion with camera-aware local query expansion (CLQE), improving both the reliability of the relevant neighbors and the overlap computed from them. Clustering and re-ranking experiments on Market1501, MSMT17, and VeRi-776 show gains over the standard Jaccard distance, and ablations support the individual contributions of CKRNNs and CLQE. The method is simple and adds little computational cost, so it can serve as a general distance metric in unsupervised re-ID pipelines.

Abstract

Person re-identification (re-ID) is a challenging task that aims to learn discriminative features for person retrieval. In person re-ID, Jaccard distance is a widely used distance metric, especially in re-ranking and clustering scenarios. However, we discover that camera variation has a significant negative impact on the reliability of Jaccard distance. In particular, Jaccard distance calculates the distance based on the overlap of relevant neighbors. Due to camera variation, intra-camera samples dominate the relevant neighbors, which reduces the reliability of the neighbors by introducing intra-camera negative samples and excluding inter-camera positive samples. To overcome this problem, we propose a novel camera-aware Jaccard (CA-Jaccard) distance that leverages camera information to enhance the reliability of Jaccard distance. Specifically, we design camera-aware k-reciprocal nearest neighbors (CKRNNs) to find k-reciprocal nearest neighbors on the intra-camera and inter-camera ranking lists, which improves the reliability of relevant neighbors and guarantees the contribution of inter-camera samples in the overlap. Moreover, we propose a camera-aware local query expansion (CLQE) to mine reliable samples in relevant neighbors by exploiting camera variation as a strong constraint and assign these samples higher weights in overlap, further improving the reliability. Our CA-Jaccard distance is simple yet effective and can serve as a general distance metric for person re-ID methods with high reliability and low computational cost. Extensive experiments demonstrate the effectiveness of our method.
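The idea in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it uses plain, unweighted neighbor sets, omits CLQE entirely, and all function names, the toy features, and the values of `k` are assumptions chosen for illustration. It contrasts a Jaccard distance built from ordinary k-reciprocal neighbors with one whose neighbors are found separately on the intra-camera and inter-camera ranking lists.

```python
import numpy as np

def k_reciprocal(dist, k, mask=None):
    """Boolean matrix R with R[i, j] True when j is in the k-nearest
    neighbors of i and i is in the k-nearest neighbors of j.
    mask[i, j] (optional) marks which candidates j are allowed for i."""
    d = dist if mask is None else np.where(mask, dist, np.inf)
    n = d.shape[0]
    order = np.argsort(d, axis=1)[:, :k]
    knn = np.zeros((n, n), dtype=bool)
    knn[np.repeat(np.arange(n), order.shape[1]), order.ravel()] = True
    knn &= np.isfinite(d)  # drop masked-out candidates when fewer than k are allowed
    return knn & knn.T

def camera_aware_neighbors(dist, cams, k_intra, k_inter):
    """Union of k-reciprocal neighbors found on the intra-camera and
    inter-camera ranking lists (a simplified, unweighted stand-in for CKRNNs)."""
    same = cams[:, None] == cams[None, :]
    return k_reciprocal(dist, k_intra, same) | k_reciprocal(dist, k_inter, ~same)

def jaccard_distance(neigh):
    """1 - |N(i) ∩ N(j)| / |N(i) ∪ N(j)| for boolean neighbor sets."""
    n = neigh.astype(np.float64)
    inter = n @ n.T
    sizes = n.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - inter
    return 1.0 - inter / np.maximum(union, 1.0)

# Toy data (hypothetical): the first feature axis mimics a camera offset,
# the second axis separates the two identities.
feats = np.array([[0.0, 0.0],   # identity A, camera 0
                  [0.0, 1.0],   # identity B, camera 0
                  [1.5, 0.1],   # identity A, camera 1
                  [1.5, 1.1]])  # identity B, camera 1
cams = np.array([0, 0, 1, 1])
dist = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)

plain = jaccard_distance(k_reciprocal(dist, k=2))
aware = jaccard_distance(camera_aware_neighbors(dist, cams, k_intra=1, k_inter=1))
print("same identity, different camera:", plain[0, 2], "->", aware[0, 2])
print("different identity, same camera:", plain[0, 1], "->", aware[0, 1])
```

In this toy setup the camera offset is larger than the identity gap, so the plain neighbor sets contain only same-camera samples, while the camera-aware sets are forced to include cross-camera candidates. Real features are far noisier, and the paper's weighting and expansion steps are not modeled here.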

Paper Structure (22 sections, 12 equations, 8 figures, 3 tables, 1 algorithm)


Figures (8)

  • Figure 1: (a) Illustration of the average proportion of intra-camera and inter-camera samples in k-nearest neighbors of all samples. Due to camera variation, the average proportion of intra-camera samples in all samples' k-nearest neighbors is significantly higher than that of inter-camera samples. (b) Comparison of the feature spaces of using Jaccard distance and our CA-Jaccard distance. Different colors represent different identities and different shapes indicate different camera labels.
  • Figure 2: Schematic diagram of CA-Jaccard distance. (a) Overview of CA-Jaccard distance computation. Given the original distance matrix, find CKRNNs and encode them as vectors, then use CLQE to generate weighted expanded neighbors vectors. Finally, calculate the overlap between these vectors to obtain the CA-Jaccard distance matrix. (b) CKRNNs. CKRNNs find reliable relevant neighbors by applying the k-reciprocal nearest constraint on intra-camera and inter-camera ranking lists. (c) CLQE. CLQE averages the weighted CKRNNs vectors of intra-camera and inter-camera k-nearest neighbors to obtain weighted expanded neighbors.
  • Figure 3: (a) Average inter-camera proportion, (b) average inter-camera total weight, and (c) average neighbor accuracy of the weighted expanded neighbors vectors of all training samples over different epochs, for the baseline, CKRNNs, CLQE, and CAJ in the clustering scenario.
  • Figure 4: Parameter analysis of $k_1^{intra}$, $k_1^{inter}$ and $k_2^{intra}/k_2^{inter}$ on Market1501 and MSMT17.
  • Figure 5: The t-SNE visualization of 10 persons' features extracted by the models of (a) CC and (b) CC+CAJ. Different colors and shapes indicate different identities and camera labels.
  • ...and 3 more figures
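
The statistic described in Figure 1(a), the average share of intra-camera samples among each sample's k-nearest neighbors, can be estimated along the following lines. This is a minimal sketch rather than the authors' measurement code: the function name is hypothetical, and excluding each sample from its own neighbor list is an assumption.

```python
import numpy as np

def intra_camera_proportion(dist, cams, k):
    """Average fraction of same-camera samples among the k-nearest
    neighbors of every sample (self excluded, which is an assumption)."""
    d = np.asarray(dist, dtype=np.float64).copy()
    np.fill_diagonal(d, np.inf)              # never count a sample as its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]       # indices of the k nearest other samples
    same_cam = cams[knn] == cams[:, None]    # (n, k) boolean: neighbor shares the camera
    return float(same_cam.mean())

# Toy data (hypothetical): the first feature axis mimics a camera offset.
feats = np.array([[0.0, 0.0], [0.0, 1.0], [1.5, 0.1], [1.5, 1.1]])
cams = np.array([0, 0, 1, 1])
dist = np.linalg.norm(feats[:, None] - feats[None, :], axis=-1)

p1 = intra_camera_proportion(dist, cams, k=1)
p3 = intra_camera_proportion(dist, cams, k=3)
print(p1, p3)
```

A value well above the share of same-camera samples in the whole dataset would indicate the intra-camera dominance that the figure illustrates.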