Mutual Distillation Learning For Person Re-Identification

Huiyuan Fu; Kuilong Cui; Chuanming Wang; Mengshi Qi; Huadong Ma

TL;DR

This paper tackles robust person re-identification by uniting two heterogeneous feature extraction perspectives within a single model: a Hard Content Branch that extracts local features via uniform horizontal partitioning, and a Soft Content Branch that learns multi-granularity attention-based features. A mutual distillation and fusion module enables cross-branch knowledge exchange and feature fusion, yielding a richer representation than either branch alone. The approach achieves state-of-the-art or competitive results on multiple benchmarks, notably $88.7\%$ mAP and $94.4\%$ Rank-1 on DukeMTMC-reID, while also delivering strong performance on Market-1501 and SynergyReID. The code is publicly available. These results suggest that integrating complementary cues with mutual learning improves generalization under pose, occlusion, and background variability.

Abstract

With the rapid advancement of deep learning, person re-identification (ReID) has seen remarkable performance improvements. However, most prior works extract features from a single perspective only, such as uniform partitioning, hard attention mechanisms, or semantic masks. While these approaches are effective in specific contexts, they fall short in diverse situations. In this paper, we propose a novel approach, Mutual Distillation Learning For Person Re-Identification (termed MDPR), which addresses the problem from multiple perspectives within a single unified model and uses mutual distillation to enhance the feature representations collectively. Specifically, our approach comprises two branches: a Hard Content Branch that extracts local features via a uniform horizontal partitioning strategy, and a Soft Content Branch that dynamically distinguishes foreground from background and extracts multi-granularity features via a carefully designed attention mechanism. To facilitate knowledge exchange between the two branches, a mutual distillation and fusion process is employed, which improves the output of each branch. Extensive experiments on widely used person ReID datasets validate the effectiveness of our approach. Notably, our method achieves $88.7\%/94.4\%$ in mAP/Rank-1 on the DukeMTMC-reID dataset, surpassing the current state-of-the-art results. Our source code is available at https://github.com/KuilongCui/MDPR.
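The mutual distillation between the two branches can be illustrated with a small sketch. The abstract does not state the exact objective, so the code below assumes a common mutual-learning formulation: each branch is pulled toward the other branch's softened class distribution via a KL divergence. The function names, the `temperature` parameter, and the example logits are hypothetical and are not taken from the MDPR source code.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, prediction, eps=1e-12):
    """KL(target || prediction) for two discrete distributions."""
    return sum(
        t * math.log((t + eps) / (q + eps))
        for t, q in zip(target, prediction)
    )


def mutual_distillation_loss(hard_logits, soft_logits, temperature=1.0):
    """Assumed symmetric formulation: each branch treats the other as teacher.

    Returns (loss for the hard branch, loss for the soft branch).
    In a real training loop the teacher distribution would typically be
    treated as a constant (no gradient) when updating the student.
    """
    p_hard = softmax(hard_logits, temperature)
    p_soft = softmax(soft_logits, temperature)
    loss_hard = kl_divergence(p_soft, p_hard)  # hard branch mimics soft branch
    loss_soft = kl_divergence(p_hard, p_soft)  # soft branch mimics hard branch
    return loss_hard, loss_soft


# Hypothetical identity logits from the two branches for one image.
hard_branch_logits = [2.0, 0.5, -1.0]
soft_branch_logits = [0.1, 1.5, 0.3]
losses = mutual_distillation_loss(hard_branch_logits, soft_branch_logits)
```

Both losses vanish when the two branches already agree and grow as their predictions diverge, which is what drives the cross-branch knowledge exchange in this kind of scheme.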

Paper Structure (25 sections, 9 equations, 10 figures, 10 tables, 1 algorithm)

Introduction
Related Works
Global-based Approach
Attention-based Approach
Pose-based and Mask-based Approach
Stride-based Approach
Distillation learning
Proposed Method
Hard Content Branch
Soft Content Branch
Knowledge Distillation and Fusion Module
Objective Function
Inference Strategy
Experiment
Dataset and Evaluation
...and 10 more sections

Figures (10)

Figure 1: The overview structure of our method. The upper part corresponds to the Hard Content Branch, which uniformly partitions all images horizontally into two parts. The lower part represents the Soft Content Branch, which leverages attention mechanisms to distinguish individuals from the background. The two branches engage in mutual distillation to enhance their respective feature representation capabilities.
Figure 2: The overall architecture of our proposed network, which consists of two branches and a knowledge distillation and fusion module. The Hard Content Branch applies a uniform partition to the input image, while the Soft Content Branch distinguishes between individuals and the background based on attention. The knowledge distillation and fusion module facilitates mutual distillation learning and integrates the outputs of both branches to further extract meaningful information. BN denotes the batch normalization operation.
Figure 3: The architecture of the Hard Content Branch. In our experiments, we partition the features into two segments. The Embedding block is composed of a $1\times1$ convolution, a batch normalization layer, and a ReLU activation function. GeM refers to Generalized Mean Pooling. BN refers to the batch normalization operation.
Figure 4: The structure of the attention generation module $\phi$. $\oplus$ denotes the element-wise sum.
Figure 5: The architecture of the Soft Content Branch. $\phi$ refers to the attention generation module. BAP refers to Bilinear Attention Pooling. Conv refers to a $1\times1$ convolution layer.
...and 5 more figures
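
The pooling step of the Hard Content Branch described in the Figure 3 caption can be sketched as follows: the feature map is split into horizontal stripes (two in the paper's experiments) and each stripe is reduced with Generalized Mean (GeM) pooling, $\left(\frac{1}{N}\sum_i x_i^{p}\right)^{1/p}$. This is a minimal sketch under stated assumptions: the exponent `p=3.0`, the clamping constant `eps`, and the handling of heights that do not divide evenly are illustrative choices, and the Embedding block ($1\times1$ convolution, BN, ReLU) is omitted.

```python
def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized Mean pooling over a flat list of activations.

    p=1 gives average pooling; larger p moves toward max pooling.
    Values are clamped to eps so the fractional power is well defined.
    """
    clamped = [max(v, eps) for v in values]
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)


def hard_partition_gem(feature_map, num_parts=2, p=3.0):
    """Split a [C][H][W] feature map into horizontal stripes and GeM-pool each.

    Returns a list of num_parts vectors, each of length C.
    Assumption: when H is not divisible by num_parts, the last stripe
    takes the remaining rows.
    """
    height = len(feature_map[0])
    stripe = height // num_parts
    parts = []
    for k in range(num_parts):
        top = k * stripe
        bottom = height if k == num_parts - 1 else top + stripe
        vector = []
        for channel in feature_map:
            values = [v for row in channel[top:bottom] for v in row]
            vector.append(gem_pool(values, p))
        parts.append(vector)
    return parts


# Hypothetical 1-channel, 4x2 feature map: upper half all 1.0, lower half all 2.0.
toy_map = [[[1.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0]]]
upper_part, lower_part = hard_partition_gem(toy_map, num_parts=2)
```

Because every image is cut at the same fixed heights, the stripe features are comparable across images without any learned localization, which is the "hard" counterpart to the attention-driven Soft Content Branch.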
