Distilling Generative-Discriminative Representations for Very Low-Resolution Face Recognition

Junzheng Zhang; Weijia Guo; Bochao Liu; Ruixin Shi; Yong Li; Shiming Ge

TL;DR

This work tackles very low-resolution face recognition by introducing a generative-discriminative representation distillation (GDRD) framework. It first distills general generative knowledge from a diffusion-model encoder via $L_{\text{gen}}$ to teach the backbone, then transfers discriminative knowledge from a high-resolution recognizer through cross-resolution relational distillation with $L_{\text{rcd}}$, $L_{\text{kd}}$, and $L_{\text{cls}}$ to supervise the head, all within a progressive, module-wise training scheme. The authors validate GDRD on four benchmarks, reporting state-of-the-art performance on 16×16 inputs for verification, identification, and retrieval, with robust results under occlusion and illumination changes. The approach offers a practical path to robust low-resolution face recognition by bridging generative detail recovery and discriminative alignment in a two-stage distillation process.
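As a rough illustration of how the head-stage terms named above might be combined, the sketch below uses the standard cross-entropy loss as a stand-in for $L_{\text{cls}}$ and the common temperature-softened KL divergence as a stand-in for $L_{\text{kd}}$. The exact loss definitions are not given in this summary, so the function names, the temperature, and the weights `w_kd` and `w_rcd` are all assumptions made for illustration; the relational term $L_{\text{rcd}}$ is passed in as a precomputed value.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Cross-entropy on the ground-truth identity (assumed form of L_cls)."""
    probs = softmax(student_logits)
    return -math.log(probs[label])


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs (assumed form of L_kd)."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps)) for pt, ps in zip(p_t, p_s))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * kl


def head_objective(student_logits, teacher_logits, label, rcd_value,
                   w_kd=1.0, w_rcd=1.0):
    """Weighted sum of the three head-stage terms; the weights are hypothetical."""
    return (cross_entropy(student_logits, label)
            + w_kd * kd_loss(student_logits, teacher_logits)
            + w_rcd * rcd_value)
```

When the student logits equal the teacher logits, the distillation term vanishes and only the classification and relational terms remain, which is a quick sanity check for any implementation along these lines.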

Abstract

Very low-resolution face recognition is challenging due to the serious loss of informative facial details in resolution degradation. In this paper, we propose a generative-discriminative representation distillation approach that combines generative representation with cross-resolution aligned knowledge distillation. This approach facilitates very low-resolution face recognition by jointly distilling generative and discriminative models via two distillation modules. Firstly, the generative representation distillation takes the encoder of a diffusion model pretrained for face super-resolution as the generative teacher to supervise the learning of the student backbone via feature regression, and then freezes the student backbone. After that, the discriminative representation distillation further considers a pretrained face recognizer as the discriminative teacher to supervise the learning of the student head via cross-resolution relational contrastive distillation. In this way, the general backbone representation can be transformed into discriminative head representation, leading to a robust and discriminative student model for very low-resolution face recognition. Our approach improves the recovery of the missing details in very low-resolution faces and achieves better knowledge transfer. Extensive experiments on face datasets demonstrate that our approach enhances the recognition accuracy of very low-resolution faces, showcasing its effectiveness and adaptability.
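The two distillation signals described in the abstract can be sketched in a few lines. The block below is a minimal, framework-free illustration and not the authors' implementation: the feature regression term is assumed to be a mean squared error between student-backbone and teacher-encoder features, and the cross-resolution term is approximated by a generic InfoNCE-style contrastive loss, which is a simplified stand-in for the paper's relational contrastive distillation. All function names and the temperature value are hypothetical.

```python
import math


def feature_regression_loss(student_feat, teacher_feat):
    """Mean squared error between feature vectors (assumed form of the
    generative distillation term)."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(student_feat)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_resolution_contrastive_loss(student_embs, teacher_embs, temperature=0.1):
    """InfoNCE-style loss: the low-resolution student embedding of sample i
    should be closest to the high-resolution teacher embedding of sample i,
    with the other teacher embeddings in the batch acting as negatives."""
    n = len(student_embs)
    total = 0.0
    for i in range(n):
        sims = [cosine(student_embs[i], teacher_embs[j]) / temperature
                for j in range(n)]
        m = max(sims)
        # Stable log-sum-exp over all teacher embeddings in the batch.
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        total += log_denom - sims[i]
    return total / n
```

In the progressive scheme the abstract describes, a loss like the first function would be used while training the student backbone, which is then frozen; a loss like the second would be applied afterwards, when only the student head is updated.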

Paper Structure (12 sections, 3 equations, 3 figures, 4 tables)

Introduction
Approach
Generative Representation Distillation
Discriminative Representation Distillation
Module-Wise Training
Experiments
Very Low-Resolution Face Verification on LFW
Very Low-Resolution Face Identification on UCCS
Very Low-Resolution Face Retrieval on TinyFace
Evaluation on Resolution and Occlusion Robustness
Ablation Study
Conclusion

Figures (3)

Figure 1: Our generative-discriminative representation distillation (GDRD) progressively trains the student $S=\{S_b,S_h\}$ via two distillation modules. The generative representation distillation trains and freezes the student backbone $S_b$ by distilling the encoder of a pretrained generative teacher $T_g$ via feature regression, and the discriminative representation distillation further trains the student head $S_h$ by distilling a pretrained discriminative teacher $T_d$ via relational contrastive distillation.
Figure 2: t-SNE (van der Maaten and Hinton, 2008) visualization of representations extracted on UCCS by ArcFace (left) and our student (right).
Figure 3: Ablation study of verification accuracy (%) on LFW.
