Teacher-Student Network for Real-World Face Super-Resolution with Progressive Embedding of Edge Information

Zhilei Liu, Chenggong Zhang

TL;DR

The paper tackles real-world face super-resolution by addressing the domain gap between synthetic and real LR images. It introduces a two-stage teacher-student framework: a degradation network generates LR images ($I_{LR}^{gen}$) to bridge the gap, and a teacher network is trained on the resulting pseudo-pairs; a student network then learns from real LR images, guided by pseudo HR images from the teacher and aided by a cycle-consistent LR constraint. A key novelty is the progressive embedding of edge information via a recurrent prior-embedded block (PERB), which uses a Canny edge map at the first iteration and a DoG edge map at the second to recover high-frequency details, while a degradation-aware DSNet enforces LR-domain consistency. Experiments on FFHQ, Widerface, and WebFace show state-of-the-art performance on both synthetic and real-world datasets, with improved facial structure, skin color, and detail restoration and robust generalization to real LR data.
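To make the DoG (difference of Gaussians) edge map mentioned above concrete, here is a minimal sketch in pure Python. It is not the authors' implementation: the function names, the sigma values (1.0 and 2.0), the 3-sigma kernel radius, and the border replication are all assumptions chosen for illustration. The idea is only that subtracting a coarser blur from a finer blur leaves a band-pass response that is near zero in flat regions and changes sign across an edge.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel spanning about 3 sigma per side (assumed radius)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve a 1-D sequence with the kernel, replicating border values."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for x in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - radius, 0), n - 1)  # clamp to the valid range
            acc += w * values[idx]
        out.append(acc)
    return out


def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D list of floats: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def dog_edge_map(image, sigma_small=1.0, sigma_large=2.0):
    """Difference of Gaussians: fine blur minus coarse blur (sigmas are assumed values)."""
    fine = gaussian_blur(image, sigma_small)
    coarse = gaussian_blur(image, sigma_large)
    return [[f - c for f, c in zip(f_row, c_row)]
            for f_row, c_row in zip(fine, coarse)]


# Toy input: a 16x32 grayscale image with a vertical step edge between columns 15 and 16.
step_image = [[0.0] * 16 + [1.0] * 16 for _ in range(16)]
edges = dog_edge_map(step_image)
```

On this toy image the response is zero far from the step and takes opposite signs on its two sides, which is the kind of high-frequency cue an edge prior carries. A Canny edge map, by contrast, is a thinned binary contour, so the two maps provide different kinds of edge information.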

Abstract

Traditional face super-resolution (FSR) methods trained on synthetic datasets usually generalize poorly to real-world face images. Recent work has used complex degradation models or trained networks to simulate the real degradation process, but the performance of these methods remains limited by the domain differences that still exist between the generated and the real low-resolution images. Moreover, because of this domain gap, the semantic feature information of the target domain may be affected when synthetic and real data are used simultaneously to train super-resolution models. In this study, we propose a teacher-student model for real-world face super-resolution that accounts for the domain gap between real and synthetic data and progressively incorporates diverse edge information by using the recurrent network's intermediate outputs. Extensive experiments demonstrate that our proposed approach surpasses state-of-the-art methods in obtaining high-quality face images for real-world FSR.

Paper Structure

This paper contains 10 sections, 8 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Pipeline of our proposed method.
  • Figure 2: The unfolded architecture of TNet and SNet. "C", "+", and "×" denote concatenation, addition, and multiplication, respectively. The green dotted lines represent feedback connections. For simplicity, we omit the activation function layer in the pipeline.
  • Figure 3: Qualitative comparison with state-of-the-art methods on different datasets.