CodeFormer++: Blind Face Restoration Using Deformable Registration and Deep Metric Learning

Venkata Bharath Reddy Reddem, Akshay P Sarashetti, Ranjith Merugu, Amit Satish Unde

TL;DR

Blind face restoration often sacrifices identity fidelity for perceptual quality when relying on generative priors. CodeFormer++ introduces a modular pipeline that first semantically aligns a high-quality generative prior to an identity-preserving restoration, then injects texture via a texture-prior network, and finally reinforces identity-texture fusion through deep metric learning with a novel anchor–positive sampling strategy. The approach comprises a Deformable Image Alignment Module (DAM), a Texture-Prior Guided Restoration Network (TGRN) with a Texture Attention Module, and a deep metric learning objective to balance realism and identity fidelity. Extensive experiments on synthetic and real-world datasets show superior perceptual quality and identity preservation compared with state-of-the-art methods, along with demonstrated generalization to other priors.

Abstract

Blind face restoration (BFR) has attracted increasing attention with the rise of generative methods. Most existing approaches integrate generative priors into the restoration process, aiming to jointly address facial detail generation and identity preservation. However, these methods often suffer from a trade-off between visual quality and identity fidelity, leading to either identity distortion or suboptimal degradation removal. In this paper, we present CodeFormer++, a novel framework that maximizes the utility of generative priors for high-quality face restoration while preserving identity. We decompose BFR into three sub-tasks: (i) identity-preserving face restoration, (ii) high-quality face generation, and (iii) dynamic fusion of identity features with realistic texture details. Our method makes three key contributions: (1) a learning-based deformable face registration module that semantically aligns generated and restored faces; (2) a texture-guided restoration network that dynamically extracts and transfers the texture of the generated face to boost the quality of the identity-preserving restored face; and (3) the integration of deep metric learning into BFR, with the generation of informative positive and hard negative samples to better fuse identity-preserving and generative features. Extensive experiments on real-world and synthetic datasets demonstrate that the proposed CodeFormer++ achieves superior performance in terms of both visual fidelity and identity consistency.
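
As a rough illustration of the deep metric learning idea in contribution (3), the sketch below implements a standard triplet margin loss on plain Python lists. This is an assumption for illustration only: the abstract does not state the exact objective, distance function, or margin used by CodeFormer++, and the function names `l2_distance` and `triplet_margin_loss`, the default margin of 1.0, and the toy feature vectors are all hypothetical.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin`; otherwise it grows linearly.
    """
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D "embeddings" (hypothetical values, not real face features).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]   # distance 5 from the anchor
negative = [6.0, 8.0]   # distance 10 from the anchor

# The negative is already farther than the positive by more than the margin.
easy_loss = triplet_margin_loss(anchor, positive, negative)

# A hard negative that coincides with the anchor yields a positive loss.
hard_loss = triplet_margin_loss(anchor, positive, [0.0, 0.0])
```

In this toy setup, a hard negative (one that lies close to the anchor) produces a non-zero loss and therefore a training signal, which is the usual motivation for mining hard samples.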

Paper Structure

This paper contains 15 sections, 14 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Given a degraded face image, our method is able to reconstruct a high-fidelity, texture-rich image. In contrast, CodeFormer fails to completely remove the degradation and tends to produce overly smoothed results. Although the generative prior CF-GP generates images with realistic textures, it suffers from identity preservation issues.
  • Figure 2: Overview of our CodeFormer++ framework. In stage 1, the Deformable Image Alignment Module (DAM) is trained to predict the deformation field between $I_F$ and $I_G$. In stage 2, the Texture-Prior Guided Restoration Network (TGRN) is trained to generate a texture-rich, high-fidelity output by injecting texture from $I_{warp}$. The hard positive sample $I_{AP}$ is obtained by combining facial components from $I_F$ with texture from $I_{warp}$ to enforce an optimal balance between realism and fidelity. TGRN is supervised using deep metric learning to focus on extracting texture from $I_{AP}$ by pulling the anchor toward the positive image and pushing it away from the negative image.
  • Figure 3: The architecture of texture attention module.
  • Figure 4: Qualitative comparisons on CelebA-Test dataset. Zoom in for best view.
  • Figure 5: Qualitative comparisons on LFW-Test, WebPhoto-Test and WIDER-Test datasets. Zoom in for best view.
  • ...and 5 more figures
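
The Figure 2 caption mentions a deformation field and a warped image $I_{warp}$. As a minimal sketch of what applying such a field can look like, the code below warps a small grayscale image with a per-pixel displacement field using bilinear interpolation. Everything here is an assumption for illustration: the caption does not say how DAM parameterizes or applies its field, and the names `bilinear_sample` and `warp`, the (dx, dy) displacement convention, and the border clamping are hypothetical choices.

```python
def bilinear_sample(img, x, y):
    """Sample a 2-D grayscale image (list of rows) at a fractional (x, y).

    Coordinates outside the image are clamped to the border (an assumed
    boundary policy).
    """
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)                      # top-left neighbour
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0                      # fractional offsets
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img, flow):
    """Backward-warp `img` with a displacement field.

    flow[y][x] is a (dx, dy) pair: output pixel (x, y) is read from
    input position (x + dx, y + dy).
    """
    h, w = len(img), len(img[0])
    return [
        [bilinear_sample(img, x + flow[y][x][0], y + flow[y][x][1])
         for x in range(w)]
        for y in range(h)
    ]


# Toy 3x3 image and a uniform one-pixel displacement to the right.
image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
shift_right = [[(1.0, 0.0)] * 3 for _ in range(3)]
warped = warp(image, shift_right)
```

A learned registration module would predict `flow` from the two input images; here it is set by hand so the effect of the warp is easy to verify.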