HonestFace: Towards Honest Face Restoration with One-Step Diffusion Model

Jingkai Wang, Wu Miao, Jue Gong, Zheng Chen, Xing Liu, Hong Gu, Yutong Liu, Yulun Zhang

TL;DR

HonestFace tackles the challenge of restoring high-quality faces from degraded inputs while preserving identity and natural textures. It introduces an Identity Embedder (IDE) and a Visual Representation Embedder (VRE) to fuse information from the low-quality input and multiple high-quality references, guided by Masked Face Alignment (MFA), and proposes an affine-based landmark distance metric for evaluation. Operating within a one-step latent diffusion framework, HonestFace also employs adversarial distillation to sharpen results, aiming for honest reconstructions without over-smoothing. Experiments on Reface-HQ and CelebRef-HQ are reported to show state-of-the-art performance across identity and perceptual metrics, and the authors state that code and pre-trained models will be released publicly.

Abstract

Face restoration has advanced remarkably over years of development. However, ensuring that restored facial images exhibit high fidelity, preserve authentic features, and avoid introducing artifacts or biases remains a significant challenge. This highlights the need for models that are more "honest" in their reconstruction from low-quality inputs, accurately reflecting original characteristics. In this work, we propose HonestFace, a novel approach designed to restore faces with a strong emphasis on such honesty, particularly concerning identity consistency and texture realism. To achieve this, HonestFace incorporates several key components. First, we propose an identity embedder to effectively capture and preserve crucial identity features from both the low-quality input and multiple reference faces. Second, a masked face alignment method is presented to enhance fine-grained details and textural authenticity, thereby preventing the generation of patterned or overly synthetic textures and improving overall clarity. Furthermore, we present a new landmark-based evaluation metric. Based on affine transformation principles, this metric is more accurate than conventional L2 distance calculations for facial feature alignment. Leveraging these contributions within a one-step diffusion model framework, HonestFace delivers exceptional restoration results in terms of facial fidelity and realism. Extensive experiments demonstrate that our approach surpasses existing state-of-the-art methods, achieving superior performance in both visual quality and quantitative assessments. The code and pre-trained models will be made publicly available at https://github.com/jkwang28/HonestFace.
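
The abstract contrasts an affine-based landmark metric with plain L2 distance but does not define it. The sketch below is only one plausible reading, not the paper's formula: it assumes the metric first fits a least-squares affine map from the restored face's landmarks to the reference landmarks and then measures the remaining error, so that global shift, scale, and shear do not count as shape error. All function names (`solve3`, `fit_affine`, `mean_l2`, `affine_landmark_distance`) are hypothetical.

```python
import math


def solve3(m, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate landmarks (collinear or too few)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                for c in range(col, 4):
                    a[r][c] -= f * a[col][c]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares 2D affine map from src landmarks to dst landmarks.

    Returns (wx, wy), each [a, b, c], meaning x' = a*x + b*y + c.
    """
    ata = [[0.0] * 3 for _ in range(3)]  # A^T A for rows [x, y, 1]
    atb_x = [0.0] * 3                    # A^T u (target x coordinates)
    atb_y = [0.0] * 3                    # A^T v (target y coordinates)
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atb_x[i] += row[i] * u
            atb_y[i] += row[i] * v
    return solve3(ata, atb_x), solve3(ata, atb_y)


def mean_l2(p, q):
    """Conventional mean Euclidean distance between paired landmarks."""
    return sum(math.dist(a, b) for a, b in zip(p, q)) / len(p)


def affine_landmark_distance(pred, ref):
    """ASSUMED metric: mean L2 error left after the best affine alignment."""
    wx, wy = fit_affine(pred, ref)
    aligned = [
        (wx[0] * x + wx[1] * y + wx[2], wy[0] * x + wy[1] * y + wy[2])
        for x, y in pred
    ]
    return mean_l2(aligned, ref)
```

Under this assumption, landmarks that differ from the reference only by a global shift and scale give a large plain L2 distance but an affine distance near zero, while a genuine change in facial geometry remains visible in both.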

Paper Structure

This paper contains 16 sections, 15 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Overall training pipeline of HonestFace. First, the LQ input $x_L$ is encoded into $z_L$ by the VAE encoder $E_\phi$. Meanwhile, $x_L$ and the HQ references $R = \{r_i\}$ pass through the IDE and VRE, and their outputs are fused to form the prompt embedding $p$. Next, the UNet predicts $\varepsilon_\theta$ to estimate $\hat{z}_H$. Finally, the VAE decoder $D_\phi$ reconstructs the output $\hat{x}_H$. The generator and discriminator are trained alternately.
  • Figure 2: Facial feature extractor.
  • Figure 3: Illustration of face identity encoder.
  • Figure 4: Masked face alignment.
  • Figure 5: Visual comparison on CelebHQRef-Test. Please zoom in for a better view.
  • ...and 2 more figures
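
The caption of Figure 1 says the UNet predicts $\varepsilon_\theta$ to estimate $\hat{z}_H$ in a single step, but the formula is not given here. Assuming the standard DDPM noise-prediction parameterization, with a fixed timestep $t$ and cumulative noise schedule $\bar{\alpha}_t$ (both introduced here for illustration and not taken from the source), the one-step estimate would take roughly this form:

$$z_L = E_\phi(x_L), \qquad \hat{z}_H = \frac{z_L - \sqrt{1-\bar{\alpha}_t}\,\varepsilon_\theta(z_L;\, p,\, t)}{\sqrt{\bar{\alpha}_t}}, \qquad \hat{x}_H = D_\phi(\hat{z}_H)$$

Read this way, the LQ latent $z_L$ plays the role of the noisy sample, and the prompt embedding $p$ fused from the IDE and VRE conditions the single denoising step.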