You Only Need One Stage: Novel-View Synthesis From A Single Blind Face Image

Taoyue Wang; Xiang Zhang; Xiaotian Li; Huiyuan Yang; Lijun Yin

TL;DR

NVB-Face is a novel one-stage method that generates consistent Novel-View images directly from a single Blind Face image. Leveraging the generative capacity of a diffusion model, it synthesizes high-quality, consistent novel-view face images.

Abstract

We propose a novel one-stage method, NVB-Face, for generating consistent Novel-View images directly from a single Blind Face image. Existing approaches to novel-view synthesis for objects or faces typically require a high-resolution RGB image as input. When dealing with degraded images, the conventional pipeline follows a two-stage process: first restoring the image to high resolution, then synthesizing novel views from the restored result. However, this approach is highly dependent on the quality of the restored image, often leading to inaccuracies and inconsistencies in the final output. To address this limitation, we extract single-view features directly from the blind face image and introduce a feature manipulator that transforms these features into 3D-aware, multi-view latent representations. Leveraging the powerful generative capacity of a diffusion model, our framework synthesizes high-quality, consistent novel-view face images. Experimental results show that our method significantly outperforms traditional two-stage approaches in both consistency and fidelity.

Paper Structure (27 sections, 9 equations, 11 figures, 3 tables)

Introduction
Related Work
Blind Image Restoration.
Novel-View Face Synthesis.
Method
Image Restoration
Novel View Synthesis
3D Feature Construction Model.
2D Features Sampling and Aggregation.
Loss Functions
Experiments
Experimental Settings
Datasets.
Training details.
Qualitative Comparisons
...and 12 more sections

Figures (11)

Figure 1: We compare our method with typical two-stage pipelines, such as CodeFormer [zhou2022towards] + PanoHead-PTI [an2023panohead], which first restore the degraded image and then synthesize novel views. When the restoration stage fails to recover accurate details, these errors are further amplified during novel-view synthesis, leading to results that deviate significantly from the original identity and appearance. In contrast, our method generates novel views in a single stage directly from the low-quality input. This end-to-end design suppresses error accumulation, resulting in more reliable and faithful novel-view images.
Figure 2: An overview of the proposed NVB-Face architecture. (a) Our first training step focuses solely on image restoration. (b) In the second training step, we update only the parameters of the newly introduced modules (highlighted in dark green), keeping the rest of the network frozen. After training, this two-step process forms our complete inference pipeline.
Figure 3: Qualitative comparisons on the NeRSemble [kirschstein2023nersemble] dataset. As shown in our results, this end-to-end strategy achieves superior perceptual quality and preserves identity and expression information more effectively than two-stage methods, minimizing the loss of critical facial attributes.
Figure 4: Qualitative comparisons on the LFW-Test [huang2008labeled] dataset. Our method produces consistently stable results across varying levels of input degradation. Compared to other approaches, our generated images preserve the most information from the original input and exhibit higher visual realism, even under severe degradation.
Figure 5: Qualitative ablation study on the LFW-Test dataset, comparing our method with and without the feature loss.
...and 6 more figures
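
The Figure 2 caption describes a two-step training schedule in which the second step updates only the newly introduced modules while the rest of the network stays frozen. The following is a minimal, framework-free sketch of that freezing idea, not the authors' implementation. The classes `Param` and `Module`, the function `sgd_step`, the module names `restoration` and `new_modules`, and the placeholder gradients are all hypothetical and chosen only for illustration. In a framework such as PyTorch, the same effect is typically obtained by calling `requires_grad_(False)` on the frozen modules.

```python
from dataclasses import dataclass, field


@dataclass
class Param:
    """A single scalar parameter with a trainable flag (toy stand-in for a tensor)."""
    value: float
    grad: float = 0.0
    trainable: bool = True


@dataclass
class Module:
    """A named group of parameters (hypothetical, for illustration only)."""
    name: str
    params: list = field(default_factory=list)

    def freeze(self):
        # Mark every parameter in this module as non-trainable.
        for p in self.params:
            p.trainable = False


def sgd_step(modules, lr=0.1):
    """Apply one plain SGD update, skipping every frozen parameter."""
    for m in modules:
        for p in m.params:
            if p.trainable:
                p.value -= lr * p.grad


# Assumed names: "restoration" stands for the network trained in step (a);
# "new_modules" stands for the modules introduced in step (b).
restoration = Module("restoration", [Param(1.0), Param(2.0)])
new_modules = Module("new_modules", [Param(0.5)])

# Step (b): freeze what was trained in step (a), then update only the new modules.
restoration.freeze()
for m in (restoration, new_modules):
    for p in m.params:
        p.grad = 1.0  # placeholder gradient, not a real loss gradient

sgd_step([restoration, new_modules], lr=0.1)

frozen_values = [p.value for p in restoration.params]
updated_values = [p.value for p in new_modules.params]
```

After the update, the frozen parameters keep their step (a) values and only the parameter in `new_modules` has moved, which is the behavior the caption attributes to the second training step.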
