Learning High-Quality Initial Noise for Single-View Synthesis with Diffusion Models

Zhihao Zhang; Xuejun Yang; Weihua Liu; Mouquan Shen

TL;DR

The paper addresses the sensitivity of diffusion-based single-view novel view synthesis (NVS) to its initial noise by learning to produce high-quality noise. It introduces an inference-inversion pipeline that injects semantic information into the initial noise, and trains a lightweight encoder-decoder network (EDN) to map random noise to high-quality noise. The EDN plugs into pretrained NVS models without architectural changes. Using a diffusion-model-driven noise collection and filtering stage, the method improves multi-view consistency and detail for SV3D and MV-Adapter on multiple datasets, with negligible inference overhead. NVS performance thus improves without fine-tuning the diffusion model, which makes the approach a practical add-on for diffusion-based 3D view synthesis.

Abstract

Single-view novel view synthesis (NVS) models based on diffusion models have recently attracted increasing attention, as they can generate a series of novel view images conditioned on a single image prompt and camera pose information. It has been observed that, in diffusion models, certain high-quality initial noise patterns lead to better generation results than others. However, there is still no dedicated learning framework that enables NVS models to learn such high-quality noise. To obtain high-quality initial noise from random Gaussian noise, we make the following contributions. First, we design a discretized Euler inversion method to inject image semantic information into random noise, thereby constructing paired datasets of random and high-quality noise. Second, we propose a learning framework based on an encoder-decoder network (EDN) that directly transforms random noise into high-quality noise. Experiments demonstrate that the proposed EDN can be seamlessly plugged into various NVS models, such as SV3D and MV-Adapter, achieving significant performance improvements across multiple datasets. Code is available at: https://github.com/zhihao0512/EDN.
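To make the inference-then-inversion idea concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes an EDM-style probability-flow ODE, dx/dσ = (x − D(x, σ)) / σ, stepped with plain Euler updates, and it replaces the pretrained NVS model with a hypothetical closed-form denoiser (`toy_denoiser`) for Gaussian toy data. All function names, the linear σ schedule, and the parameter values are illustrative assumptions; the paper's own discretization and its noise filtering stage are not reproduced here.

```python
import math
import random

def toy_denoiser(x, sigma, mu, s=0.5):
    """Ideal denoiser for toy data N(mu, s^2 I).

    Hypothetical stand-in for a pretrained NVS diffusion model.
    """
    w = s * s / (s * s + sigma * sigma)
    return [w * xi + (1.0 - w) * mi for xi, mi in zip(x, mu)]

def euler_pass(x, sigmas, mu):
    """Euler steps of dx/dsigma = (x - D(x, sigma)) / sigma along `sigmas`.

    A decreasing schedule denoises (inference); an increasing one inverts.
    """
    x = list(x)
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = toy_denoiser(x, s_cur, mu)
        slope = [(xi - di) / s_cur for xi, di in zip(x, denoised)]
        x = [xi + (s_next - s_cur) * di for xi, di in zip(x, slope)]
    return x

def make_noise_pair(mu, sigma_max=10.0, sigma_min=0.02, steps=50, seed=0):
    """Return (random noise, generated sample, inverted noise) for one target."""
    rng = random.Random(seed)
    # Assumed linear schedule from sigma_max down to sigma_min (kept > 0).
    sigmas = [sigma_max + (sigma_min - sigma_max) * i / steps
              for i in range(steps + 1)]
    random_noise = [sigma_max * rng.gauss(0.0, 1.0) for _ in mu]
    sample = euler_pass(random_noise, sigmas, mu)      # inference: noise -> sample
    inverted = euler_pass(sample, sigmas[::-1], mu)    # inversion: sample -> noise
    return random_noise, sample, inverted
```

In this toy setting, `(random_noise, inverted)` plays the role of one training pair: the inverted noise has passed through the model and therefore depends on the target, whereas the random noise does not. A network such as the EDN would then be trained to map the first element of each pair to the second.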
