GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors

Xiqian Yu, Hanxin Zhu, Tianyu He, Zhibo Chen

TL;DR

GaussianSR addresses high-resolution novel view synthesis (HRNVS) from only low-resolution inputs by distilling 2D diffusion priors into 3D Gaussian Splatting (3DGS). The method uses Score Distillation Sampling (SDS) to guide high-resolution 3DGS optimization. Because SDS introduces stochastic disturbances that produce redundant Gaussian primitives, two strategies mitigate the effect: diffusion timestep annealing and Gaussian dropout. Experiments on Blender, Mip-NeRF 360, and Deep Blending report better PSNR, SSIM, and LPIPS than state-of-the-art baselines, with competitive rendering speed. Overall, GaussianSR enables high-quality HRNVS from low-resolution data with fast rendering and shows the potential of integrating 2D diffusion priors into 3D representations.

Abstract

Achieving high-resolution novel view synthesis (HRNVS) from low-resolution input views is a challenging task due to the lack of high-resolution data. Previous methods optimize a high-resolution Neural Radiance Field (NeRF) from low-resolution input views but suffer from slow rendering speed. In this work, we base our method on 3D Gaussian Splatting (3DGS) due to its capability of producing high-quality images at a faster rendering speed. To alleviate the shortage of data for higher-resolution synthesis, we propose to leverage off-the-shelf 2D diffusion priors by distilling the 2D knowledge into 3D with Score Distillation Sampling (SDS). Nevertheless, applying SDS directly to Gaussian-based 3D super-resolution leads to undesirable and redundant 3D Gaussian primitives, due to the randomness brought by generative priors. To mitigate this issue, we introduce two simple yet effective techniques to reduce stochastic disturbances introduced by SDS. Specifically, we 1) shrink the range of diffusion timesteps in SDS with an annealing strategy; 2) randomly discard redundant Gaussian primitives during densification. Extensive experiments demonstrate that our proposed GaussianSR can attain high-quality results for HRNVS with only low-resolution inputs on both synthetic and real-world datasets. Project page: https://chchnii.github.io/GaussianSR/
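
The two techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear schedule, the bounds `t_min`, `t_max_start`, and `t_max_end`, the dropout probability, and all function names are hypothetical choices made for illustration, since the abstract only states that the timestep range shrinks under an annealing strategy and that redundant primitives are randomly discarded during densification.

```python
import random


def annealed_timestep_range(step, total_steps,
                            t_min=20, t_max_start=980, t_max_end=200):
    """Return the (low, high) diffusion timestep range at a training step.

    Assumption: the upper bound shrinks linearly from t_max_start to
    t_max_end. The actual schedule used by GaussianSR may differ.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    t_max = round(t_max_start + frac * (t_max_end - t_max_start))
    return t_min, max(t_max, t_min)


def sample_timestep(step, total_steps, rng):
    """Sample an SDS timestep uniformly from the annealed range."""
    low, high = annealed_timestep_range(step, total_steps)
    return rng.randint(low, high)  # inclusive on both ends


def gaussian_dropout(new_primitive_ids, drop_prob, rng):
    """Randomly discard newly densified primitives.

    Assumption: each primitive created by cloning or splitting is
    dropped independently with probability drop_prob.
    """
    return [i for i in new_primitive_ids if rng.random() >= drop_prob]


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 5000, 10000):
        print(step, annealed_timestep_range(step, 10000))
    kept = gaussian_dropout(list(range(10)), drop_prob=0.3, rng=rng)
    print("kept primitives:", kept)
```

In a training loop, `sample_timestep` would supply the noise level for each SDS update, and `gaussian_dropout` would filter the primitives produced by each densification pass before they are added to the scene.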

Paper Structure (30 sections, 5 equations, 12 figures, 7 tables)

Figures (12)

  • Figure 1: Overview of GaussianSR. To alleviate the lack of high-resolution data, we synthesize high-resolution novel views by distilling 2D diffusion priors into a 3D representation with SDS (Sec. 3.1). Since redundant Gaussian primitives are introduced by the randomness of generative priors (Sec. 3.2), we propose Gaussian Dropout and diffusion timestep annealing to reduce stochastic disturbance (Sec. 3.3).
  • Figure 2: (a) The gradient values under the constraint of SDS loss are visualized, revealing substantial variance across different diffusion timesteps $t$. (b) When comparing the gradient values under two different constraints—SDS loss on the left and MSE loss on the right—the gradient variance for SDS is significantly larger than that for MSE.
  • Figure 3: Illustration of Gaussian Dropout during the densification process. When a small-scale object (depicted by the black outline) is insufficiently covered (under-reconstructed) or is represented by overly large splats (over-reconstructed), cloning or splitting is performed. In the top row (without dropout), a redundant Gaussian primitive (shown in green) is generated during densification. In the bottom row (with dropout), the redundant Gaussian primitive is randomly discarded.
  • Figure 4: Qualitative comparison of HRNVS ($\times4$) on the Blender dataset. Our method shows clearer details than 3DGS [kerbl3Dgaussians], Bicubic, NeRF-SR [wang2022nerf], and StableSR [wang2023exploiting].
  • Figure 5: Qualitative comparison of our method with vanilla 3DGS, bicubic interpolation, and StableSR on the Mip-NeRF 360 and Deep Blending datasets for HRNVS ($\times4$). The results are zoomed-in views of the red box region, and the PSNR value for the current view is shown in the top right corner. Our method presents higher quality and clearer details than the others.
  • ...and 7 more figures