SRGS: Super-Resolution 3D Gaussian Splatting

Xiang Feng, Yongbo He, Yubo Wang, Yan Yang, Wen Li, Yifei Chen, Zhenzhong Kuang, Jiajun Ding, Jianping Fan, Yu Jun

TL;DR

The paper tackles high-resolution novel view synthesis with a 3D Gaussian Splatting framework by introducing SRGS, which densifies Gaussian primitives in high-resolution space and leverages a pretrained 2D super-resolution model to learn faithful textures. It combines a sub-pixel constraint from LR views with a texture-guided learning signal to produce denser, texture-rich primitives that approach HR ground-truth quality. Empirical results on Synthetic NeRF, Tanks & Temples, and Mip-NeRF 360 demonstrate that SRGS outperforms prior methods and narrows the gap to HR-3DGS, while ablation studies validate the effectiveness of densification and external texture priors. The approach offers a practical path to HRNVS using only LR data, with potential limitations tied to the quality of the 2D SR priors and future work aimed at reducing reliance on external models.

Abstract

Recently, 3D Gaussian Splatting (3DGS) has gained popularity as a novel explicit 3D representation. This approach relies on the representation power of Gaussian primitives to provide high-quality rendering. However, primitives optimized at low resolution inevitably exhibit sparsity and texture deficiency, posing a challenge for achieving high-resolution novel view synthesis (HRNVS). To address this problem, we propose Super-Resolution 3D Gaussian Splatting (SRGS) to perform the optimization in a high-resolution (HR) space. The sub-pixel constraint is introduced for the increased viewpoints in HR space, exploiting the sub-pixel cross-view information of the multiple low-resolution (LR) views. The gradient accumulated from more viewpoints facilitates the densification of primitives. Furthermore, a pre-trained 2D super-resolution model is integrated with the sub-pixel constraint, enabling these dense primitives to learn faithful texture features. In general, our method focuses on densification and texture learning to effectively enhance the representation ability of primitives. Experimentally, our method achieves high rendering quality on HRNVS with only LR inputs, outperforming state-of-the-art methods on challenging datasets such as Mip-NeRF 360 and Tanks & Temples. Related code will be released upon acceptance.
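
The two supervision signals described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the sub-pixel constraint compares an average-pooled HR rendering against the LR view, and that the texture term compares the HR rendering against the output of the 2D super-resolution model, both with a plain L1 distance. The function names, the `tex_weight` parameter, and its default value are hypothetical.

```python
import numpy as np


def avg_pool_downsample(img, scale):
    """Average each non-overlapping scale x scale block of an (H, W, C) image."""
    h, w, c = img.shape
    if h % scale or w % scale:
        raise ValueError("image height and width must be divisible by scale")
    # Split rows and columns into (block index, offset within block), then
    # average over the two offset axes.
    blocks = img.reshape(h // scale, scale, w // scale, scale, c)
    return blocks.mean(axis=(1, 3))


def l1(a, b):
    """Mean absolute difference between two arrays of the same shape."""
    return float(np.abs(a - b).mean())


def srgs_style_loss(hr_render, lr_view, sr_reference, scale, tex_weight=0.5):
    """Blend a sub-pixel term and a texture term (assumed form, hypothetical weight)."""
    # Sub-pixel term: the HR rendering, pooled back to LR, should match the LR view.
    sub_pixel = l1(avg_pool_downsample(hr_render, scale), lr_view)
    # Texture term: the HR rendering should match the 2D SR model's reference view.
    texture = l1(hr_render, sr_reference)
    total = (1.0 - tex_weight) * sub_pixel + tex_weight * texture
    return total, sub_pixel, texture


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hr_render = rng.random((8, 8, 3))           # stand-in for an HR rendering
    lr_view = avg_pool_downsample(hr_render, 4)  # stand-in for the LR ground truth
    sr_reference = rng.random((8, 8, 3))         # stand-in for the SR model output
    print(srgs_style_loss(hr_render, lr_view, sr_reference, scale=4))
```

In this sketch each LR pixel constrains a whole block of HR pixels, which is one way to read "supervising the increased sampling points in HR space"; the texture term then supplies detail that the pooled comparison alone cannot pin down.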

Paper Structure

This paper contains 17 sections, 12 equations, 9 figures, and 4 tables.

Figures (9)

  • Figure 1: Our method significantly enhances the representation power of Gaussian primitives, achieving a rendering quality close to HR-3DGS.
  • Figure 2: Overview of the proposed method. SRGS is composed of Super-Resolution Gaussian Densification (SRGD) and Texture-Guided Gaussian Learning (TGGL). In SRGD, the radiance field optimization is performed in the HR space through the super-splatting method. The sub-pixel constraint $\mathcal{L}_{sp}$ supervises the increased sampling points in the HR space. The gradient from more sampling points promotes Gaussian densification through cloning and splitting. In TGGL, a 2D SR model $\mathcal{M}_{s}$ is utilized to generate HR reference views $\mathcal{I}_{ref}$, providing detailed textures absent from $\mathcal{I}_{LR}$. With the joint optimization of $\mathcal{L}_{sp}$ and $\mathcal{L}_{tex}$, the Gaussian primitives tend to learn faithful texture features from the external priors.
  • Figure 3: Even if a large primitive is well trained in LR space, it tends to cause blurring artifacts when upscaled to HR space. To preserve fine-grained details at high resolutions, it is crucial to use more primitives during the reconstruction process.
  • Figure 4: Qualitative comparison of the HRNVS ($\times 4$) on the Synthetic NeRF and Tanks & Temples datasets. The results are the zoom-in version of the green box region. SRGS (Ours) shows clearer details than TensoRF, NeRF-SR, 3DGS, and Mip-Splatting.
  • Figure 5: Qualitative comparison of the HRNVS ($\times 8$) on the Mip-NeRF 360 dataset. Our method shows clearer details than Mip-NeRF 360, Zip-NeRF, 3DGS (baseline) and Mip-Splatting.
  • ...and 4 more figures