SuperGS: Super-Resolution 3D Gaussian Splatting Enhanced by Variational Residual Features and Uncertainty-Augmented Learning

Shiyun Xie; Zhiru Wang; Xu Wang; Yinghao Zhu; Chengwei Pan; Xiwang Dong

TL;DR

SuperGS tackles the challenge of high-resolution novel view synthesis from low-resolution inputs by extending 3D Gaussian Splatting with a two-stage coarse-to-fine training regime. A latent feature field learned in the coarse stage initializes a fine-stage optimization that attaches variational residual features to Gaussian primitives, enabling high-frequency detail while capturing uncertainty. Multi-view joint learning and uncertainty-guided losses mitigate ambiguities from inconsistent pseudo-labels and guide densification toward faithful reconstruction. Extensive experiments on real-world and synthetic data show that SuperGS achieves superior HRNVS quality and reduces artifacts compared to prior 3DGS and SR-based methods, with the added benefit of providing uncertainty estimates for robust rendering.

Abstract

Recently, 3D Gaussian Splatting (3DGS) has excelled in novel view synthesis (NVS) with its real-time rendering capabilities and superior quality. However, it faces challenges in high-resolution novel view synthesis (HRNVS) due to the coarse nature of primitives derived from low-resolution input views. To address this issue, we propose Super-Resolution 3DGS (SuperGS), an extension of 3DGS designed with a two-stage coarse-to-fine training framework. In this framework, we use a latent feature field to represent the low-resolution scene, serving as both the initialization and foundational information for super-resolution optimization. Additionally, we introduce variational residual features to enhance high-resolution details, using their variance as uncertainty estimates to guide the densification process and loss computation. Furthermore, a multi-view joint learning approach helps mitigate ambiguities caused by multi-view inconsistencies in the pseudo labels. Extensive experiments demonstrate that SuperGS surpasses state-of-the-art HRNVS methods on both real-world and synthetic datasets using only low-resolution inputs. Code is available at https://github.com/SYXieee/SuperGS.
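To make the idea of a variational residual feature more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each Gaussian stores a per-channel mean and log-variance for its residual, draws the residual with the standard reparameterization trick, adds it to the coarse-stage feature, and uses the mean channel variance as a scalar uncertainty. The function names, the log-variance parameterization, and the thresholding rule for densification are all hypothetical; the abstract only states that the variance serves as an uncertainty estimate guiding densification and loss computation.

```python
import math
import random


def sample_residual(mean, log_var, rng):
    """Reparameterized sample: mean + sigma * eps with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def uncertainty(log_var):
    """Scalar uncertainty (assumed): variance averaged over channels."""
    return sum(math.exp(lv) for lv in log_var) / len(log_var)


def fine_feature(coarse_feature, mean, log_var, rng):
    """Fine-stage feature = coarse-stage feature + sampled residual."""
    residual = sample_residual(mean, log_var, rng)
    return [c + r for c, r in zip(coarse_feature, residual)]


def select_for_densification(log_vars, threshold):
    """Hypothetical rule: flag Gaussians whose uncertainty exceeds a threshold."""
    return [i for i, lv in enumerate(log_vars) if uncertainty(lv) > threshold]


if __name__ == "__main__":
    rng = random.Random(0)
    coarse = [0.2, -0.1, 0.4]
    mean = [0.0, 0.05, -0.02]
    log_var = [-4.0, -3.0, -5.0]
    print(fine_feature(coarse, mean, log_var, rng))
    print(uncertainty(log_var))
```

In this sketch a Gaussian with a large learned variance is treated as poorly constrained by the pseudo labels, so it is the kind of primitive a density-control step would revisit first.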

Paper Structure (23 sections, 15 equations, 6 figures, 3 tables)

Introduction
Related Work
Novel View Synthesis
3D Scene Super-Resolution
Methodology
Coarse-stage Training
Fine-stage Training
Variational Residual Feature
Multi-view Joint Learning
Uncertainty-guided Density Control
Uncertainty-guided Loss Function
Experiments
Experimental Setups
Datasets and Metrics
Baselines
...and 8 more sections

Figures (6)

Figure 1: Comparison of 3DGS and SRGS on the HRNVS task. 3DGS suffers from erosion effects, and SRGS exhibits significant artifacts, while our method produces high-fidelity synthesized views with enhanced detail preservation.
Figure 2: Overview of our proposed SuperGS. We propose a two-stage coarse-to-fine framework that first optimizes the scene with low-resolution views as initialization for super-resolution. In the coarse-stage, we introduce a latent feature field in place of the conventional 3DGS pipeline. In the fine-stage, we attach variational residual features to each Gaussian to enhance details and model uncertainty. Additionally, multi-view joint learning and uncertainty help mitigate ambiguities from pseudo labels, improving reconstruction quality.
Figure 3: Illustration of Feature Field. For a specific Gaussian, we identify its voxel across $L$ resolution levels, extracting feature vectors of all voxel vertices from the corresponding hash table and deriving the $l$-th level feature via linear interpolation based on the Gaussian's center. We concatenate these features with SH encoding of direction, and a small MLP is used to decode the view-dependent color, which is finally rendered into an RGB image through a differentiable rasterizer.
Figure 4: Qualitative comparison of the HRNVS ($\times 4$) on real-world datasets. We highlight the difference with colored patches.
Figure 5: Qualitative comparison of the HRNVS ($\times 4$) on the synthetic dataset. We highlight the difference with colored patches.
...and 1 more figures
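
The feature-field lookup described in the Figure 3 caption can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the paper's code: the spatial hash (XOR of coordinates multiplied by the primes used in Instant-NGP), the base resolution, the growth factor, and all function names are assumptions. The sketch covers only the per-level vertex lookup and linear interpolation at the Gaussian's center; the SH encoding of the view direction and the small MLP decoder are omitted.

```python
import math
import random

# Spatial-hash primes as used in Instant-NGP (an assumption for this sketch).
PRIMES = (1, 2654435761, 805459861)


def hash_vertex(ix, iy, iz, table_size):
    """Map an integer voxel vertex to a hash-table slot."""
    return ((ix * PRIMES[0]) ^ (iy * PRIMES[1]) ^ (iz * PRIMES[2])) % table_size


def make_tables(num_levels, table_size, feat_dim, seed=0):
    """One table of small random feature vectors per resolution level."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1e-4, 1e-4) for _ in range(feat_dim)]
             for _ in range(table_size)]
            for _ in range(num_levels)]


def level_feature(center, table, resolution):
    """Trilinearly interpolate the 8 vertex features of the enclosing voxel.

    `center` is the Gaussian's center, assumed normalized to [0, 1]^3.
    """
    scaled = [c * resolution for c in center]
    base = [min(int(math.floor(s)), resolution - 1) for s in scaled]
    frac = [s - b for s, b in zip(scaled, base)]
    feat = [0.0] * len(table[0])
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = ((frac[0] if dx else 1.0 - frac[0])
                          * (frac[1] if dy else 1.0 - frac[1])
                          * (frac[2] if dz else 1.0 - frac[2]))
                slot = hash_vertex(base[0] + dx, base[1] + dy, base[2] + dz,
                                   len(table))
                for k, value in enumerate(table[slot]):
                    feat[k] += weight * value
    return feat


def gaussian_feature(center, tables, base_res=16, growth=2.0):
    """Concatenate the interpolated features from all L levels."""
    out = []
    for level, table in enumerate(tables):
        resolution = int(base_res * growth ** level)
        out.extend(level_feature(center, table, resolution))
    return out


if __name__ == "__main__":
    tables = make_tables(num_levels=4, table_size=1024, feat_dim=2)
    print(len(gaussian_feature((0.3, 0.7, 0.5), tables)))
```

In a full pipeline, the concatenated vector would be joined with the direction encoding and passed to the decoder to obtain view-dependent color before rasterization.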
