SkipGS: Post-Densification Backward Skipping for Efficient 3DGS Training

Jingxing Li, Yongjae Lee, and Deliang Fan

TL;DR

This work proposes SkipGS with a novel view-adaptive backward gating mechanism for efficient post-densification training, which reduces end-to-end training time by 23.1%, driven by a 42.0% reduction in post-densification time, with comparable reconstruction quality.

Abstract

3D Gaussian Splatting (3DGS) achieves real-time novel-view synthesis by optimizing millions of anisotropic Gaussians, yet its training remains expensive, with the backward pass dominating runtime in the post-densification refinement phase. We observe substantial update redundancy in this phase: many sampled views have near-plateaued losses and provide diminishing gradient benefits, but standard training still runs full backpropagation. We propose SkipGS with a novel view-adaptive backward gating mechanism for efficient post-densification training. SkipGS always performs the forward pass to update per-view loss statistics, and selectively skips backward passes when the sampled view's loss is consistent with its recent per-view baseline, while enforcing a minimum backward budget for stable optimization. On Mip-NeRF 360, compared to 3DGS, SkipGS reduces end-to-end training time by 23.1%, driven by a 42.0% reduction in post-densification time, with comparable reconstruction quality. Because it only changes when to backpropagate -- without modifying the renderer, representation, or loss -- SkipGS is plug-and-play and compatible with other complementary efficiency strategies for additive speedups.
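
The two headline percentages can be related with a rough back-of-the-envelope calculation. This is a sketch under an assumption the source does not state: that SkipGS leaves the time before densification stops unchanged, so all savings come from the post-densification phase. The symbol $f$ (the fraction of baseline wall-clock time spent after densification) is introduced here only for illustration.

```latex
% Assumption: pre-densification time is identical for 3DGS and SkipGS.
% f = hypothetical fraction of baseline training time spent post-densification.
\[
  \underbrace{0.420}_{\text{post-densification reduction}} \cdot f
  \;=\; \underbrace{0.231}_{\text{end-to-end reduction}}
  \quad\Longrightarrow\quad
  f \;=\; \frac{0.231}{0.420} \;=\; 0.55
\]
```

Under that assumption, roughly 55% of baseline training time would fall in the post-densification phase, which is the only part of training that SkipGS changes.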

Paper Structure

This paper contains 23 sections, 13 equations, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: SkipGS accelerates diverse 3DGS pipelines with negligible quality loss. Left: Qualitative comparison on the garden scene (Mip-NeRF 360; Barron et al., 2022). Baseline and +SkipGS renderings are visually indistinguishable. Right: PSNR vs. training time for all six baselines.
  • Figure 2: Profiling vanilla 3DGS training on the Kitchen scene (Mip-NeRF 360). (a) Per-iteration time breakdown: the backward pass dominates (${\sim}62\%$) after densification stops at $T_d{=}15$k, motivating backward-level acceleration. (b) Per-Gaussian gradient norms (blue, left axis) decrease ${\sim}2{\times}$ from early to late training and become nearly flat after $T_d$, while Adam update norms (red, right axis) remain comparatively stable (only ${\sim}1.2{\times}$ reduction overall) due to momentum inertia, suggesting many post-densification updates are weakly informative and can be reduced by selective backpropagation. In (b), both norms are normalized by their respective values at iteration $T_d{=}15$k for cross-quantity comparability.
  • Figure 3: Overview of SkipGS. Top: Training timeline. SkipGS activates after densification stops at $T_d$: a warmup window ($W$ iterations) initializes the per-view exponential moving average (EMA) baselines and calibrates the minimum backward budget $\rho_{\min}$, after which constrained backward gating begins. Bottom: Per-iteration decision flow during the backward gating phase. At each iteration, the forward pass is always executed to compute the loss and unconditionally update the per-view EMA $\bar{\mathcal{L}}_v^{(t)}$. A deviation score $s = \mathcal{L}_v^{(t)} / (\bar{\mathcal{L}}_v^{(t-1)} + \epsilon)$ measures whether the current loss exceeds the recent per-view baseline (Sec. \ref{sec:convergence-signal}). If $s > 1$, the backward pass is executed; otherwise, SkipGS proposes to skip. Before skipping, the budget controller checks whether the cumulative backward ratio $\rho_{\mathrm{cum}}$ has fallen below the auto-calibrated minimum $\rho_{\min}$, and forces backward execution if so (Sec. \ref{sec:budget}). Only when both checks allow skipping is the backward pass omitted.
  • Figure 4: Qualitative comparison across datasets and baselines. Each row shows a different dataset and baseline method. Despite substantial reductions in post-densification time, SkipGS produces visually indistinguishable results from the corresponding full-training baseline across all settings.
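
The per-iteration decision flow described in the Figure 3 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The class name `BackwardGate`, the EMA decay `beta`, and the fixed `rho_min` are assumptions: the source says $\rho_{\min}$ is auto-calibrated during a warmup window of $W$ iterations, which is simplified here to a constant, with the first visit to each view standing in for warmup. The exact EMA update rule is also assumed.

```python
from dataclasses import dataclass, field


@dataclass
class BackwardGate:
    """Illustrative view-adaptive backward gate (hypothetical names and defaults)."""

    rho_min: float = 0.5   # assumed constant; the source auto-calibrates this value
    beta: float = 0.9      # assumed EMA decay; not given in the source
    eps: float = 1e-8      # small constant in the deviation score denominator
    ema: dict = field(default_factory=dict)  # per-view loss baselines
    n_steps: int = 0       # iterations seen during the gating phase
    n_backward: int = 0    # iterations in which backward was executed

    def step(self, view_id: int, loss: float) -> bool:
        """Return True if the backward pass should run for this iteration."""
        baseline = self.ema.get(view_id)
        if baseline is None:
            # No baseline yet for this view: always backpropagate (stand-in for warmup).
            run_backward = True
        else:
            # Deviation score s = L_v^(t) / (EMA_v^(t-1) + eps).
            score = loss / (baseline + self.eps)
            run_backward = score > 1.0
            if not run_backward:
                # Budget check: force backward if the cumulative ratio fell below rho_min.
                rho_cum = self.n_backward / max(self.n_steps, 1)
                if rho_cum < self.rho_min:
                    run_backward = True

        # The EMA is updated unconditionally, whether or not backward runs.
        if baseline is None:
            self.ema[view_id] = loss
        else:
            self.ema[view_id] = self.beta * baseline + (1.0 - self.beta) * loss

        self.n_steps += 1
        self.n_backward += int(run_backward)
        return run_backward


if __name__ == "__main__":
    gate = BackwardGate(rho_min=0.5, beta=0.9)
    # One view whose loss plateaus and then jumps.
    for loss in [1.0, 0.9, 0.9, 0.9, 2.0]:
        print(loss, gate.step(view_id=0, loss=loss))
```

In a training loop, a gate like this would sit between the forward and backward passes: the loss is always computed, and the backward pass and optimizer step run only when `step` returns `True`. Whether the optimizer step is also skipped is an assumption here, since the source only says the backward pass is omitted.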