Adaptive Anchor Policies for Efficient 4D Gaussian Streaming

Ashim Dahal, Rabab Abdelfattah, Nick Rahimi

Abstract

Dynamic scene reconstruction with Gaussian Splatting has enabled efficient streaming for real-time rendering and free-viewpoint video. However, most pipelines rely on fixed anchor selection such as Farthest Point Sampling (FPS), typically using 8,192 anchors regardless of scene complexity, which over-allocates computation under strict budgets. We propose Efficient Gaussian Streaming (EGS), a plug-in, budget-aware anchor sampler that replaces FPS with a reinforcement-learned policy while keeping the Gaussian streaming reconstruction backbone unchanged. The policy jointly selects an anchor budget and a subset of informative anchors under discrete constraints, balancing reconstruction quality and runtime using spatial features of the Gaussian representation. We evaluate EGS in two settings: fast rendering, which prioritizes runtime efficiency, and high-quality refinement, which enables additional optimization. Experiments on dynamic multi-view datasets show consistent improvements in the quality-efficiency trade-off over FPS sampling. On unseen data, in fast rendering at 256 anchors ($32\times$ fewer than 8,192), EGS improves PSNR by $0.52$ to $0.61$ dB while running $1.29\times$ to $1.35\times$ faster than IGS@8192 on N3DV and MeetingRoom. In high-quality refinement, EGS remains competitive with the full-anchor baseline at substantially lower anchor budgets. Code and pretrained checkpoints will be released upon acceptance.

Keywords: 4D Gaussian Splatting, 4D Gaussian Streaming, Reinforcement Learning
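
For readers unfamiliar with the fixed baseline the abstract refers to, the following is a minimal sketch of greedy Farthest Point Sampling over a list of 3D positions. It is the textbook algorithm, not code from the paper or from IGS; the function name `farthest_point_sampling`, the `start` parameter, and the toy points are assumptions made for illustration.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices so that the chosen points are spread far apart.

    points: sequence of equal-length coordinate tuples (e.g. Gaussian centers).
    k:      number of anchors to keep (the "anchor budget").
    start:  index of the first anchor (an arbitrary choice in this sketch).
    """
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(points)")

    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected anchor.
    min_dist = [math.dist(p, points[start]) for p in points]

    while len(selected) < k:
        # The next anchor is the point farthest from all anchors chosen so far.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        # Refresh nearest-anchor distances against the newly added anchor.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy usage: four collinear points, keep three of them.
toy_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
anchors = farthest_point_sampling(toy_points, k=3)

# The budget reduction quoted in the abstract: 8,192 anchors down to 256.
reduction_factor = 8192 // 256
```

Note that FPS uses only geometry and a budget `k` fixed in advance, which is the behavior the abstract contrasts with a learned, budget-aware policy.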

Paper Structure

This paper contains 33 sections, 5 equations, 5 figures, and 7 tables.

Figures (5)

  • Figure 1: Fast Rendering Mode. Our RL policy selects anchors that match or improve quality with $32\times$ fewer anchors on unseen data. Left: qualitative comparison (Ours, IGS, 3DGStream). Right: N3DV PSNR vs time/frame trade-off.
  • Figure 2: Method overview of EGS. Our adaptive sampler selects a budgeted set of anchor Gaussians (stochastic during training; deterministic at inference) using point-MLP embeddings and a lightweight Transformer [vaswani2017attention, lee2019settransformer], then feeds the selected anchors to the frozen IGS pipeline for anchor-graph construction and rasterization [yan2025instant]. The training objective trades off sparsity, runtime, and PSNR against FPS-based teacher targets [qi2017pointnetplusplus, gonzalez1985clustering].
  • Figure 3: Training dynamics and analysis. Across training, the sampler learns stable budget control and converges under constraint-aware rewards; final checkpoints show positive $\Delta$PSNR vs IGS@8192 on seen and unseen splits. Training curves use stochastic actions and may differ from deterministic inference.
  • Figure 4: Qualitative comparison on N3DV and MeetingRoom scenes. We compare our method against IGS, 3DGStream, and ground truth across representative scenes.
  • Figure 5: Motion progression on MeetingRoom:Trimming. Reconstructions across frames demonstrating temporal consistency.
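
The Figure 2 caption describes anchor selection that is stochastic during training and deterministic at inference. The sketch below shows one generic way to get that behavior from per-Gaussian scores: plain top-k at inference, and Gumbel-perturbed top-k (sampling without replacement) during training. This is an assumption for illustration only; the text here does not state which sampling mechanism EGS uses, and `select_anchors`, `scores`, and `budget` are hypothetical names.

```python
import math
import random


def select_anchors(scores, budget, stochastic, rng=None):
    """Return `budget` indices chosen from per-Gaussian scores.

    scores:     hypothetical policy logits, higher means more informative.
    budget:     number of anchors to keep.
    stochastic: True  -> Gumbel top-k (random subset, biased toward high scores);
                False -> plain top-k (same subset every call).
    """
    if not 0 < budget <= len(scores):
        raise ValueError("budget must satisfy 1 <= budget <= len(scores)")

    if stochastic:
        rng = rng or random.Random()
        keys = []
        for s in scores:
            # Gumbel(0, 1) noise is -log(-log(U)) with U uniform on (0, 1);
            # the clamp avoids log(0) when the generator returns exactly 0.0.
            u = max(rng.random(), 1e-12)
            keys.append(s - math.log(-math.log(u)))
    else:
        keys = list(scores)

    # Rank indices by key, largest first, and keep the first `budget` of them.
    order = sorted(range(len(scores)), key=keys.__getitem__, reverse=True)
    return order[:budget]


# Toy usage with made-up scores for four Gaussians.
toy_scores = [0.1, 2.0, -1.0, 1.5]
inference_pick = select_anchors(toy_scores, budget=2, stochastic=False)
training_pick = select_anchors(toy_scores, budget=2, stochastic=True,
                               rng=random.Random(0))
```

Because the training-time subset is random while the inference-time subset is fixed, metrics logged during training can differ from those measured at inference, which is consistent with the caveat in the Figure 3 caption.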