EcoSplat: Efficiency-controllable Feed-forward 3D Gaussian Splatting from Multi-view Images

Jongmin Park, Minh-Quan Viet Bui, Juan Luis Gonzalez Bello, Jaeho Moon, Jihyong Oh, Munchurl Kim

TL;DR

EcoSplat tackles the challenge of real-time, high-quality novel view synthesis from multi-view images under explicit efficiency constraints. It introduces a two-stage, feed-forward 3D Gaussian Splatting framework that first learns pixel-aligned Gaussians (Pixel-aligned Gaussian Training, PGT) and then finetunes them to be aware of the target primitive count K through Importance-aware Gaussian Finetuning (IGF) combined with Progressive Learning on Gaussian Compaction (PLGC), enabling top-K Gaussian selection at inference. The method allocates per-view primitives based on view importance and high-frequency content, which lets it render robustly with far fewer Gaussians than prior methods and outperform state-of-the-art baselines on RealEstate10K and ACID. This explicit primitive-budget control makes deployment on resource-constrained devices and in flexible downstream rendering tasks more practical.

Abstract

Feed-forward 3D Gaussian Splatting (3DGS) enables efficient one-pass scene reconstruction, providing 3D representations for novel view synthesis without per-scene optimization. However, existing methods typically predict pixel-aligned primitives per view, producing an excessive number of primitives in dense-view settings and offering no explicit control over the number of predicted Gaussians. To address this, we propose EcoSplat, the first efficiency-controllable feed-forward 3DGS framework that adaptively predicts the 3D representation for any given target primitive count at inference time. EcoSplat adopts a two-stage optimization process. The first stage is Pixel-aligned Gaussian Training (PGT), where our model learns initial primitive prediction. The second stage is Importance-aware Gaussian Finetuning (IGF), where our model learns to rank primitives and adaptively adjust their parameters based on the target primitive count. Extensive experiments across multiple dense-view settings show that EcoSplat is robust and outperforms state-of-the-art methods under strict primitive-count constraints, making it well-suited for flexible downstream rendering tasks.
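
To make the dense-view problem concrete, the following sketch counts the primitives a pixel-aligned predictor would emit if every input pixel yields exactly one Gaussian. The 256x256 resolution, the view counts other than 24, and the helper name `pixel_aligned_count` are assumptions for illustration only; they are not taken from the paper.

```python
def pixel_aligned_count(num_views: int, height: int, width: int) -> int:
    """Gaussians emitted when each input pixel produces one primitive."""
    if num_views <= 0 or height <= 0 or width <= 0:
        raise ValueError("num_views, height and width must be positive")
    return num_views * height * width


# Assumed resolution (hypothetical): 256 x 256 pixels per input view.
for views in (2, 8, 24):
    total = pixel_aligned_count(views, 256, 256)
    print(f"{views:>2} views -> {total:,} pixel-aligned Gaussians")
```

Under these assumptions the count grows linearly with the number of views, which is why a fixed per-pixel prediction offers no handle on the primitive budget.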

Paper Structure

This paper contains 17 sections, 13 equations, 6 figures, 8 tables.

Figures (6)

  • Figure 1: (a) EcoSplat is a feed-forward 3D Gaussian Splatting framework that enables explicit control over the number of output primitives. (b) It consistently outperforms state-of-the-art methods across a wide range of target primitive counts.
  • Figure 2: Overview of EcoSplat. EcoSplat is trained in two stages: Pixel-aligned Gaussian Training (PGT) (Sec. \ref{sec:architecture}) and Importance-aware Gaussian Finetuning (IGF) (Sec. \ref{sec:IGF}). During IGF, the combination of the importance-aware opacity loss $\mathcal{L}_\text{io}$ and the Progressive Learning on Gaussian Compaction (PLGC) encourages EcoSplat to suppress the opacities of less important Gaussians. At inference, it adaptively satisfies an arbitrary user-specified primitive count and produces the optimal Gaussians in a feed-forward manner (Sec. \ref{sec:inference}).
  • Figure 3: Visual comparison of NVS under controlled numbers of primitives on the RE10K dataset \cite{zhou2018stereo} with 24 input views.
  • Figure 4: Visual results of ablation study: (a) the PGT stage; (b) the importance-aware opacity loss; and (c) the PLGC strategy.
  • Figure 5: Opacity distributions of pixel-aligned Gaussians across target primitive counts $K$. The blue and orange curves show the distributions when the target primitive count is set to 5% and 70% of the total number of pixel-aligned Gaussians, respectively. EcoSplat adaptively predicts Gaussian opacities based on the primitive budget and selects the top-$K$ Gaussians.
  • ...and 1 more figures
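
The budgeted selection described in the Figure 5 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the ranking key is the predicted per-Gaussian opacity (which the caption suggests but does not spell out), it uses random numbers in place of network outputs, and the helper names `budget_from_fraction` and `select_top_k` are hypothetical.

```python
import random
from typing import List, Sequence


def budget_from_fraction(total: int, fraction: float) -> int:
    """Turn a fractional budget (e.g. 0.05 for 5%) into a primitive count K."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return max(1, round(total * fraction))


def select_top_k(opacities: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-opacity Gaussians, highest first."""
    k = max(0, min(k, len(opacities)))
    order = sorted(range(len(opacities)),
                   key=lambda i: opacities[i], reverse=True)
    return order[:k]


# Stand-in opacities (assumption): a real model would predict these,
# conditioned on the target primitive count K.
random.seed(0)
opacities = [random.random() for _ in range(1000)]

for fraction in (0.05, 0.70):  # the two budgets shown in Figure 5
    k = budget_from_fraction(len(opacities), fraction)
    kept = select_top_k(opacities, k)
    print(f"budget {fraction:.0%}: keep {len(kept)} of {len(opacities)}")
```

In this sketch the opacities are fixed and only the cut-off changes with the budget; according to the caption, EcoSplat additionally re-predicts the opacities for each K before the top-$K$ cut, which the sketch does not model.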