EcoSplat: Efficiency-controllable Feed-forward 3D Gaussian Splatting from Multi-view Images
Jongmin Park, Minh-Quan Viet Bui, Juan Luis Gonzalez Bello, Jaeho Moon, Jihyong Oh, Munchurl Kim
TL;DR
EcoSplat tackles real-time, high-quality novel view synthesis from multi-view images under an explicit budget on the number of Gaussian primitives. It introduces a two-stage, feed-forward 3D Gaussian Splatting framework: Pixel-aligned Gaussian Training (PGT) first learns an initial pixel-aligned primitive prediction, and the model is then finetuned to be K-aware through an importance-aware mechanism and progressive compaction (IGF/PLGC), enabling top-K Gaussian selection at inference. The method allocates primitives per view according to view importance and high-frequency content, so it renders robustly with far fewer Gaussians than prior methods and outperforms state-of-the-art baselines on RealEstate10K and ACID. This explicit primitive-budget control makes deployment practical on resource-constrained devices and supports flexible downstream rendering tasks.
Abstract
Feed-forward 3D Gaussian Splatting (3DGS) enables efficient one-pass scene reconstruction, providing 3D representations for novel view synthesis without per-scene optimization. However, existing methods typically predict pixel-aligned primitives per view, producing an excessive number of primitives in dense-view settings and offering no explicit control over the number of predicted Gaussians. To address this, we propose EcoSplat, the first efficiency-controllable feed-forward 3DGS framework that adaptively predicts the 3D representation for any given target primitive count at inference time. EcoSplat adopts a two-stage optimization process. The first stage is Pixel-aligned Gaussian Training (PGT), where our model learns initial primitive prediction. The second stage is Importance-aware Gaussian Finetuning (IGF), where our model learns to rank primitives and adaptively adjust their parameters based on the target primitive count. Extensive experiments across multiple dense-view settings show that EcoSplat is robust and outperforms state-of-the-art methods under strict primitive-count constraints, making it well-suited for flexible downstream rendering tasks.
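To make the idea of a target primitive count concrete, the following is a minimal sketch of generic top-K selection over ranked primitives. It is not the authors' implementation: the `Gaussian` container, its `importance` field, and the helpers `select_top_k` and `pixel_aligned_count` are hypothetical names introduced only for illustration, and the sketch assumes each primitive already carries a scalar importance score. It also omits the parameter adjustment that the IGF stage is described as learning.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Gaussian:
    """Hypothetical container for one predicted primitive (illustration only)."""
    mean: Sequence[float]   # 3D center
    opacity: float
    importance: float       # assumed per-primitive score; not the paper's definition


def pixel_aligned_count(num_views: int, height: int, width: int) -> int:
    """Pixel-aligned prediction yields one primitive per pixel per input view."""
    return num_views * height * width


def select_top_k(gaussians: List[Gaussian], k: int) -> List[Gaussian]:
    """Keep the k highest-importance primitives, preserving their input order."""
    if k < 0:
        raise ValueError("k must be non-negative")
    k = min(k, len(gaussians))
    # sorted() is stable, so primitives with equal scores keep their input order.
    ranked = sorted(range(len(gaussians)), key=lambda i: -gaussians[i].importance)
    keep = sorted(ranked[:k])  # restore input order among the kept primitives
    return [gaussians[i] for i in keep]


if __name__ == "__main__":
    # Illustrative numbers only: 24 views at 256x256 give 1,572,864 primitives.
    print(pixel_aligned_count(24, 256, 256))

    scores = [0.1, 0.9, 0.5, 0.9]
    gs = [Gaussian((0.0, 0.0, 0.0), 1.0, s) for s in scores]
    print([g.importance for g in select_top_k(gs, 2)])
```

The count helper shows why dense-view settings are costly for pixel-aligned methods: the primitive count grows linearly with the number of views, whereas a top-K budget stays fixed regardless of how many views are supplied.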
