SPAGS: Sparse-View Articulated Object Reconstruction from Single State via Planar Gaussian Splatting
Di Wu, Liu Liu, Xueyu Yuan, Qiaojun Yu, Wenxiao Chen, Ruilong Yan, Yiming Tang, Liangtu Song
TL;DR
This paper addresses high-fidelity reconstruction of articulated objects from sparse-view RGB inputs, a setting where dense multi-view capture is costly and often impractical. It introduces SPAGS, a framework that combines a Gaussian information field for selecting informative sparse viewpoints, planar Gaussian Splatting with coarse-to-fine optimization, few-shot diffusion refinement, and articulation modeling with part-aware Gaussian primitives to recover accurate part-level surfaces. Experiments on synthetic and real-world data show that SPAGS outperforms state-of-the-art sparse-view and two-state articulated-object methods in surface quality, novel-view synthesis, and joint estimation, while keeping training times reasonable. By reducing data requirements and supporting autonomous view selection and robust part-level representations, the work makes 3D reconstruction more practical for manipulation and robotics. The authors note limitations on transparent and very small objects and point to physically-based rendering and super-resolution as future work.
Abstract
Articulated objects are ubiquitous in daily environments, and their 3D reconstruction holds great significance across various fields. However, existing articulated object reconstruction methods typically require costly inputs such as multi-stage and multi-view observations. To address these limitations, we propose a category-agnostic articulated object reconstruction framework based on planar Gaussian Splatting, which uses only sparse-view RGB images from a single state. Specifically, we first introduce a Gaussian information field to select the optimal sparse viewpoints from candidate camera poses. We then compress 3D Gaussians into planar Gaussians to facilitate accurate estimation of normals and depth. The planar Gaussians are optimized in a coarse-to-fine manner through depth smoothness regularization and few-shot diffusion. Moreover, we introduce a part segmentation probability for each Gaussian primitive and update these probabilities by back-projecting the part segmentation masks of rendered views. Extensive experiments demonstrate that our method achieves higher-fidelity part-level surface reconstruction than existing methods on both synthetic and real-world data. Code will be made publicly available.
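To make the step of compressing 3D Gaussians into planar Gaussians more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes the convention common in planar splatting approaches: each Gaussian is flattened along its shortest axis, and that axis is taken as the surface normal. The function names `quat_to_rotmat` and `flatten_to_planar`, the `(w, x, y, z)` quaternion order, the `eps` thickness, and the camera-facing sign flip are all illustrative assumptions.

```python
import numpy as np


def quat_to_rotmat(q):
    """Convert (N, 4) quaternions in (w, x, y, z) order to (N, 3, 3) rotations."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    w, x, y, z = q[:, 0], q[:, 1], q[:, 2], q[:, 3]
    rows = [
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y),
        2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y),
    ]
    return np.stack(rows, axis=1).reshape(-1, 3, 3)


def flatten_to_planar(means, scales, quats, camera_center, eps=1e-6):
    """Flatten each Gaussian along its shortest axis (hypothetical helper).

    Returns the flattened scales (N, 3) and unit normals (N, 3). A normal is
    flipped only when it points away from the assumed camera center.
    """
    rot = quat_to_rotmat(quats)
    idx = np.arange(len(scales))
    shortest = np.argmin(scales, axis=1)   # per-Gaussian index of the thinnest axis
    flat_scales = scales.copy()
    flat_scales[idx, shortest] = eps       # collapse that axis to near-zero thickness
    normals = rot[idx, :, shortest]        # matching column of each rotation matrix
    to_camera = camera_center[None, :] - means
    facing = np.sum(normals * to_camera, axis=1)
    normals = normals * np.where(facing < 0, -1.0, 1.0)[:, None]
    return flat_scales, normals


# Usage: one axis-aligned Gaussian, one rotated 90 degrees about the x-axis.
means = np.zeros((2, 3))
scales = np.array([[1.0, 0.5, 0.01], [1.0, 1.0, 0.01]])
half = np.sqrt(0.5)
quats = np.array([[1.0, 0.0, 0.0, 0.0], [half, half, 0.0, 0.0]])
camera_center = np.array([0.0, 0.0, -2.0])
flat_scales, normals = flatten_to_planar(means, scales, quats, camera_center)
```

In a sketch like this, the depth of each planar primitive could then be derived from the plane through its mean with the returned normal, which is one reason a flattened representation tends to give cleaner normal and depth estimates than an unconstrained ellipsoid.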
