Segment Any 4D Gaussians
Shengxiang Ji, Guanjun Wu, Jiemin Fang, Jiazhong Cen, Taoran Yi, Wenyu Liu, Qi Tian, Xinggang Wang
TL;DR
This work tackles open-world 4D segmentation of dynamic scenes. SA4D extends 4D Gaussian Splatting (4D-GS) with a temporal identity feature field that handles Gaussian drifting and a 4D segmentation refinement step that removes artifacts. By coupling a lightweight identity encoding network and a Gaussian identity table with a deformable 4D-GS representation, the approach achieves fast, high-quality 4D segmentation and supports interactive editing such as removal, recoloring, and composition. It reports strong improvements over 3D baselines on HyperNeRF and Neu3D, demonstrates practical editing capabilities, and discusses its limitations and possible future work in multi-view identity handling. Overall, SA4D provides a practical framework for open-world 4D scene understanding and manipulation.
Abstract
Modeling, understanding, and reconstructing the real world are crucial in XR/VR. Recently, 3D Gaussian Splatting (3D-GS) methods have shown remarkable success in modeling and understanding 3D scenes. Similarly, various 4D representations have demonstrated the ability to capture the dynamics of the 4D world. However, little research has focused on segmentation within 4D representations. In this paper, we propose Segment Any 4D Gaussians (SA4D), one of the first frameworks to segment anything in the 4D digital world based on 4D Gaussians. In SA4D, an efficient temporal identity feature field is introduced to handle Gaussian drifting, with the potential to learn precise identity features from noisy and sparse input. Additionally, a 4D segmentation refinement process is proposed to remove artifacts. SA4D achieves precise, high-quality segmentation within seconds in 4D Gaussians and can remove, recolor, and compose segmented objects and render high-quality masks of anything. More demos are available at: https://jsxzs.github.io/sa4d/.
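To make the editing operations concrete, the following is a minimal sketch of how a per-Gaussian identity table could drive removal and recoloring once segmentation is done. It is an illustration under stated assumptions, not the SA4D implementation: the `Gaussian` class, the list-based `identity_table`, and the functions `remove_object` and `recolor_object` are hypothetical names, and the sketch ignores time, deformation, and rendering entirely.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gaussian:
    """Hypothetical stand-in for one Gaussian: only a center and an RGB color."""
    position: tuple
    color: tuple


def remove_object(gaussians, identity_table, target_id):
    """Drop every Gaussian whose identity table entry equals target_id."""
    return [g for g, obj in zip(gaussians, identity_table) if obj != target_id]


def recolor_object(gaussians, identity_table, target_id, new_color):
    """Return a copy in which Gaussians labeled target_id get new_color."""
    return [
        replace(g, color=new_color) if obj == target_id else g
        for g, obj in zip(gaussians, identity_table)
    ]


# Toy scene: three Gaussians, the last two assumed to belong to object 1.
gaussians = [
    Gaussian((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    Gaussian((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    Gaussian((0.0, 1.0, 0.0), (0.0, 0.0, 1.0)),
]
identity_table = [0, 1, 1]  # assumed: one object ID per Gaussian

background_only = remove_object(gaussians, identity_table, target_id=1)
recolored = recolor_object(gaussians, identity_table, 1, (1.0, 1.0, 1.0))
```

Because the table is indexed per Gaussian rather than per pixel, an edit in this sketch is a single pass over the Gaussians and would apply to every timestamp and viewpoint at once, which is one plausible reason a table lookup suits interactive editing.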
