SANeRF-HQ: Segment Anything for NeRF in High Quality
Yichen Liu, Benran Hu, Chi-Keung Tang, Yu-Wing Tai
TL;DR
The paper tackles open-world 3D object segmentation within Neural Radiance Fields (NeRF) by using the Segment Anything Model (SAM) for promptable 2D masks and NeRF for cross-view information fusion. It introduces SANeRF-HQ, a three-component pipeline consisting of a feature container (cache or distillation), a mask decoder, and a mask aggregator that builds a 3D object field while enforcing high-quality boundaries and multi-view consistency. A key innovation is the Ray-Pair RGB loss, which aligns color-based similarities between rays with their segmentation predictions and uses error-guided local sampling to refine boundaries. Across multiple NeRF datasets, SANeRF-HQ shows higher segmentation quality and robustness than prior zero-shot and auto-segmentation approaches, with practical implications for interactive 3D scene understanding and potential extensions to dynamic 4D scenes.
Abstract
Recently, the Segment Anything Model (SAM) has showcased remarkable zero-shot segmentation capabilities, while NeRF (Neural Radiance Fields) has gained popularity as a method for various 3D problems beyond novel view synthesis. Though there have been initial attempts to combine these two methods for 3D segmentation, they struggle to segment objects accurately and consistently in complex scenarios. In this paper, we introduce Segment Anything for NeRF in High Quality (SANeRF-HQ) to achieve high-quality 3D segmentation of any target object in a given scene. SANeRF-HQ uses SAM for open-world object segmentation guided by user-supplied prompts, while leveraging NeRF to aggregate information from different viewpoints. To overcome these challenges, we employ the density field and RGB similarity to improve the accuracy of segmentation boundaries during aggregation. With an emphasis on segmentation accuracy, we evaluate our method on multiple NeRF datasets where high-quality ground truths are available or manually annotated. SANeRF-HQ shows a significant quality improvement over state-of-the-art methods in NeRF object segmentation, provides higher flexibility for object localization, and enables more consistent object segmentation across multiple views. Results and code are available at the project site: https://lyclyc52.github.io/SANeRF-HQ/.
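To make the RGB-similarity idea more concrete, the following is a minimal sketch of a ray-pair color consistency term together with error-guided sampling. It is an illustration only and not the paper's formulation: the function names `ray_pair_rgb_loss` and `sample_error_guided`, the Gaussian color weighting, the bandwidth `tau`, and the squared-difference penalty are all assumptions chosen for clarity. The intent is simply that rays with similar colors are encouraged to receive similar foreground probabilities, and that sampling concentrates where the current mask error is large.

```python
import math
import random


def ray_pair_rgb_loss(colors, probs, tau=0.1):
    """Hypothetical ray-pair term: penalize differing mask probabilities
    for ray pairs whose rendered colors are close.

    colors: list of (r, g, b) tuples in [0, 1], one per ray.
    probs:  list of foreground probabilities in [0, 1], one per ray.
    tau:    assumed bandwidth controlling how fast similarity decays.
    """
    weighted, total = 0.0, 0.0
    n = len(colors)
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(colors[i], colors[j])
            # Weight is near 1 for similar colors and near 0 otherwise.
            w = math.exp(-((dist / tau) ** 2))
            weighted += w * (probs[i] - probs[j]) ** 2
            total += w
    # Normalize by the total weight; the floor avoids division by zero.
    return weighted / max(total, 1e-8)


def sample_error_guided(error_map, num_samples, rng):
    """Draw (row, col) pixel locations with probability proportional to
    the per-pixel error; falls back to uniform if the map is all zero."""
    coords = [(r, c) for r, row in enumerate(error_map) for c in range(len(row))]
    weights = [error_map[r][c] for r, c in coords]
    if sum(weights) <= 0:
        weights = [1.0] * len(coords)
    return rng.choices(coords, weights=weights, k=num_samples)


if __name__ == "__main__":
    rng = random.Random(0)
    centers = sample_error_guided([[0.0, 0.2], [0.1, 0.9]], 4, rng)
    print(centers)
    print(ray_pair_rgb_loss([(0.5, 0.5, 0.5), (0.52, 0.5, 0.5)], [0.9, 0.2]))
```

In a training loop, the sampled locations would typically serve as centers of local patches from which ray pairs are drawn, so that the consistency term acts mostly near boundaries where the mask is still wrong.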
