RelationField: Relate Anything in Radiance Fields
Sebastian Koch, Johanna Wald, Mirco Colosi, Narunas Vaskevicius, Pedro Hermosilla, Federico Tombari, Timo Ropinski
TL;DR
RelationField introduces the first open-vocabulary relational reasoning framework for neural radiance fields by adding a dedicated relationship field and distilling inter-object knowledge from multi-modal LLMs. It supports open-vocabulary object and relationship queries, achieves state-of-the-art results on open-vocabulary 3D scene graph generation, and introduces the new task of relationship-guided 3D instance segmentation. The approach combines two-step querying, Set-of-Mark (SoM) prompting, and cross-modal supervision to render relationship features aligned with 3D geometry. The work shows that 3D-consistent relational reasoning in radiance fields improves over 2D-only inference and opens avenues for richer, text-driven scene understanding without explicit 3D meshes.
Abstract
Neural radiance fields are an emerging 3D scene representation and have recently been extended to learn features for scene understanding by distilling open-vocabulary features from vision-language models. However, current methods primarily focus on object-centric representations, supporting object segmentation or detection, while understanding semantic relationships between objects remains largely unexplored. To address this gap, we propose RelationField, the first method to extract inter-object relationships directly from neural radiance fields. RelationField represents relationships between objects as pairs of rays within a neural radiance field, effectively extending its formulation to include implicit relationship queries. To teach RelationField complex, open-vocabulary relationships, relationship knowledge is distilled from multi-modal LLMs. To evaluate RelationField, we solve open-vocabulary 3D scene graph generation and relationship-guided instance segmentation, achieving state-of-the-art performance in both tasks. See the project website at https://relationfield.github.io.
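
To make the idea of querying a relationship for a pair of rays more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that per-sample relationship features along the object ray (already conditioned on a subject ray) and their rendering weights are given, accumulates them into one feature, and picks the closest text embedding by cosine similarity. The function names, the 2-D toy features, and the two labels are hypothetical placeholders; a real system would use embeddings from a language model and weights from the radiance field.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def render_feature(weights, features):
    """Accumulate per-sample features along one ray as a weighted sum."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

def best_relationship(rel_feature, text_embeddings):
    """Return the label whose embedding is most similar to the rendered feature."""
    return max(text_embeddings, key=lambda label: cosine(rel_feature, text_embeddings[label]))

# Toy inputs (assumed values, for illustration only).
# Rendering weights of three samples along the object ray.
weights = [0.1, 0.7, 0.2]
# Relationship features at those samples, conditioned on the subject ray.
sample_features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Stand-ins for text embeddings of candidate relationship labels.
text_embeddings = {"standing on": [0.0, 1.0], "next to": [1.0, 0.0]}

rendered = render_feature(weights, sample_features)
label = best_relationship(rendered, text_embeddings)
print(label)
```

With these toy values the rendered feature is dominated by the second component, so the query selects "standing on". The sketch only illustrates the open-vocabulary matching step; how the relationship features are learned from multi-modal LLM supervision is outside its scope.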
