Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review
Thang-Anh-Quan Nguyen, Amine Bourki, Mátyás Macudzinski, Anthony Brunel, Mohammed Bennamoun
TL;DR
This survey presents the first comprehensive, taxonomy-driven review of semantically-aware Neural Radiance Fields (SRRFs), synthesizing insights from over 250 papers to show how semantic information enhances 3D reconstruction, segmentation, editing, and language-guided interactions. It details core NeRF fundamentals (radiance fields, volumetric rendering, depth, and positional encoding) and then maps a diverse set of SRRF approaches into six categories: 3D geometry enhancement, segmentation, editable NeRFs, object detection/pose estimation, holistic decomposition, and language-enabled SRRFs. The authors discuss datasets, evaluation metrics, and practical challenges such as generalization, data efficiency, and real-time performance, and they propose directions for future work including cross-dataset generalization, multi-modal integration, and collaborative tooling. By clarifying how semantic cues can be incorporated into neural radiance fields, the paper highlights SRRFs as a promising path toward robust, interactive, and semantically grounded 3D scene understanding with broad impact on AR/VR, robotics, and beyond.
Abstract
This review examines the role of semantically-aware Neural Radiance Fields (NeRFs) in visual scene understanding, drawing on an analysis of over 250 scholarly papers. It explores how NeRFs infer 3D representations of both stationary and dynamic objects in a scene. This capability is pivotal for synthesizing high-quality novel views, completing missing scene details (inpainting), performing comprehensive scene segmentation (panoptic segmentation), predicting 3D bounding boxes, editing 3D scenes, and extracting object-centric 3D models. A significant aspect of this study is the treatment of semantics as a viewpoint-invariant function that maps spatial coordinates to a spectrum of semantic labels, which makes it possible to recognize distinct objects within the scene. Overall, this survey highlights the progression and diverse applications of semantically-aware neural radiance fields in visual scene interpretation.
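To make the idea of a viewpoint-invariant semantic function concrete, the following is a minimal NumPy sketch and not the implementation of any surveyed method. It assumes a toy linear field in place of a trained network: density and semantic logits depend only on the 3D position, while color also depends on the viewing direction. The names `positional_encoding`, `toy_field`, `render_weights`, and the weight matrices `w_sigma`, `w_sem`, `w_rgb` are hypothetical and chosen for illustration only.

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """Encode coordinates as sin/cos features at increasing frequencies (NeRF-style)."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi       # (F,)
    scaled = x[..., None] * freqs                       # (..., 3, F)
    enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)               # (..., 3 * 2F)

def toy_field(points, view_dir, w_sigma, w_sem, w_rgb):
    """Toy linear stand-in for a semantic radiance field (assumption, not a real model)."""
    feats = positional_encoding(points)                 # (N, D)
    sigma = np.log1p(np.exp(feats @ w_sigma))           # softplus density, position only
    sem_logits = feats @ w_sem                          # (N, C), position only
    dirs = np.broadcast_to(view_dir, (len(points), 3))
    rgb_in = np.concatenate([feats, dirs], axis=-1)     # color also sees the view direction
    rgb = 1.0 / (1.0 + np.exp(-(rgb_in @ w_rgb)))       # (N, 3) in (0, 1)
    return sigma, sem_logits, rgb

def render_weights(sigma, deltas):
    """Volume-rendering weights w_i = T_i * (1 - exp(-sigma_i * delta_i))."""
    alpha = 1.0 - np.exp(-sigma * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    return trans * alpha
```

With weights `w = render_weights(sigma, deltas)` for the samples along one ray, the same weights composite both outputs: `w @ rgb` gives the pixel color and `w @ sem_logits` gives per-pixel semantic logits, whose argmax is the predicted label. Querying `toy_field` at the same points with two different `view_dir` values changes `rgb` but leaves `sem_logits` and `sigma` unchanged, which is the viewpoint invariance the abstract describes.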
