NEDS-SLAM: A Neural Explicit Dense Semantic SLAM Framework using 3D Gaussian Splatting
Yiming Ji, Yang Liu, Guanghu Xie, Boyu Ma, Zongwu Xie
TL;DR
NEDS-SLAM addresses robust dense semantic SLAM by embedding high-dimensional semantic features into a 3D Gaussian representation and using differentiable Gaussian splatting for real-time rendering. It introduces Spatially Consistent Feature Fusion (SCFF), which fuses semantic and appearance cues with spatial consistency; a lightweight encoder-decoder that compresses semantic features into compact Gaussian parameters; and Virtual Camera View Pruning (VCVP), which identifies and removes outlier Gaussians using novel virtual views. On the Replica and ScanNet datasets, the system shows improved camera tracking accuracy and semantically rich reconstructions compared with several baselines, and ablation studies support the contributions of SCFF and VCVP. This work advances practical neural explicit SLAM by balancing expressive semantic embedding, memory efficiency, and real-time performance for dense semantic scene understanding.
Abstract
We propose NEDS-SLAM, a dense semantic SLAM system based on a 3D Gaussian representation that enables robust 3D semantic mapping, accurate camera tracking, and high-quality rendering in real time. In the system, we propose a Spatially Consistent Feature Fusion model to reduce the effect of erroneous estimates from a pre-trained segmentation head on semantic reconstruction, achieving robust 3D semantic Gaussian mapping. Additionally, we employ a lightweight encoder-decoder to compress the high-dimensional semantic features into a compact 3D Gaussian representation, mitigating the burden of excessive memory consumption. Furthermore, we leverage the advantage of 3D Gaussian splatting, which enables efficient and differentiable novel view rendering, and propose a Virtual Camera View Pruning method to eliminate outlier Gaussians, thereby effectively enhancing the quality of scene representations. Our NEDS-SLAM method demonstrates competitive performance against existing dense semantic SLAM methods in terms of mapping and tracking accuracy on the Replica and ScanNet datasets, while also showing excellent capabilities in 3D dense semantic mapping.
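To make the memory argument behind the feature compression concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear PCA projection as a stand-in for the learned lightweight encoder-decoder, and all names and sizes (`fit_linear_codec`, `feat_dim = 64`, `code_dim = 8`, synthetic features) are hypothetical choices made for illustration only.

```python
import numpy as np

def fit_linear_codec(features, code_dim):
    """Fit a PCA-based linear codec (assumed stand-in for a learned encoder-decoder)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:code_dim]  # shape: (code_dim, feat_dim)
    return mean, basis

def encode(features, mean, basis):
    """Project high-dimensional features to compact per-Gaussian codes."""
    return (features - mean) @ basis.T

def decode(codes, mean, basis):
    """Map compact codes back to the original feature space."""
    return codes @ basis + mean

# Hypothetical sizes: 1000 Gaussians, 64-D semantic features, 8-D stored codes.
rng = np.random.default_rng(0)
num_gaussians, feat_dim, code_dim = 1000, 64, 8

# Synthetic features lying near a low-dimensional subspace, plus small noise.
latent = rng.normal(size=(num_gaussians, code_dim))
mixing = rng.normal(size=(code_dim, feat_dim))
features = latent @ mixing + 0.01 * rng.normal(size=(num_gaussians, feat_dim))

mean, basis = fit_linear_codec(features, code_dim)
codes = encode(features, mean, basis)   # what each Gaussian would store
recon = decode(codes, mean, basis)      # recovered when full features are needed

rel_err = np.linalg.norm(recon - features) / np.linalg.norm(features)
compression_ratio = feat_dim / code_dim  # per-Gaussian feature storage reduction
```

In this sketch each Gaussian stores only the `code_dim`-dimensional code, so per-Gaussian feature storage shrinks by `feat_dim / code_dim`, while the decoder is shared across the whole map. A learned nonlinear encoder-decoder, as described in the abstract, would replace the PCA projection here.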
