DF-SLAM: Dictionary Factors Representation for High-Fidelity Neural Implicit Dense Visual SLAM System
Weifeng Wei, Jie Wang, Shuqi Deng, Jie Liu
TL;DR
DF-SLAM tackles high-fidelity dense visual SLAM by representing the scene with dictionary factors: basis grids $B_g^l$ and $B_a^l$ (for geometry and appearance, respectively) paired with coefficient grids $C_g$ and $C_a$, whose per-point features feed decoders that output the signed distance value $s(x_i)$ and the color. It adds feature integration rendering, in which per-sample appearance features are aggregated along each ray as $f_a(r)=\sum_i w_i f_a(x_i)$ and decoded to color once per ray by a shallow MLP, which speeds up color rendering without sacrificing quality. Geometry and appearance are optimized jointly through a composite loss $\mathcal{L}=\lambda_c\mathcal{L}_c+\lambda_d\mathcal{L}_d+\lambda_{fs}\mathcal{L}_{fs}+\lambda_{sdf}\mathcal{L}_{sdf}$ combining color, depth, free-space, and SDF terms, with tracking and mapping performed over a frame window and distinct initialization and update rules. Extensive experiments on Replica, ScanNet, and TUM-RGBD show competitive real-time performance, detailed reconstructions, and robust camera localization, with ablations confirming the value of the dictionary-factor design and feature integration rendering. The code is released to facilitate public benchmarking and reuse.
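The feature integration step described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the two-layer decoder with ReLU and sigmoid, the layer sizes, and the randomly generated, normalized weights are all assumptions made for illustration. Only the aggregation rule $f_a(r)=\sum_i w_i f_a(x_i)$ followed by a single shallow-MLP evaluation per ray comes from the summary.

```python
import math
import random

def integrate_ray_features(weights, point_features):
    """Return f_a(r) = sum_i w_i * f_a(x_i) for a single ray."""
    feat_dim = len(point_features[0])
    ray_feature = [0.0] * feat_dim
    for w, feat in zip(weights, point_features):
        for k in range(feat_dim):
            ray_feature[k] += w * feat[k]
    return ray_feature

def linear(x, weight, bias):
    """Dense layer; `weight` holds one row per output unit."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def shallow_color_mlp(x, w1, b1, w2, b2):
    """Hypothetical decoder: ReLU hidden layer, sigmoid RGB output."""
    hidden = [max(h, 0.0) for h in linear(x, w1, b1)]
    return [1.0 / (1.0 + math.exp(-o)) for o in linear(hidden, w2, b2)]

random.seed(0)
num_samples, feat_dim, hidden_dim = 16, 8, 32  # illustrative sizes

# Assumption: stand-in rendering weights, normalized to sum to 1.
raw = [random.random() for _ in range(num_samples)]
total = sum(raw)
weights = [r / total for r in raw]

# Stand-in appearance features f_a(x_i) for the samples on one ray.
point_features = [[random.gauss(0, 1) for _ in range(feat_dim)]
                  for _ in range(num_samples)]

# Randomly initialized decoder parameters (untrained, for shape only).
w1 = [[random.gauss(0, 0.1) for _ in range(feat_dim)] for _ in range(hidden_dim)]
b1 = [0.0] * hidden_dim
w2 = [[random.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(3)]
b2 = [0.0] * 3

ray_feature = integrate_ray_features(weights, point_features)
ray_color = shallow_color_mlp(ray_feature, w1, b1, w2, b2)  # one MLP call per ray
```

In this sketch the decoder runs once per ray rather than once per sample, which is where the rendering speed-up would come from: the per-sample work is reduced to a weighted sum of features.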
Abstract
We introduce a high-fidelity neural implicit dense visual Simultaneous Localization and Mapping (SLAM) system, termed DF-SLAM. We employ dictionary factors for scene representation, encoding the geometry and appearance information of the scene as a combination of basis and coefficient factors. Compared to neural implicit dense visual SLAM methods that directly encode scene information as features, our method reconstructs scene details more faithfully and uses memory more efficiently, and its model size is insensitive to the size of the scene map, making it more suitable for large-scale scenes. Additionally, we employ feature integration rendering to accelerate color rendering while preserving rendering quality, further enhancing the real-time performance of our neural SLAM method. Extensive experiments on synthetic and real-world datasets demonstrate that our method is competitive with existing state-of-the-art neural implicit SLAM methods in terms of real-time performance, localization accuracy, and scene reconstruction quality. Our source code is available at https://github.com/funcdecl/DF-SLAM.
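To make the basis-and-coefficient idea concrete, the following toy sketch builds a per-point feature from 1D grids. The abstract only states that features are "a combination of basis and coefficient factors"; the specific choices here (linear interpolation, concatenating basis features across levels, and an element-wise product with the coefficient feature) are assumptions for illustration, as are the names `lerp_grid` and `dictionary_feature`.

```python
def lerp_grid(grid, u):
    """Linearly interpolate a 1D grid of feature vectors at u in [0, 1].

    `grid` is a list of at least two nodes, each a list of channels.
    """
    pos = u * (len(grid) - 1)
    i0 = min(int(pos), len(grid) - 2)  # clamp so that i0 + 1 stays in range
    t = pos - i0
    return [(1.0 - t) * a + t * b for a, b in zip(grid[i0], grid[i0 + 1])]

def dictionary_feature(u, basis_grids, coeff_grid):
    """Assumed combination: concatenate basis features over levels,
    then multiply element-wise by the interpolated coefficient feature."""
    basis = []
    for level_grid in basis_grids:
        basis.extend(lerp_grid(level_grid, u))
    coeff = lerp_grid(coeff_grid, u)
    if len(coeff) != len(basis):
        raise ValueError("coefficient channels must match total basis channels")
    return [c * b for c, b in zip(coeff, basis)]

# Two basis levels with one channel each, at different resolutions.
basis_grids = [
    [[1.0], [3.0]],         # coarse level: 2 nodes
    [[0.0], [2.0], [4.0]],  # finer level: 3 nodes
]
# One coefficient grid with two channels (one per basis channel).
coeff_grid = [[1.0, 1.0], [2.0, 0.5]]

feature = dictionary_feature(0.5, basis_grids, coeff_grid)
print(feature)  # [3.0, 1.5]
```

At `u = 0.5` both basis levels interpolate to `2.0` and the coefficient to `[1.5, 0.75]`, so the product is `[3.0, 1.5]`. In a full system this feature would be passed to a decoder, and the grids would be 3D and learned.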
