LISNeRF Mapping: LiDAR-based Implicit Mapping via Semantic Neural Fields for Large-Scale 3D Scenes

Jianyuan Zhang; Zhiliu Yang; Meng Zhang

TL;DR

This paper proposes a method for large-scale 3D semantic reconstruction with implicit representations, using posed LiDAR measurements alone. An octree-based hierarchical structure stores implicit features, and shallow Multilayer Perceptrons (MLPs) then decode these features into semantic information and signed distance values.

Abstract

Large-scale semantic mapping is crucial for outdoor autonomous agents to fulfill high-level tasks such as planning and navigation. This paper proposes a novel method for large-scale 3D semantic reconstruction through implicit representations from posed LiDAR measurements alone. We first leverage an octree-based hierarchical structure to store implicit features, which are then decoded into semantic information and signed distance values through shallow Multilayer Perceptrons (MLPs). We adopt off-the-shelf algorithms to predict the semantic labels and instance IDs of point clouds. We then jointly optimize the feature embeddings and MLP parameters with a self-supervision paradigm for point cloud geometry and a pseudo-supervision paradigm for semantic and panoptic labels. Subsequently, in the inference stage, categories and geometric structures for novel points are regressed, and the marching cubes algorithm is used to subdivide and visualize the scenes. For scenarios with memory constraints, a map stitching strategy is also developed to merge sub-maps into a complete map. Experiments on two real-world datasets, SemanticKITTI and SemanticPOSS, demonstrate the superior segmentation efficiency and mapping effectiveness of our framework compared to current state-of-the-art 3D LiDAR mapping methods.
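As a rough illustration of the decoding step described in the abstract, the sketch below passes a single feature vector through two shallow MLPs, one producing a signed distance value and one producing a semantic label. It is not the authors' implementation: the names (`make_mlp`, `mlp_forward`, `decode_point`), the feature dimension, hidden width, class count, and random initialization are all hypothetical, and the feature vector is drawn at random here, whereas in the paper it would come from the octree-based feature grid.

```python
import random


def make_mlp(sizes, rng):
    """Build a shallow MLP as a list of (weights, biases) layers (random init)."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def mlp_forward(layers, x):
    """Run a forward pass; ReLU on hidden layers, linear output layer."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def decode_point(feature, geo_mlp, sem_mlp):
    """Decode one feature vector into (signed distance, semantic class index)."""
    sdf = mlp_forward(geo_mlp, feature)[0]
    logits = mlp_forward(sem_mlp, feature)
    label = max(range(len(logits)), key=logits.__getitem__)  # argmax over classes
    return sdf, label


# Hypothetical sizes, chosen only for illustration.
FEATURE_DIM, HIDDEN, NUM_CLASSES = 8, 32, 20

rng = random.Random(0)
geo_mlp = make_mlp([FEATURE_DIM, HIDDEN, 1], rng)            # geometry decoder
sem_mlp = make_mlp([FEATURE_DIM, HIDDEN, NUM_CLASSES], rng)  # semantic decoder

# Stand-in for a feature queried from the octree at a 3D point (assumption).
feature = [rng.uniform(-1.0, 1.0) for _ in range(FEATURE_DIM)]
sdf, label = decode_point(feature, geo_mlp, sem_mlp)
print(f"sdf={sdf:.4f}, label={label}")
```

In a trained system the weights and feature embeddings would be optimized jointly, as the abstract describes; here they are untrained, so the printed values carry no meaning beyond showing the data flow.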

Paper Structure (16 sections, 8 equations, 7 figures, 3 tables)

This paper contains 16 sections, 8 equations, 7 figures, 3 tables.

INTRODUCTION
RELATED WORK
METHODOLOGY
Implicit Semantic Mapping
Octree-based Grids
Geometry Features Construction
Semantic Features Construction
Training and Loss Function
Map Merge Strategy
EXPERIMENTS
Experimental Setup
Map Quality Evaluation
Results and Analysis of Map Merge
Evaluation for Dynamic Scenario
Comparison between Two Label Strategies
...and 1 more sections

Figures (7)

Figure 1: Our LISNeRF framework implicitly represents a semantic scene with LiDAR-only input. Different mapping results on sequence 00 of SemanticKITTI are shown above. (a) is the raw input LiDAR point cloud. (b) is the neural scene representation produced by the implicit method SHINE_Mapping [zhong2023shine]. (c) and (d) are the semantic and panoptic reconstruction results of our implicit mapping.
Figure 2: Overview of our LISNeRF framework. In the training procedure, we first construct the Geometry Neural Fields (GNF), Semantic Neural Fields (SNF) and Panoptic Neural Fields (PNF) from the learnable octree feature embeddings and the MLPs. Note that we filter out sampled points falling in free space and use only surface points to train SNF and PNF. In the inference procedure, we then calculate the current map size $M$ and predict the semantic labels of cube vertices. Given any cube corner point in the map, we query its SDF value through GNF. After obtaining the vertices with the Marching Cubes algorithm, we use SNF and PNF to query their semantic labels. Specifically, semantic mapping obtains the semantic label from the entire feature embedding through SNF. Panoptic mapping uses a certain proportion of the feature embedding to regress the 'thing' and 'stuff' classes through PNF, and uses the remaining part of the feature embedding to obtain the instance IDs of the 'thing' classes through PNF.
Figure 4: Visualization of our 3D semantic reconstruction results on SemanticKITTI compared with other related methods ("GT" = Ground Truth). The mapping results in the first row are from SemanticKITTI sequence 00, and those in the second row are from SemanticKITTI sequence 05. Our map is denser than that of SuMa++, and more precise and complete in terms of semantics than that of LODE. Best viewed zoomed in.
Figure 5: Visualization of our semantic mapping results (first and third rows) and panoptic mapping results (second and fourth rows) on different sequences of different datasets. "SK-00" means sequence 00 of the SemanticKITTI dataset, "SP-00" means sequence 00 of the SemanticPOSS dataset, and so on.
Figure 6: The visualization of sub-maps and the whole map
...and 2 more figures
