PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation
Jinfeng Xu, Siyuan Yang, Xianzhi Li, Yuan Tang, Yixue Hao, Long Hu, Min Chen
TL;DR
The paper tackles open-world semantic segmentation (OWSS) for 3D point clouds by introducing the Probability-Driven Framework (PDF), which both identifies unknown objects and supports continual learning. PDF pairs a semantic decoder (output O_S) with a lightweight U-decoder that estimates per-point uncertainties (output O_U), uses a pseudo-labeling scheme to create pseudo ground truth for unknown classes, and applies incremental knowledge distillation to integrate novel classes without retraining from scratch. Key contributions include the open-set semantic segmentation (OSS) module with uncertainty-aware supervision, the HUA and 3D graph boundary detection pipeline for refining unknown regions, and the incremental learning (IL) strategy that distills knowledge from a teacher model into a new open-world model. Experiments on S3DIS and ScanNetv2 show that PDF substantially improves unknown-object identification and achieves strong incremental learning performance, outperforming prior OWSS methods. The work advances practical open-world perception for 3D scenes, with implications for robotics and autonomous systems.
Abstract
Existing point cloud semantic segmentation networks cannot identify unknown classes or update their knowledge because they take a closed-set, static view of the real world, which can lead an intelligent agent to make bad decisions. To address this problem, we propose a Probability-Driven Framework (PDF) for open world semantic segmentation that includes (i) a lightweight U-decoder branch that identifies unknown classes by estimating uncertainties, (ii) a flexible pseudo-labeling scheme that generates pseudo labels to supply geometric features along with probability distribution features of unknown classes, and (iii) an incremental knowledge distillation strategy that gradually incorporates novel classes into the existing knowledge base. Our framework enables the model to behave like a human, recognizing unknown objects and incrementally learning them with the corresponding knowledge. Experimental results on the S3DIS and ScanNetv2 datasets demonstrate that the proposed PDF outperforms other methods by a large margin on both key tasks of open world semantic segmentation: identifying unknown classes and incrementally learning them.
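The two ideas named in the abstract, flagging unknown points from an uncertainty estimate and distilling old-class knowledge from a teacher into a new model, can be illustrated with a minimal sketch. This is not the paper's implementation. The function names, the `UNKNOWN` label id, the fixed `threshold`, and the use of a plain KL divergence over old-class logits are all assumptions made here for illustration. The per-point uncertainty scores are assumed to come from something like the U-decoder branch.

```python
import math

UNKNOWN = -1  # hypothetical label id for points flagged as unknown


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(semantic_logits, uncertainties, threshold=0.5):
    """Give each point its argmax known class, or UNKNOWN when its
    uncertainty score exceeds `threshold` (an assumed hyperparameter)."""
    labels = []
    for logits, u in zip(semantic_logits, uncertainties):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(UNKNOWN if u > threshold else best)
    return labels


def distillation_loss(teacher_logits, student_logits, num_old_classes):
    """Mean KL(teacher || student) restricted to the old classes.

    A generic knowledge-distillation term (assumed form): the student may
    have extra outputs for novel classes, which are ignored here.
    """
    total = 0.0
    for t, s in zip(teacher_logits, student_logits):
        p = softmax(t[:num_old_classes])
        q = softmax(s[:num_old_classes])
        total += sum(
            pi * math.log(pi / max(qi, 1e-12))
            for pi, qi in zip(p, q)
            if pi > 0.0
        )
    return total / len(teacher_logits)


# Three points, three known classes; the last point has high uncertainty.
logits = [[2.0, 0.1, 0.1], [0.1, 3.0, 0.2], [1.0, 1.1, 0.9]]
labels = open_set_predict(logits, uncertainties=[0.1, 0.2, 0.9])
print(labels)  # [0, 1, -1]
```

In this sketch the distillation term is zero when the student reproduces the teacher's old-class distribution and grows as the two diverge, which is the property an incremental learner relies on to avoid forgetting previously learned classes.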
