Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation
Peng Zhang, Ting Wu, Jinsheng Sun, Weiqing Li, Zhiyong Su
TL;DR
This work addresses interactive semantic segmentation of entire point cloud scenes by introducing InterPCSeg, a framework that runs on the fly with off-the-shelf segmentation networks. It treats user corrections as sparse test-time supervision and adds a stabilization energy to keep the refinement from diverging, while a novel interaction simulator enables objective evaluation. The method pairs a batch-normalization (BN) warm-up with a test-time loss that combines a correction energy and a stabilization energy, then re-infers to produce improved labels from only a few clicks. Empirical results on S3DIS and ScanNet show substantial mIoU gains under modest interaction budgets, indicating practical value for rapid scene annotation without offline re-training.
Abstract
Existing interactive point cloud segmentation approaches primarily focus on object segmentation, which aims to determine which points belong to the object of interest under the guidance of user interactions. This paper concentrates on an unexplored yet meaningful task, i.e., interactive point cloud semantic segmentation, which assigns high-quality semantic labels to all points in a scene with the help of corrective user clicks. Concretely, we present the first interactive framework for point cloud semantic segmentation, named InterPCSeg, which integrates seamlessly with off-the-shelf semantic segmentation networks without offline re-training, enabling it to run on the fly. To achieve online refinement, we treat user interactions as sparse training examples at test time. To address the instability caused by this sparse supervision, we design a stabilization energy to regulate the test-time training process. For objective and reproducible evaluation, we develop an interaction simulation scheme tailored to the interactive point cloud semantic segmentation task. We evaluate our framework on the S3DIS and ScanNet datasets with off-the-shelf segmentation networks, incorporating interactions from both the proposed interaction simulator and real users. Quantitative and qualitative experimental results demonstrate the efficacy of our framework in refining semantic segmentation results with user interactions. The source code will be publicly available.
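To make the test-time objective described above concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes the correction energy is a cross-entropy over the clicked points and the stabilization energy is a KL divergence between the initial and current predictions on the unclicked points; the abstract does not specify either form. The function name `refinement_loss`, the weight `lam`, and all argument names are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def refinement_loss(logits, init_probs, click_idx, click_labels, lam=1.0):
    """Hypothetical test-time loss: correction + lam * stabilization.

    logits:       (N, C) current per-point class scores
    init_probs:   (N, C) class probabilities from the initial inference
    click_idx:    (K,)   indices of the points the user corrected
    click_labels: (K,)   class labels the user assigned to those points
    """
    eps = 1e-12
    probs = softmax(logits)

    # Correction term (assumed form): cross-entropy on the clicked points.
    correction = -np.log(probs[click_idx, click_labels] + eps).mean()

    # Stabilization term (assumed form): KL(initial || current) on the
    # unclicked points, penalizing drift away from the initial prediction.
    mask = np.ones(len(logits), dtype=bool)
    mask[click_idx] = False
    kl = (init_probs[mask]
          * (np.log(init_probs[mask] + eps) - np.log(probs[mask] + eps))).sum(axis=1)
    stabilization = kl.mean() if mask.any() else 0.0

    return correction + lam * stabilization, correction, stabilization

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(6, 3))      # 6 points, 3 classes (toy data)
    init_probs = softmax(logits)          # prediction before any refinement
    clicks = np.array([0, 2])             # user corrected points 0 and 2
    labels = np.array([1, 0])             # labels supplied by the user
    total, corr, stab = refinement_loss(logits, init_probs, clicks, labels)
    print(total, corr, stab)
```

In an actual refinement loop, a loss of this kind would be minimized with respect to the network parameters for a few gradient steps, after which the scene is re-inferred. Under the assumptions of this sketch, the stabilization term is zero before any update and grows as the unclicked predictions move away from their initial values.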
