Sketchy Bounding-box Supervision for 3D Instance Segmentation
Qian Deng, Le Hui, Jin Xie, Jian Yang
TL;DR
This work tackles weakly supervised 3D instance segmentation with imperfect, sketchy bounding boxes. It introduces Sketchy-3DIS, a framework that jointly learns an adaptive box-to-point pseudo labeler and a coarse-to-fine instance segmentator, converting noisy box annotations into high-quality pseudo labels and refined instance masks. The method uses bilateral Hungarian matching to align pseudo-ground-truth with predicted instances and employs multi-level attention to progressively refine the segmentation. Experiments on ScanNetV2 and S3DIS show state-of-the-art performance under sketchy bounding-box supervision, with results that even surpass some fully supervised baselines, highlighting the practical viability of annotation-efficient 3D scene understanding.
Abstract
Bounding-box supervision has gained considerable attention in weakly supervised 3D instance segmentation. While this approach alleviates the need for extensive point-level annotations, obtaining accurate bounding boxes in practical applications remains challenging. To this end, we explore inaccurate bounding boxes, which we call sketchy bounding boxes and simulate by perturbing ground-truth bounding boxes with scaling, translation, and rotation. In this paper, we propose Sketchy-3DIS, a novel weakly supervised 3D instance segmentation framework that jointly learns a pseudo labeler and a segmentator to improve performance under sketchy bounding-box supervision. Specifically, we first propose an adaptive box-to-point pseudo labeler that learns to assign points located in the overlapping region of two sketchy bounding boxes to the correct instance, resulting in compact and pure pseudo instance labels. Then, we present a coarse-to-fine instance segmentator that first predicts coarse instances from the entire point cloud and then learns fine instances within the regions of the coarse instances. Finally, by using the pseudo instance labels to supervise the instance segmentator, we gradually generate high-quality instances through joint training. Extensive experiments show that our method achieves state-of-the-art performance on both the ScanNetV2 and S3DIS benchmarks, and even outperforms several fully supervised methods while using only sketchy bounding boxes. Code is available at https://github.com/dengq7/Sketchy-3DIS.
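
As an illustration of how a sketchy bounding box might be simulated from a ground-truth box by scaling, translation, and rotation, the following is a minimal NumPy sketch. It is not taken from the released code: the function names `make_sketchy_box` and `points_in_box`, the (center, size, yaw) box parameterization, the restriction to rotation about the vertical axis, and all perturbation magnitudes are assumptions made for this example.

```python
import numpy as np


def make_sketchy_box(center, size, rng, max_scale=0.1, max_shift=0.1,
                     max_yaw=np.pi / 36):
    """Perturb a ground-truth box (center, size) into a sketchy box.

    The three magnitudes are illustrative assumptions, not the paper's settings.
    """
    center = np.asarray(center, dtype=float)
    size = np.asarray(size, dtype=float)
    # Scaling: stretch or shrink each side independently.
    new_size = size * (1.0 + rng.uniform(-max_scale, max_scale, size=3))
    # Translation: shift the center by a fraction of the original side lengths.
    new_center = center + size * rng.uniform(-max_shift, max_shift, size=3)
    # Rotation: a small yaw about the vertical (z) axis.
    yaw = rng.uniform(-max_yaw, max_yaw)
    return new_center, new_size, yaw


def points_in_box(points, center, size, yaw):
    """Return a boolean mask of the points inside a box rotated by yaw about z."""
    c, s = np.cos(yaw), np.sin(yaw)
    # Inverse rotation: maps world coordinates into the box's own frame.
    rot_inv = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])
    local = (np.asarray(points, dtype=float) - center) @ rot_inv.T
    return np.all(np.abs(local) <= np.asarray(size, dtype=float) / 2.0, axis=1)
```

Under these assumptions, applying `points_in_box` to two neighboring sketchy boxes and taking the logical AND of the two masks would give the points that lie in the overlapping region, which are the ambiguous points the pseudo labeler has to assign to one instance.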
