Part2Object: Hierarchical Unsupervised 3D Instance Segmentation

Cheng Shi, Yulin Zhang, Bin Yang, Jiajin Tang, Yuexin Ma, Sibei Yang

TL;DR

This work tackles unsupervised 3D instance segmentation in indoor scenes by introducing Part2Object, a hierarchical clustering method that discovers object parts and objects across multiple granularities while leveraging 3D objectness priors from temporally adjacent 2D frames. It couples Part2Object with Hi-Mask3D, an enhanced 3D instance segmentation model that learns from pseudo-labels of objects and parts to produce accurate 3D masks in a self-training loop. The approach yields state-of-the-art results in training-free, data-efficient, and cross-dataset settings, demonstrating strong generalization across ScanNet, S3DIS, and Replica. Overall, Part2Object and Hi-Mask3D offer a scalable framework for hierarchical 3D object understanding without manual annotations, with practical impact on autonomous navigation, robotics, and immersive reality applications.

Abstract

Unsupervised 3D instance segmentation aims to segment objects from a 3D point cloud without any annotations. Existing methods face the challenge of either too loose or too tight clustering, leading to under-segmentation or over-segmentation. To address this issue, we propose Part2Object, a hierarchical clustering method with object guidance. Part2Object employs multi-layer clustering from points to object parts and objects, allowing objects to manifest at any layer. Additionally, it extracts and utilizes 3D objectness priors from temporally consecutive 2D RGB frames to guide the clustering process. Moreover, we propose Hi-Mask3D to support hierarchical 3D object part and instance segmentation. By training Hi-Mask3D on the objects and object parts extracted from Part2Object, we achieve consistent and superior performance compared to state-of-the-art models in various settings, including unsupervised instance segmentation, data-efficient fine-tuning, and cross-dataset generalization. Code is released at https://github.com/ChengShiest/Part2Object
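
To make the idea of multi-layer clustering with object guidance more concrete, the following is a minimal toy sketch. It is not the authors' algorithm. It assumes that clusters are merged when their centroids lie within a per-layer distance threshold, and that the objectness prior is available as a per-point integer id (with -1 meaning "no prior"). The function name `cluster_layers`, the `prior_ids` encoding, and the merge criterion are all hypothetical choices made for illustration.

```python
import numpy as np

def _find(parent, a):
    """Union-find root lookup with path halving."""
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a

def cluster_layers(points, prior_ids, thresholds):
    """Return one label array per layer, coarsening as thresholds grow.

    points:     (N, 3) float array of 3D coordinates.
    prior_ids:  (N,) int array; hypothetical objectness prior, -1 = unknown.
    thresholds: increasing centroid-distance thresholds, one per layer.
    """
    labels = np.arange(len(points))  # start: every point is its own cluster
    layers = []
    for t in thresholds:
        ids = np.unique(labels)
        centroids = np.array([points[labels == i].mean(axis=0) for i in ids])
        # Prior id carried by each cluster (-1 if no point in it has one).
        root_prior = []
        for i in ids:
            p = prior_ids[labels == i]
            p = p[p >= 0]
            root_prior.append(int(p[0]) if p.size else -1)
        parent = list(range(len(ids)))
        for a in range(len(ids)):
            for b in range(a + 1, len(ids)):
                ra, rb = _find(parent, a), _find(parent, b)
                if ra == rb:
                    continue
                close = np.linalg.norm(centroids[a] - centroids[b]) <= t
                # Guidance (assumed): never merge across two different priors.
                conflict = (root_prior[ra] >= 0 and root_prior[rb] >= 0
                            and root_prior[ra] != root_prior[rb])
                if close and not conflict:
                    parent[rb] = ra
                    root_prior[ra] = max(root_prior[ra], root_prior[rb])
        # Relabel every point with the root of the cluster it belonged to.
        idx = np.searchsorted(ids, labels)
        labels = np.array([_find(parent, int(k)) for k in idx])
        layers.append(labels.copy())
    return layers
```

In this sketch, a loose final threshold would fuse everything into one cluster if no prior were given, while the prior ids keep clusters belonging to different presumed objects apart at every layer.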

Paper Structure (11 sections, 6 equations, 3 figures, 8 tables)

Figures (3)

  • Figure 1: Motivation of our hierarchical clustering. Single-level clustering results in a trade-off between under-segmentation for certain objects and over-segmentation for others. In contrast, our hierarchical clustering allows for gathering and identifying objects at varying levels of clustering granularity.
  • Figure 2: Overview of our Part2Object hierarchical clustering and Hi-Mask3D instance segmentation framework. Part2Object extracts 3D objectness priors from consecutive 2D RGB frames and uses them to guide hierarchical clustering from points to object parts and objects. Hi-Mask3D uses the objects and parts identified by Part2Object as pseudo-labels and learns to improve instance segmentation by exploiting object parts.
  • Figure 3: Robustness of our cluster features and 3D objectness priors. (a) Visualization of the first 3 PCA components and our computed weights of points in Eq. \ref{equ:update}. (b) Visualization of 2D DINO features' PCA components, 3D points' normal vectors, and our 3D object priors. The comparison between (c) the projection-first-then-grouping pipeline and (d) our grouping-first-then-projection pipeline.
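
The kind of PCA visualization mentioned in the Figure 3 caption can be sketched as follows. This is a generic illustration, not the authors' plotting code: it projects per-point feature vectors onto their first three principal components and rescales each component to [0, 1] so it can be used as an RGB color. The function name `pca_rgb` is hypothetical, and the input is assumed to be an (N, D) array with N ≥ 3 and D ≥ 3. Feature extraction (e.g. from DINO) is outside the scope of this sketch.

```python
import numpy as np

def pca_rgb(features):
    """Map (N, D) features to (N, 3) colors via the first 3 principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[:3].T
    # Rescale each component independently to [0, 1]; guard against zero range.
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    return (proj - lo) / np.maximum(hi - lo, 1e-12)
```

Points with similar features receive similar colors, which is what makes such a plot useful for judging by eye whether features separate objects.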