On Support Relations Inference and Scene Hierarchy Graph Construction from Point Cloud in Clustered Environments
Gang Ma, Hui Wei
TL;DR
The paper addresses 3D scene understanding in clustered environments by extracting plane primitives from RGBD point clouds, constructing an adjacency graph, and inferring support relations through a bottom-up pipeline that culminates in a scene hierarchy graph. It introduces a combinatorial optimization formulation for primitive classification and a two-stage Local/Global support inference, and reports strong results on primitive classification and support relations inference on the OSD and OCID datasets. The key contributions are (i) a spatial-configuration detector for plane pairs, (ii) a robust primitive classification framework solved via a unary quadratic integer program, and (iii) a scalable, two-level scene hierarchy graph (primitive level and object level) with an invisible root that keeps the whole graph traversable. The approach complements RGB-based methods by exploiting 3D geometry and topology, with the aim of more reliable grasping and task planning in robotic systems.
Abstract
Over the years, scene understanding has attracted growing interest in computer vision, as it provides the semantic and physical scene information that robots need to complete tasks autonomously. RGB-based approaches to scene understanding often ignore the rich spatial geometric and topological information available in 3D scenes. In this study, we develop a bottom-up approach for scene understanding that infers support relations between objects from a point cloud. Our approach uses the spatial topology information of the plane pairs in the scene and consists of three major steps: 1) detection of pairwise spatial configuration, which divides primitive pairs into local support connections and local inner connections; 2) primitive classification, in which a combinatorial optimization method is applied to classify primitives; and 3) support relations inference and hierarchy graph construction, in which support relations are inferred bottom-up and a scene hierarchy graph containing a primitive level and an object level is built. Through experiments, we demonstrate that the algorithm achieves excellent performance in primitive classification and support relations inference. Additionally, we show that the scene hierarchy graph contains rich geometric and topological information about objects and scales well for scene understanding.
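To illustrate the role of the invisible root in the scene hierarchy graph, the following is a minimal sketch and not the authors' implementation. It covers only the object level and assumes the support relations have already been inferred and are given as (supporter, supported) pairs. The names `ROOT`, `build_support_graph`, and `traverse` are hypothetical. Every object that nothing else supports is attached to the invisible root, so a single traversal from the root reaches all objects, even when the scene contains several disconnected support chains.

```python
from collections import defaultdict, deque

ROOT = "__root__"  # hypothetical label for the invisible root node


def build_support_graph(objects, supports):
    """Build an object-level support graph.

    objects:  iterable of object ids
    supports: list of (supporter, supported) pairs, assumed already inferred
    Returns a mapping from each node to the objects it directly supports.
    """
    children = defaultdict(list)
    supported = set()
    for supporter, supportee in supports:
        children[supporter].append(supportee)
        supported.add(supportee)
    # Objects with no supporter hang off the invisible root,
    # which makes the whole graph reachable from one node.
    for obj in objects:
        if obj not in supported:
            children[ROOT].append(obj)
    return children


def traverse(children):
    """Breadth-first traversal from the invisible root (root itself omitted)."""
    order, seen, queue = [], {ROOT}, deque([ROOT])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order
```

For example, with objects `table`, `box`, `cup`, and `shelf` and the relations "table supports box" and "box supports cup", the isolated `shelf` is still visited because it is attached to the root. The breadth-first order also suggests why such a graph is useful for grasp planning: objects deeper in the hierarchy rest on others and can typically be removed first.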
