On Support Relations Inference and Scene Hierarchy Graph Construction from Point Cloud in Clustered Environments

Gang Ma, Hui Wei

TL;DR

The paper addresses 3D scene understanding in clustered environments by extracting plane primitives from RGB-D point clouds, constructing an adjacency graph, and inferring support relations through a bottom-up pipeline that culminates in a hierarchical scene graph. It introduces a combinatorial optimization formulation for primitive classification and a two-stage local/global support inference, evaluated on the OSD and OCID datasets with strong performance at both the primitive level and the graph level. The key contributions are (i) a spatial-configuration detector for plane pairs, (ii) a robust primitive classification framework solved via a unary quadratic integer program, and (iii) a scalable, two-level scene hierarchy graph with an invisible root to ensure traversability. The approach complements RGB-based methods by exploiting 3D geometry and topology to support more reliable grasping and task planning in robotic systems.
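
The role of the invisible root can be illustrated with a small sketch. This is not the authors' implementation: it covers only an object-level graph, and the name `ROOT`, the functions `build_hierarchy` and `traverse`, and the rule that every object without a supporter hangs off the root are all assumptions made for illustration. It shows why a single root makes a scene with several independent stacks traversable from one starting node.

```python
from collections import defaultdict, deque

ROOT = "__root__"  # hypothetical name for the invisible root node


def build_hierarchy(objects, supports):
    """Build child lists from (supporter, supported) pairs.

    Assumption: any object that nothing supports is attached to ROOT,
    so every node is reachable from a single starting point.
    """
    children = defaultdict(list)
    supported = set()
    for supporter, obj in supports:
        children[supporter].append(obj)
        supported.add(obj)
    for obj in objects:
        if obj not in supported:
            children[ROOT].append(obj)
    return children


def traverse(children):
    """Breadth-first traversal from ROOT; ROOT itself is not reported."""
    order, seen, queue = [], {ROOT}, deque([ROOT])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# box_2 rests on box_1; mug stands on its own (hypothetical scene).
graph = build_hierarchy(["box_1", "box_2", "mug"], [("box_1", "box_2")])
print(traverse(graph))  # ['box_1', 'mug', 'box_2']
```

Without the root, `box_1` and `mug` would be roots of two disconnected components, and a traversal would need to be told about each one separately.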

Abstract

Over the years, scene understanding has attracted growing interest in computer vision, providing the semantic and physical scene information that robots need to complete particular tasks autonomously. In 3D scenes, rich spatial geometric and topological information is often ignored by RGB-based approaches to scene understanding. In this study, we develop a bottom-up approach for scene understanding that infers support relations between objects from a point cloud. Our approach uses the spatial topology information of the plane pairs in the scene and consists of three major steps: 1) detection of pairwise spatial configuration, which divides primitive pairs into local support connections and local inner connections; 2) primitive classification, in which a combinatorial optimization method is applied to classify primitives; and 3) support relations inference and hierarchy graph construction, which infers support relations bottom-up and builds a scene hierarchy graph containing a primitive level and an object level. Through experiments, we demonstrate that the algorithm achieves excellent performance in primitive classification and support relations inference. Additionally, we show that the scene hierarchy graph contains rich geometric and topological information about objects and is highly scalable for scene understanding.

Paper Structure (13 sections, 13 equations, 12 figures, 5 tables, 1 algorithm)

Figures (12)

  • Figure 1: An overview of our approach.
  • Figure 2: Three examples of convex polyhedron shapes.
  • Figure 3: Local support connection and local inner connection. $box_2$ rests on the plane $P_0$ of $box_1$. The spatial configuration of $P_0$ and $P_1$ is treated as a local support relation, while that of $P_1$ and $P_2$ (the planes of $box_2$) is treated as a local inner connection.
  • Figure 4: Definition of $\textbf{Ratio}$, which is used to determine whether two neighboring primitives are watertight. (a) An example of a primitive pair configuration. (b) A planar view of (a). In (b), $q_1q_2$ is the bottom edge of plane $P_{i + 1}$ and rests on plane $P_i$. Points $p_1$ and $p_2$ are the intersections between line $q_1q_2$ and the edges of plane $P_i$. In this case, $\textbf{Ratio} = \min(\frac{p_1q_2}{p_1p_2}, \frac{p_1q_2}{q_1q_2})$. If $\textbf{Ratio} \ge \tau$, the two primitives are watertight; otherwise, they are not. Experimentally, $\tau = 0.86$. $\textbf{Ratio}$ is defined similarly for the other pairwise spatial configurations in Figure 5.
  • Figure 5: The eight structural patterns for pairs of neighboring fitted plane primitives used in our structural detection algorithm.
  • ...and 7 more figures
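
The watertightness test in the Figure 4 caption can be sketched in a few lines. The code below is a literal transcription of the caption's formula for that one configuration, not the authors' code: it assumes the four points are given as 2D coordinates in the planar view of panel (b), and the names `ratio`, `is_watertight`, and `TAU` are hypothetical. Only the formula and the threshold $\tau = 0.86$ come from the caption.

```python
import math

TAU = 0.86  # threshold reported in the Figure 4 caption


def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def ratio(p1, p2, q1, q2):
    """Ratio = min(|p1q2| / |p1p2|, |p1q2| / |q1q2|), as in the caption.

    p1, p2: intersections of line q1q2 with the edges of plane P_i.
    q1, q2: endpoints of the bottom edge of plane P_{i+1}.
    Assumption: the points are arranged as in panel (b) of Figure 4.
    """
    shared = dist(p1, q2)
    return min(shared / dist(p1, p2), shared / dist(q1, q2))


def is_watertight(p1, p2, q1, q2, tau=TAU):
    """Two primitives are watertight when Ratio >= tau."""
    return ratio(p1, p2, q1, q2) >= tau


# Hypothetical coordinates: the two edges are offset by one unit.
print(is_watertight((0, 0), (10, 0), (-1, 0), (9, 0)))  # True  (Ratio = 0.9)
# A larger offset drops the ratio below the threshold.
print(is_watertight((0, 0), (10, 0), (-5, 0), (5, 0)))  # False (Ratio = 0.5)
```

Taking the minimum of the two fractions means the shared segment has to cover most of both edges, so a short edge resting on a much longer one is not counted as watertight.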