SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

TL;DR

SG-PGM reframes 3D scene graph alignment as partial graph matching and fuses semantic graph embeddings with geometry via a Point to Scene Graph Fusion module. A Sinkhorn-based affinity with differentiable Soft-topK enables explicit one-to-one partial matching, while Super-point Matching Rescoring injects semantic priors into registration, reducing false correspondences in low-overlap scenes. The approach yields significant gains in scene-graph alignment, overlap checking, and downstream point-cloud registration and mosaicking, and demonstrates robustness to scene changes. By reusing strong geometric features from registration backbones and integrating a differentiable matching pipeline, SG-PGM achieves faster, more accurate downstream results and offers a scalable, decoupled framework for 3D scene understanding.
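
As a rough illustration of the matching step summarised above, the following Python sketch normalises a node-affinity matrix with Sinkhorn iterations and then keeps the k strongest one-to-one pairs. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `sinkhorn` and `topk_matches`, the temperature `tau`, the toy score matrix, and the fixed value of k are all hypothetical, and a hard greedy top-k stands in for the differentiable Soft-topK, whose details this summary does not give.

```python
import numpy as np

def _logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis.
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))

def sinkhorn(scores, n_iters=20, tau=0.1):
    """Alternately normalise rows and columns of exp(scores / tau) in log space."""
    log_p = np.asarray(scores, dtype=float) / tau
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)  # each row sums to 1
        log_p = log_p - _logsumexp(log_p, axis=0)  # each column sums to 1
    return np.exp(log_p)

def topk_matches(p, k):
    """Greedily pick the k highest entries so that no row or column is reused.

    Assumption: k <= min(p.shape). This hard selection is a non-differentiable
    stand-in for a soft top-k operator.
    """
    p = p.copy()
    matches = []
    for _ in range(k):
        i, j = np.unravel_index(np.argmax(p), p.shape)
        matches.append((int(i), int(j)))
        p[i, :] = -np.inf  # row i is now taken
        p[:, j] = -np.inf  # column j is now taken
    return matches

# Toy affinities (hypothetical): nodes 0-2 have a clear counterpart,
# nodes 3 and 4 on each side have no distinctive match.
scores = np.full((5, 5), 0.1)
scores[0, 0], scores[1, 1], scores[2, 2] = 0.9, 0.8, 0.7

p = sinkhorn(scores)
matches = topk_matches(p, k=3)
print(sorted(matches))
```

In this toy case the two unmatched nodes on each side split their probability mass, so their entries stay at or below 0.5 while the three true pairs stay clearly higher. That gap is what lets a top-k cut keep only the overlapping part of the two graphs.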

Abstract

Scene graphs have recently been introduced into 3D spatial understanding as a comprehensive representation of the scene. The alignment between 3D scene graphs is the first step of many downstream tasks such as scene-graph-aided point cloud registration, mosaicking, overlap checking, and robot navigation. In this work, we treat 3D scene graph alignment as a partial graph-matching problem and propose to solve it with a graph neural network. We reuse the geometric features learned by a point cloud registration method and associate the clustered point-level geometric features with the node-level semantic features via our feature fusion module. Partial matching is enabled by a learnable method that selects the top-k most similar node pairs. Downstream tasks such as point cloud registration are then carried out by running a pre-trained registration network within the matched regions. We further propose a point-matching rescoring method that uses the node-wise alignment of the 3D scene graphs to reweight the matching candidates from a pre-trained point cloud registration method. This reduces false point correspondences, especially in low-overlap cases. Experiments show that our method improves alignment accuracy by 10 to 20% in low-overlap and random-transformation scenarios and outperforms existing work in multiple downstream tasks.
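
The rescoring idea in the abstract can be sketched in a few lines of Python. This is only an illustration under assumptions: the abstract says that node-wise alignment is used to reweight matching candidates but does not give the formula, so the element-wise product below, the function name `rescore_candidates`, and all array names and values are hypothetical.

```python
import numpy as np

def rescore_candidates(cand_scores, src_node_of, ref_node_of, node_match_prob):
    """Reweight candidate scores by the alignment probability of their objects.

    cand_scores:     (S, R) matching scores from a registration network
    src_node_of:     (S,)   scene-graph node index of each source candidate
    ref_node_of:     (R,)   scene-graph node index of each reference candidate
    node_match_prob: (N_src, N_ref) node alignment probabilities
    Assumption: a plain element-wise product is used as the reweighting rule.
    """
    # Look up, for every candidate pair, how likely their two objects are aligned.
    prior = node_match_prob[src_node_of[:, None], ref_node_of[None, :]]
    return cand_scores * prior

# Toy data (hypothetical): 3 source and 3 reference candidates over 2 objects each.
cand_scores = np.array([[0.5, 0.6, 0.1],
                        [0.4, 0.2, 0.1],
                        [0.1, 0.3, 0.7]])
src_node_of = np.array([0, 0, 1])
ref_node_of = np.array([0, 1, 1])
node_match_prob = np.array([[0.9, 0.1],
                            [0.1, 0.9]])

rescored = rescore_candidates(cand_scores, src_node_of, ref_node_of, node_match_prob)
print(cand_scores.argmax(axis=1), rescored.argmax(axis=1))
```

Before rescoring, the first source candidate prefers a reference candidate that lies on a different object. After the semantic prior is applied, its best match moves to the candidate on the aligned object, which is the kind of false correspondence the abstract says is reduced.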

Paper Structure (28 sections, 20 equations, 10 figures, 14 tables)

This paper contains 28 sections, 20 equations, 10 figures, 14 tables.

Figures (10)

  • Figure 1: SG-PGM: partial graph matching for 3D scene graph alignment. (a) Semantic and geometric features are fused for object-wise matching between fragments, which supports downstream tasks such as (b) overlap checking and (c) point cloud registration.
  • Figure 2: Network overview of the proposed system. (a) shows feature extraction and the proposed Point to Scene Graph Feature Fusion for a single point cloud and its associated 3D scene graph. (b) shows the alignment stage between the source and reference scene graphs, and the point cloud registration stage guided by the proposed Super-point Matching Rescoring method. The pretrained point cloud encoder of the registration method is reused, and its weights are frozen during training.
  • Figure 3: Scene graph encoder with GATv2 layers and learnable skip connections.
  • Figure 4: The P2SG fusion module projects point-wise geometric features to a node-wise geometric embedding and combines it with the semantic scene graph feature.
  • Figure 5: Long-range cross-object geometric features are gathered by the transformer in the registration method of Qin et al. (2022) [qin2022geometric]. Points in red circles are difficult to match without taking nearby objects as a reference.
  • ...and 5 more figures