Edge-Centric Relational Reasoning for 3D Scene Graph Prediction
Yanni Ma, Hao Liu, Yulan Guo, Theo Gevers, Martin R. Oswald
TL;DR
The paper tackles 3D scene graph prediction by identifying the limitations of object-centric message passing and introducing LEO, a three-stage framework that progresses from edge-centric to object-centric reasoning. LEO uses a link-prediction module to softly weight edges, transforms the scene graph into a line graph to perform relation-level reasoning with a LineGNN, and then fuses the enriched relation features back into the original graph for final predictions. This edge-to-object approach is model-agnostic and yields consistent improvements on the 3DSSG benchmark when integrated with two strong baselines, KISGP and 3DHetSGP, demonstrating the value of higher-order relational context. The results show gains on the PredCls and SGCls tasks, indicating more robust and coherent 3D scene graphs in indoor environments.
Abstract
3D scene graph prediction aims to abstract complex 3D environments into structured graphs consisting of objects and their pairwise relationships. Existing approaches typically adopt object-centric graph neural networks, where relation edge features are iteratively updated by aggregating messages from connected object nodes. However, this design inherently restricts relation representations to pairwise object context, making it difficult to capture high-order relational dependencies that are essential for accurate relation prediction. To address this limitation, we propose a Link-guided Edge-centric relational reasoning framework with Object-aware fusion, namely LEO, which enables progressive reasoning from relation-level context to object-level understanding. Specifically, LEO first predicts potential links between object pairs to suppress irrelevant edges, and then transforms the original scene graph into a line graph where each relation is treated as a node. A line graph neural network is applied to perform edge-centric relational reasoning to capture inter-relation context. The enriched relation features are subsequently integrated into the original object-centric graph to enhance object-level reasoning and improve relation prediction. Our framework is model-agnostic and can be integrated with any existing object-centric method. Experiments on the 3DSSG dataset with two competitive baselines show consistent improvements, highlighting the effectiveness of our edge-to-object reasoning paradigm.
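
The three stages described above (link-weighted edges, reasoning on a line graph, and fusion back into the object graph) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the function names (`build_line_graph`, `line_graph_step`, `fuse_into_objects`), the rule that two relations are neighbours when they share an object, the weighted-mean aggregation, and the 0.5 mixing factor are all assumptions chosen for illustration. In LEO these steps are learned network modules, and the link scores would come from the link-prediction module rather than being hard-coded.

```python
from itertools import combinations


def build_line_graph(edges):
    """Treat each relation (edge) as a node; connect two relations
    if they share at least one object (assumed adjacency rule)."""
    neighbours = {e: [] for e in range(len(edges))}
    for a, b in combinations(range(len(edges)), 2):
        if set(edges[a]) & set(edges[b]):
            neighbours[a].append(b)
            neighbours[b].append(a)
    return neighbours


def line_graph_step(features, neighbours, link_scores, mix=0.5):
    """One round of edge-centric message passing: each relation mixes its
    own feature with the link-weighted mean of its neighbouring relations."""
    updated = []
    for e, feat in enumerate(features):
        total = sum(link_scores[n] for n in neighbours[e])
        if total == 0:
            # No (relevant) neighbouring relations: keep the feature as is.
            updated.append(list(feat))
            continue
        context = [
            sum(link_scores[n] * features[n][d] for n in neighbours[e]) / total
            for d in range(len(feat))
        ]
        updated.append([(1 - mix) * f + mix * c for f, c in zip(feat, context)])
    return updated


def fuse_into_objects(num_objects, edges, rel_features):
    """Object-aware fusion (simplified): each object receives the mean of
    the enriched features of the relations it participates in."""
    dim = len(rel_features[0])
    fused = []
    for obj in range(num_objects):
        incident = [rel_features[e] for e, pair in enumerate(edges) if obj in pair]
        if not incident:
            fused.append([0.0] * dim)
            continue
        fused.append([sum(f[d] for f in incident) / len(incident) for d in range(dim)])
    return fused


# Toy scene: four objects in a chain, three candidate relations.
edges = [(0, 1), (1, 2), (2, 3)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
link_scores = [0.9, 0.8, 0.1]  # hypothetical link-prediction outputs in [0, 1]

neighbours = build_line_graph(edges)
updated = line_graph_step(features, neighbours, link_scores)
object_context = fuse_into_objects(4, edges, updated)
```

In this toy example the relation between objects 1 and 2 has two neighbouring relations, and the low link score of the third edge means it contributes little to the aggregated context. That is the intended effect of soft edge weighting: unlikely relations are down-weighted rather than removed.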
