Introspection in Learned Semantic Scene Graph Localisation
Manshika Charvi Bissessur, Efimia Panagiotaki, Daniele De Martini
TL;DR
This paper investigates how semantic cues influence localisation robustness by training a semantics-only, graph-based localisation model and performing thorough post-hoc introspection. It demonstrates that Integrated Gradients and Attention Weights provide reliable attributions for object-class importance, revealing a TF-IDF-like down-weighting of frequent classes and a bias toward distinctive landmarks. The methodology combines a GNN backbone with perturbation-based class-importance analyses and fidelity tests to yield explainable registration under challenging conditions. The findings highlight that semantically salient relations, rather than mere geometry or frequency, drive robust localisation, with practical implications for safety-critical robotics and interpretable SLAM systems. The work also outlines limitations of a purely semantic setup and suggests future integration of geometric cues and evaluation on diverse datasets to strengthen explanations in real-world deployments.
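For readers unfamiliar with the TF-IDF analogy, the sketch below computes a plain inverse-document-frequency weight per object class over a handful of scenes. It only illustrates the analogy: the scenes and class names are hypothetical, and the formula is standard IDF, not the weighting the model actually learns.

```python
import math
from collections import Counter

# Hypothetical scenes, each described by the set of object classes it contains.
scenes = [
    {"tree", "car", "building"},
    {"tree", "car"},
    {"tree", "fountain"},
    {"tree", "car", "statue"},
]

# Number of scenes in which each class appears (its "document frequency").
doc_freq = Counter(cls for scene in scenes for cls in scene)

# Standard IDF: classes present in every scene get weight 0,
# classes seen in few scenes get a larger weight.
idf = {cls: math.log(len(scenes) / n) for cls, n in doc_freq.items()}

for cls, weight in sorted(idf.items(), key=lambda kv: kv[1]):
    print(f"{cls:10s} {weight:.3f}")
```

Under this toy weighting, a class present in every scene (here `tree`) carries no discriminative weight, while a class seen once (here `fountain`) carries the most, which is the qualitative pattern the analogy refers to.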
Abstract
This work investigates how semantics influence localisation performance and robustness in a learned, self-supervised, contrastive semantic localisation framework. After training a localisation network on both original and perturbed maps, we conduct a thorough post-hoc introspection analysis to probe whether the model filters environmental noise and prioritises distinctive landmarks over routine clutter. We validate several interpretability methods and present a comparative reliability analysis, in which Integrated Gradients and Attention Weights consistently emerge as the most reliable probes of learned behaviour. A semantic class ablation further reveals an implicit weighting in which frequent objects are often down-weighted. Overall, the results indicate that the model learns noise-robust, semantically salient relations that define a place, thereby enabling explainable registration under challenging visual and structural variations.
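As a rough illustration of the Integrated Gradients attribution mentioned above, the following sketch applies the standard formulation to a toy scorer. The model, weights, input features, and all-zero baseline are assumptions chosen for illustration; they stand in for the localisation network and are not the architecture or setup used in this work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w):
    """Toy differentiable scorer (assumption), standing in for the real network."""
    return sigmoid(np.dot(w, x))

def model_grad(x, w):
    """Analytic gradient of the toy scorer with respect to its input x."""
    s = model(x, w)
    return s * (1.0 - s) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Approximate Integrated Gradients with a midpoint Riemann sum
    along the straight-line path from baseline to x."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.array(
        [model_grad(baseline + a * (x - baseline), w) for a in alphas]
    )
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical input: one feature per semantic class (e.g. object counts).
w = np.array([2.0, -1.0, 0.1])
x = np.array([1.0, 3.0, 5.0])
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w)

# Completeness check: attributions should sum to f(x) - f(baseline).
completeness_gap = abs(
    attributions.sum() - (model(x, w) - model(baseline, w))
)
print(attributions, completeness_gap)
```

The completeness check at the end is one simple sanity test for an attribution method: if the attributions do not sum to the change in model output between baseline and input, the path integral is under-resolved and `steps` should be increased.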
