Conformal Prediction and MLLM aided Uncertainty Quantification in Scene Graph Generation
Sayak Nag, Udita Ghosh, Calvin-Khang Ta, Sarosij Bose, Jiachen Li, Amit K Roy Chowdhury
TL;DR
This work introduces PC-SGG, a post-hoc, model-agnostic Conformal Prediction framework for uncertainty quantification in Scene Graph Generation. It builds class-conditional prediction sets for objects and predicates, combines them into triplet sets with formal coverage guarantees, and then refines these sets with an MLLM-based plausibility filter that uses MCQA prompts and in-context learning. Empirical evaluation on VG150 across five SGG backbones shows that PC-SGG provides calibrated uncertainty estimates, improves tail-class recall, and substantially reduces set sizes with minimal impact on overall coverage. Together, these components yield diverse, plausible scene graphs whose coverage guarantees make them suitable for downstream tasks such as robotics and multimodal reasoning.
Abstract
Scene Graph Generation (SGG) aims to represent visual scenes by identifying objects and their pairwise relationships, providing a structured understanding of image content. However, inherent challenges such as long-tailed class distributions and prediction variability make uncertainty quantification necessary for SGG to be practically viable. In this paper, we introduce a novel Conformal Prediction (CP) based framework, adaptable to any existing SGG method, that quantifies the method's predictive uncertainty by constructing well-calibrated prediction sets over its generated scene graphs. These scene graph prediction sets are designed to achieve statistically rigorous coverage guarantees. Additionally, to ensure these prediction sets contain the most practically interpretable scene graphs, we design an effective MLLM-based post-processing strategy that selects the most visually and semantically plausible scene graphs within them. We show that our proposed approach can produce diverse possible scene graphs from an image, assess the reliability of SGG methods, and improve overall SGG performance.
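
To make the idea of class-conditional prediction sets concrete, the following is a minimal sketch of generic split conformal prediction with per-class thresholds, followed by a Cartesian-product combination into triplet sets. It is an illustration under stated assumptions, not the authors' implementation: the nonconformity score `1 - p(true class)`, the function names (`class_conditional_thresholds`, `prediction_set`, `triplet_set`), and the product-based triplet construction are all hypothetical choices made here, and the exact scores and combination rule used by PC-SGG may differ.

```python
import itertools
import math

import numpy as np


def class_conditional_thresholds(probs, labels, alpha):
    """Per-class conformal thresholds from a held-out calibration split.

    probs:  (n, K) array of softmax scores from any pretrained classifier.
    labels: (n,) array of true class ids.
    alpha:  target miscoverage level per class.
    Assumption: the nonconformity score is 1 - p(true class).
    """
    n_classes = probs.shape[1]
    # Default to +inf so a class with too little calibration data is
    # always included in the set (the conservative choice).
    thresholds = np.full(n_classes, np.inf)
    for k in range(n_classes):
        scores = 1.0 - probs[labels == k, k]
        n_k = scores.size
        if n_k == 0:
            continue
        # Finite-sample corrected rank: ceil((n_k + 1) * (1 - alpha)).
        rank = math.ceil((n_k + 1) * (1.0 - alpha))
        if rank <= n_k:
            thresholds[k] = np.sort(scores)[rank - 1]
    return thresholds


def prediction_set(prob_row, thresholds):
    """All classes whose nonconformity score is within that class's threshold."""
    return [k for k in range(len(prob_row)) if 1.0 - prob_row[k] <= thresholds[k]]


def triplet_set(subject_set, predicate_set, object_set):
    """Candidate (subject, predicate, object) triplets as a Cartesian product."""
    return list(itertools.product(subject_set, predicate_set, object_set))
```

In this sketch, thresholds would be calibrated separately for the object classifier and the predicate classifier of a given SGG backbone, and `triplet_set` would be applied to each detected subject-object pair. If the three component sets each miss the true label with probability at most their own miscoverage level, a union bound implies the triplet set misses the true triplet with probability at most the sum of those levels; whether PC-SGG uses this bound or a tighter argument is not stated in the text above.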
