A Review and Efficient Implementation of Scene Graph Generation Metrics
Julian Lorenz, Robin Schön, Katja Ludwig, Rainer Lienhart
TL;DR
The paper addresses the lack of precise, formal definitions for scene graph generation metrics by providing a rigorous metric framework with accompanying pseudocode. It introduces SGBench, a lightweight Python package with minimal dependencies that implements all defined metrics, alongside a public benchmarking service for comparing panoptic scene graph generation (PSGG) methods on a central platform. Through exhaustive experiments on PSGG methods, the authors demonstrate clearer, more reproducible evaluations and show how standardized metrics can expose the strengths, limitations, and trade-offs of each method. This work enables reproducible benchmarking, accelerates method development, and gives new PSGG approaches visibility on a central, accessible platform.
Abstract
Scene graph generation has emerged as a prominent research field in computer vision and has seen significant advances in recent years. Despite this progress, precise and thorough definitions of the metrics used to evaluate scene graph generation models are lacking. In this paper, we address this gap in the literature by providing a review and precise definition of commonly used metrics in scene graph generation. Our comprehensive examination clarifies the underlying principles of these metrics and can serve as a reference or introduction to scene graph metrics. Furthermore, to facilitate the use of these metrics, we introduce a standalone Python package called SGBench that efficiently implements all defined metrics, making them accessible to the research community. Additionally, we present a scene graph benchmarking web service that enables researchers to compare scene graph generation methods and increases the visibility of new methods in a central place. All of our code can be found at https://lorjul.github.io/sgbench/.
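
To give a concrete sense of the kind of metric under discussion, the following is a minimal sketch of Recall@k and a per-predicate mean Recall@k for a single image. It is not the SGBench API, and the function names `recall_at_k` and `mean_recall_at_k` are hypothetical. The sketch assumes that predicted subjects and objects have already been matched to ground-truth instances, so that a triplet reduces to `(subject_id, predicate, object_id)` and a hit is an exact tuple match. It also assumes that predictions are sorted by descending confidence. The formal definitions in the paper may differ in these details.

```python
from collections import defaultdict


def recall_at_k(gt_triplets, ranked_predictions, k):
    """Fraction of ground-truth triplets found among the top-k predictions.

    Assumption: triplets are (subject_id, predicate, object_id) tuples whose
    instance ids are already matched to ground truth.
    """
    if not gt_triplets:
        return None  # recall is undefined without ground-truth relations
    top_k = set(ranked_predictions[:k])
    hits = sum(1 for triplet in gt_triplets if triplet in top_k)
    return hits / len(gt_triplets)


def mean_recall_at_k(gt_triplets, ranked_predictions, k):
    """Average of per-predicate recalls, so each predicate class counts equally.

    Assumption: computed on a single image for brevity; a dataset-level
    variant would accumulate hits and totals over all images first.
    """
    top_k = set(ranked_predictions[:k])
    hits = defaultdict(int)
    totals = defaultdict(int)
    for subj, pred, obj in gt_triplets:
        totals[pred] += 1
        if (subj, pred, obj) in top_k:
            hits[pred] += 1
    if not totals:
        return None
    return sum(hits[p] / totals[p] for p in totals) / len(totals)


# Toy example with made-up instance ids and predicates.
gt = [(0, "on", 1), (0, "on", 2), (2, "holding", 3)]
preds = [(0, "on", 1), (2, "holding", 3), (1, "beside", 2), (0, "on", 2)]

r_at_2 = recall_at_k(gt, preds, 2)        # 2 of 3 ground-truth triplets found
mr_at_2 = mean_recall_at_k(gt, preds, 2)  # mean of "on" (1/2) and "holding" (1/1)
```

In this toy case, the frequent predicate "on" pulls plain recall down, while the mean over predicate classes weights "holding" equally. This is the usual motivation for reporting both numbers.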
