PolyGraph Discrepancy: a classifier-based metric for graph generation

Markus Krimmel; Philip Hartout; Karsten Borgwardt; Dexiong Chen

TL;DR

Graph generative models are usually evaluated with MMD-based metrics, which give neither an absolute measure of quality nor values that can be compared across graph descriptors. The paper proposes PolyGraph Discrepancy (PGD), a classifier-based approach that approximates a variational lower bound on the Jensen-Shannon distance between real and generated graphs and yields scores in the unit interval. PGD supports single-descriptor estimation as well as multi-descriptor aggregation with a principled descriptor-selection step, and uses TabPFN as a fast discriminative model. Experiments show that PGD tracks perturbations, correlates with model quality, and gives more robust benchmark results than MMD. The authors also release the open-source PolyGraph library for standardized evaluation.

Abstract

Existing methods for evaluating graph generative models primarily rely on Maximum Mean Discrepancy (MMD) metrics based on graph descriptors. While these metrics can rank generative models, they do not provide an absolute measure of performance. Their values are also highly sensitive to extrinsic parameters, namely kernel and descriptor parametrization, making them incomparable across different graph descriptors. We introduce PolyGraph Discrepancy (PGD), a new evaluation framework that addresses these limitations. It approximates the Jensen-Shannon distance of graph distributions by fitting binary classifiers to distinguish between real and generated graphs, featurized by these descriptors. The data log-likelihood of these classifiers approximates a variational lower bound on the JS distance between the two distributions. Resulting metrics are constrained to the unit interval [0,1] and are comparable across different graph descriptors. We further derive a theoretically grounded summary metric that combines these individual metrics to provide a maximally tight lower bound on the distance for the given descriptors. Thorough experiments demonstrate that PGD provides a more robust and insightful evaluation compared to MMD metrics. The PolyGraph framework for benchmarking graph generative models is made publicly available at https://github.com/BorgwardtLab/polygraph-benchmark.
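The classifier-based bound described in the abstract can be illustrated with a minimal sketch. It assumes the standard variational bound JSD(P, Q) >= log 2 + 0.5 E_P[log D(x)] + 0.5 E_Q[log(1 - D(x))], where D(x) is the classifier's predicted probability that x is real, evaluated on held-out samples. The function names, the clipping to [0, 1], and the use of a maximum to combine descriptors are illustrative assumptions, not the PolyGraph library's API, and no TabPFN classifier is fitted here.

```python
import math


def js_lower_bound(p_real, p_generated, eps=1e-12):
    """Variational lower bound on the JS divergence, in bits.

    p_real:      classifier probabilities of "real" on held-out real graphs
    p_generated: classifier probabilities of "real" on held-out generated graphs
    The result is clipped to [0, 1] (an assumption of this sketch).
    """
    # Average log-likelihood of the correct label under the classifier.
    ll_real = sum(math.log(max(p, eps)) for p in p_real) / len(p_real)
    ll_gen = sum(math.log(max(1.0 - p, eps)) for p in p_generated) / len(p_generated)
    bound_nats = math.log(2.0) + 0.5 * (ll_real + ll_gen)
    # Dividing by log 2 converts nats to bits, so the divergence is at most 1.
    return min(1.0, max(0.0, bound_nats / math.log(2.0)))


def js_distance_estimate(p_real, p_generated):
    """The JS distance is the square root of the JS divergence."""
    return math.sqrt(js_lower_bound(p_real, p_generated))


def aggregate(per_descriptor_scores):
    """Hypothetical summary: the largest lower bound is the tightest one."""
    return max(per_descriptor_scores.values())


# Usage with made-up classifier outputs for two hypothetical descriptors.
scores = {
    "degree": js_distance_estimate([0.9, 0.8, 0.7], [0.2, 0.3, 0.1]),
    "clustering": js_distance_estimate([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
}
summary = aggregate(scores)
```

A classifier that cannot separate the two sets (all probabilities at 0.5) gives a score of 0, while a perfect classifier gives 1, which is what makes the per-descriptor scores comparable on a common scale.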
