Generative Modelling of Structurally Constrained Graphs
Manuel Madeira, Clement Vignac, Dorina Thanou, Pascal Frossard
TL;DR
ConStruct generates graphs that satisfy hard, domain-specific structural constraints by means of a constrained discrete graph diffusion framework. It combines an edge-absorbing forward noise model with a property-preserving projector, so that both the forward and reverse diffusion steps stay within a graph class defined by an edge-deletion-invariant property (e.g., planarity, acyclicity). The method performs strongly on synthetic datasets and on digital pathology graphs, achieving near-perfect constraint validity and improving data validity by up to 71.1 percentage points on TLS-bearing cell graphs. An edge-blocking hash table and incremental constraint checks keep the sampling overhead modest, which makes constrained diffusion practical for real-world applications such as biomedical graph data augmentation and molecular design.
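To illustrate why an edge-absorbing forward process stays inside an edge-deletion-invariant class, here is a minimal sketch in Python. It is not the authors' implementation: the function name `absorb_edges`, the constant per-step survival probability `keep_prob`, and the toy graph are all assumptions made for illustration, and the paper's actual noise schedule is not reproduced here. The point is only that noising never adds an edge, so each graph in the trajectory is a subgraph of the previous one.

```python
import random


def absorb_edges(edges, keep_prob, rng):
    """One hypothetical forward step: every edge independently survives with
    probability keep_prob, otherwise it is absorbed into the no-edge state.
    No edge is ever added, so the result is a subgraph of the input."""
    return {e for e in sorted(edges) if rng.random() < keep_prob}


rng = random.Random(0)
# Toy undirected graph: a 4-cycle, stored as (u, v) pairs with u < v.
graph = {(0, 1), (1, 2), (2, 3), (0, 3)}

trajectory = [graph]
for _ in range(5):
    trajectory.append(absorb_edges(trajectory[-1], keep_prob=0.7, rng=rng))
```

Because deleting edges cannot break planarity or acyclicity, every graph in `trajectory` satisfies any such property that the starting graph satisfies.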
Abstract
Graph diffusion models have emerged as state-of-the-art techniques in graph generation; yet, integrating domain knowledge into these models remains challenging. Domain knowledge is particularly important in real-world scenarios, where invalid generated graphs hinder deployment in practical applications. Unconstrained and conditioned graph diffusion models fail to guarantee such domain-specific structural properties. We present ConStruct, a novel framework that enables graph diffusion models to incorporate hard constraints on specific properties, such as planarity or acyclicity. Our approach ensures that the sampled graphs remain within the domain of graphs that satisfy the specified property throughout the entire trajectory in both the forward and reverse processes. This is achieved by introducing an edge-absorbing noise model and a new projector operator. ConStruct demonstrates versatility across several structural and edge-deletion invariant constraints and achieves state-of-the-art performance for both synthetic benchmarks and attributed real-world datasets. For example, by incorporating planarity constraints in digital pathology graph datasets, the proposed method outperforms existing baselines, improving data validity by up to 71.1 percentage points.
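The role of the projector can be sketched as follows, under stated assumptions. The sketch takes undirected acyclicity (the graph must remain a forest) as the constraint and assumes the projector considers the edges proposed at a reverse step one at a time in random order, keeping an edge only if the constraint still holds. The names `DisjointSet` and `project_acyclic` are hypothetical, and the union-find structure is a stand-in for an incremental constraint check, not the paper's code.

```python
import random


class DisjointSet:
    """Union-find over node indices, used as an incremental cycle check."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the components of a and b; return False if already joined."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def project_acyclic(num_nodes, current_edges, candidate_edges, rng):
    """Hypothetical projector: add candidate edges in random order and
    reject any edge that would close a cycle in the current forest."""
    dsu = DisjointSet(num_nodes)
    for u, v in current_edges:
        dsu.union(u, v)
    accepted = set(current_edges)
    candidates = [e for e in candidate_edges if e not in accepted]
    rng.shuffle(candidates)
    for u, v in candidates:
        # u and v in the same component means the edge would create a cycle.
        if dsu.union(u, v):
            accepted.add((u, v))
    return accepted


current = {(0, 1)}
proposed = [(1, 2), (0, 2), (2, 3)]  # (1, 2) and (0, 2) cannot both be kept
result = project_acyclic(4, current, proposed, random.Random(0))
```

In this toy example `result` always contains `(0, 1)` and `(2, 3)` and exactly one of `(1, 2)` and `(0, 2)`, whichever the random order reaches first. For planarity, the same loop would apply with a planarity test in place of the union-find check.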
