Combinatorial Complex Score-based Diffusion Modelling through Stochastic Differential Equations
Adrien Carrel
TL;DR
This work tackles the challenge of generative modeling for complex topologies by introducing Combinatorial Complex Score-based Diffusion (CCSD), a unified framework that treats generation through stochastic differential equations to produce combinatorial complexes (CCs) rather than only graphs. CCSD extends score-based diffusion to higher-order topologies via lifting procedures (loop-based and path-based) that map graphs into CCs, enabling the generation of hypergraphs and simplicial-like structures while preserving higher-order relations. The framework defines forward diffusion on CC components, learns partial score functions with dedicated neural architectures, and uses reverse-time SDEs or probability-flow ODEs for sampling, including conditional sampling and imputation. The authors provide a theoretical basis for CC representations (Dimension-Constrained CCs and FCCs), present novel CC-specific score networks and Hodge-based metrics, and deliver a Python library to train and sample CCs. Empirically, CCSD achieves competitive performance on molecule and graph generation tasks and demonstrates promising capabilities for higher-order objects, supported by a public software package and extensive evaluation metrics tailored to CCs.
Abstract
Graph structures offer a versatile framework for representing diverse patterns in nature and in complex systems, with applications in domains such as molecular chemistry, social networks, and transportation systems. While diffusion models have excelled at generating many kinds of objects, generating graphs remains challenging. This thesis explores the potential of score-based generative models for generating such objects by modelling them as combinatorial complexes, which are powerful topological structures that encompass higher-order relationships. We propose a unified framework based on stochastic differential equations. It not only generalizes generation to complex objects such as graphs and hypergraphs, but also unifies existing generative modelling approaches such as Score Matching with Langevin Dynamics and Denoising Diffusion Probabilistic Models. This overcomes limitations of existing frameworks that focus solely on graph generation, opening up new possibilities in generative AI. Experimental results show that our framework can generate these complex objects and is competitive with state-of-the-art approaches on plain graph and molecule generation tasks.
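To make the SDE view of generation concrete, the following is a minimal one-dimensional sketch of reverse-time sampling. It is not the CCSD library's API and does not operate on combinatorial complexes. It assumes a variance-preserving SDE (the continuous-time counterpart of Denoising Diffusion Probabilistic Models) with a constant noise rate `beta`, and Gaussian toy data, so that the score is known in closed form and no neural network is needed. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def marginal_stats(t, mu0, sigma0, beta):
    """Mean and variance of x_t under dx = -0.5*beta*x dt + sqrt(beta) dw,
    assuming the toy data distribution x_0 ~ N(mu0, sigma0^2)."""
    decay = np.exp(-beta * t)
    mean = mu0 * np.sqrt(decay)
    var = sigma0**2 * decay + (1.0 - decay)
    return mean, var


def score(x, t, mu0, sigma0, beta):
    """Exact score of the Gaussian marginal p_t (a stand-in for a learned score network)."""
    mean, var = marginal_stats(t, mu0, sigma0, beta)
    return -(x - mean) / var


def reverse_sde_sample(n_samples, mu0=2.0, sigma0=0.5, beta=10.0,
                       T=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama integration of the reverse-time SDE from t = T down to t = 0."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    g = np.sqrt(beta)                      # diffusion coefficient g(t), constant here
    x = rng.standard_normal(n_samples)     # start from the N(0, 1) prior
    for i in range(n_steps):
        t = T - i * dt
        forward_drift = -0.5 * beta * x    # f(x, t) of the forward SDE
        reverse_drift = forward_drift - g**2 * score(x, t, mu0, sigma0, beta)
        noise = rng.standard_normal(n_samples)
        # Step backwards in time: subtract the drift, add fresh noise.
        x = x - reverse_drift * dt + g * np.sqrt(dt) * noise
    return x


samples = reverse_sde_sample(20000)
print(f"sample mean = {samples.mean():.3f}, sample std = {samples.std():.3f}")
```

With the default (assumed) settings, the sample mean and standard deviation should land close to the toy data parameters `mu0` and `sigma0`. In a learned setting, the closed-form `score` would be replaced by a trained score network evaluated on the object being generated.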
