Bures-Wasserstein Flow Matching for Graph Generation

Keyue Jiang, Jiahao Cui, Xiaowen Dong, Laura Toni

TL;DR

This work tackles the challenge of generating graphs by constructing a probability path that respects the joint evolution of nodes and edges, rather than interpolating components independently with linear methods. It introduces BWFlow, which models graphs as GraphMRFs and uses Bures-Wasserstein optimal transport to derive a smooth, globally coherent interpolation and velocity field along the graph manifold. The framework unifies continuous and discrete graph generation through closed-form BW interpolants and velocities, leading to improved training stability and sampling efficiency, with strong results on plain graphs and molecular datasets. By aligning graph generation with OT on non-Euclidean graph spaces, BWFlow offers a principled, scalable approach to joint graph structure and feature evolution, with promising directions for multi-relational graphs and efficiency refinements.

Abstract

Graph generation has emerged as a critical task in fields ranging from drug discovery to circuit design. Contemporary approaches, notably diffusion and flow-based models, have achieved solid graph generative performance through constructing a probability path that interpolates between reference and data distributions. However, these methods typically model the evolution of individual nodes and edges independently and use linear interpolations to build the path. This disentangled interpolation breaks the interconnected patterns of graphs, making the constructed probability path irregular and non-smooth, which causes poor training dynamics and faulty sampling convergence. To address the limitation, this paper first presents a theoretically grounded framework for probability path construction in graph generative models. Specifically, we model the joint evolution of the nodes and edges by representing graphs as connected systems parameterized by Markov random fields (MRF). We then leverage the optimal transport displacement between MRF objects to design a smooth probability path that ensures the co-evolution of graph components. Based on this, we introduce BWFlow, a flow-matching framework for graph generation that utilizes the derived optimal probability path to benefit the training and sampling algorithm design. Experimental evaluations in plain graph generation and molecule generation validate the effectiveness of BWFlow with competitive performance, better training convergence, and efficient sampling.
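The contrast the abstract draws between linear interpolation and an optimal-transport displacement can be illustrated with the standard closed-form Bures-Wasserstein geometry between Gaussians. The sketch below is not the paper's implementation: it uses the textbook formulas for two Gaussians with positive-definite covariances, and the helper names (`psd_sqrt`, `bw_distance_sq`, `bw_interpolate`) are hypothetical.

```python
import numpy as np

def psd_sqrt(mat):
    """Square root of a symmetric PSD matrix via eigendecomposition."""
    mat = 0.5 * (mat + mat.T)  # remove tiny numerical asymmetry
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def bw_distance_sq(m0, S0, m1, S1):
    """Squared 2-Wasserstein distance between N(m0, S0) and N(m1, S1)."""
    r0 = psd_sqrt(S0)
    cross = psd_sqrt(r0 @ S1 @ r0)
    return float(np.sum((m0 - m1) ** 2) + np.trace(S0 + S1 - 2.0 * cross))

def bw_interpolate(m0, S0, m1, S1, t):
    """Point at time t on the OT geodesic (assumes S0 is positive definite)."""
    r0 = psd_sqrt(S0)
    r0_inv = np.linalg.inv(r0)
    # Optimal transport map between the two covariances.
    T = r0_inv @ psd_sqrt(r0 @ S1 @ r0) @ r0_inv
    A = (1.0 - t) * np.eye(len(S0)) + t * T
    return (1.0 - t) * m0 + t * m1, A @ S0 @ A

# Toy example with commuting (diagonal) covariances.
m0, m1 = np.zeros(2), np.zeros(2)
S0, S1 = np.diag([1.0, 4.0]), np.diag([9.0, 16.0])

_, S_bw = bw_interpolate(m0, S0, m1, S1, 0.5)
S_lin = 0.5 * S0 + 0.5 * S1

print(np.diag(S_bw))   # BW midpoint: diagonal (4, 9)
print(np.diag(S_lin))  # linear midpoint: diagonal (5, 10)
```

In this toy case the BW path interpolates the standard deviations linearly, so the midpoint differs from the entry-wise linear average of the covariances. For non-commuting covariances the two paths differ in the off-diagonal structure as well.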

Paper Structure

This paper contains 83 sections, 10 theorems, 117 equations, 9 figures, 17 tables, 4 algorithms.

Key Result

Proposition 1

Consider two same-sized graphs ${\mathcal{G}}_0 \sim p\left({\mathcal{X}}_0, {\mathcal{E}}_0\right)$ and ${\mathcal{G}}_1 \sim p\left({\mathcal{X}}_1, {\mathcal{E}}_1\right)$ that share ${\bm{V}}$, each described by the distribution in Definition 2 (Graph Markov Random Fields). When the graphs are equipped with graph Lap as $\nu \rightarrow 0$ and $\beta$ is a constant related to the norm of ${\bm{V}}^\dagger$. The pro

Figures (9)

  • Figure 1: Probability path visualization. Since the probability is intractable, the average maximum mean discrepancy ratio (y-axis) of graph statistics between interpolants and the data points is used as a proxy for the probability. Lower means closer to the data distribution (details in the appendix on plain-graph experiments).
  • Figure 2: Schematic overview of BWFlow, which consists of six steps: a) sample the marginal graph conditions $G_0$ and $G_1$; b) convert the graphs to MRFs; c) interpolate to obtain intermediate points; d) convert back to obtain $G_t$; e) train the velocity on $G_t$; and f) generate new points with the trained velocity.
  • Figure 3: Ablation studies for Bures-Wasserstein Flow Matching.
  • Figure 4: Techniques for manipulating probability path.
  • Figure 5: BW probability paths for planar and tree graphs.
  • ...and 4 more figures
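Steps b) to d) of the Figure 2 pipeline can be sketched on a toy example. The construction below is an assumption made for illustration, not the paper's GraphMRF definition: each graph is mapped to a Gaussian covariance $(L + \nu I)^{-1}$ built from its Laplacian $L$ with a hypothetical regularizer `NU`, the covariances are interpolated along the Bures-Wasserstein geodesic, and the result is mapped back to a weighted adjacency matrix. All function names are hypothetical.

```python
import numpy as np

NU = 0.1  # hypothetical regularizer so that L + NU * I is invertible

def psd_sqrt(mat):
    """Square root of a symmetric PSD matrix via eigendecomposition."""
    mat = 0.5 * (mat + mat.T)
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def graph_to_cov(W, nu=NU):
    """Step b (assumed form): adjacency -> Laplacian -> covariance."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.inv(L + nu * np.eye(len(W)))

def cov_to_graph(S, nu=NU):
    """Step d (assumed form): covariance -> Laplacian -> adjacency."""
    L = np.linalg.inv(S) - nu * np.eye(len(S))
    W = -L
    np.fill_diagonal(W, 0.0)  # edge weights are the negated off-diagonals
    return W

def bw_graph_interpolant(W0, W1, t, nu=NU):
    """Step c: Bures-Wasserstein geodesic between the two covariances."""
    S0, S1 = graph_to_cov(W0, nu), graph_to_cov(W1, nu)
    r0 = psd_sqrt(S0)
    r0_inv = np.linalg.inv(r0)
    T = r0_inv @ psd_sqrt(r0 @ S1 @ r0) @ r0_inv
    A = (1.0 - t) * np.eye(len(W0)) + t * T
    return cov_to_graph(A @ S0 @ A, nu)

# Toy endpoints: a 3-node path graph and a 3-node triangle.
path = np.array([[0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0]])
triangle = np.ones((3, 3)) - np.eye(3)

W_half = bw_graph_interpolant(path, triangle, 0.5)
print(np.round(W_half, 3))  # intermediate weighted adjacency G_t at t = 0.5
```

The interpolant recovers the two endpoint graphs at $t = 0$ and $t = 1$. In between, all edge weights change jointly because the transport map acts on the whole covariance rather than on individual entries. Steps e) and f), training a velocity model on $G_t$ and sampling with it, are not covered by this sketch.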

Theorems & Definitions (13)

  • Definition 1: Wasserstein Distance
  • Definition 2: Graph Markov Random Fields
  • Proposition 1: Bures-Wasserstein Distance
  • Proposition 2: Bures-Wasserstein interpolation
  • Proposition 3: Bures-Wasserstein velocity
  • Definition 3: Graph Markov Random Fields
  • Lemma 1
  • Lemma 2
  • Proposition 4
  • Proposition 5
  • ...and 3 more
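
For orientation, the quantities named in Definition 1 and Propositions 1 to 3 have well-known counterparts for Gaussian measures. The formulas below are these standard Gaussian forms, written with generic symbols ($p_0$, $p_1$, $m_i$, $\Sigma_i$, $T$, $A_t$) that are assumptions of this note. They are not the paper's graph-specific statements, which are not reproduced in this summary.

```latex
% 2-Wasserstein distance between distributions p_0 and p_1,
% where \Pi(p_0, p_1) is the set of couplings with these marginals:
W_2^2(p_0, p_1) = \inf_{\pi \in \Pi(p_0, p_1)} \int \lVert x - y \rVert^2 \, \mathrm{d}\pi(x, y)

% Closed form for Gaussians p_i = \mathcal{N}(m_i, \Sigma_i) (Bures-Wasserstein distance):
W_2^2(p_0, p_1) = \lVert m_0 - m_1 \rVert^2
  + \operatorname{Tr}\!\left( \Sigma_0 + \Sigma_1
  - 2 \left( \Sigma_0^{1/2} \Sigma_1 \Sigma_0^{1/2} \right)^{1/2} \right)

% Geodesic (displacement interpolation) for t \in [0, 1], assuming \Sigma_0 \succ 0:
T = \Sigma_0^{-1/2} \left( \Sigma_0^{1/2} \Sigma_1 \Sigma_0^{1/2} \right)^{1/2} \Sigma_0^{-1/2},
\qquad A_t = (1 - t) I + t \, T,
\qquad m_t = (1 - t) m_0 + t \, m_1,
\qquad \Sigma_t = A_t \Sigma_0 A_t

% Time derivatives along the geodesic (velocity of the mean and covariance):
\dot{m}_t = m_1 - m_0,
\qquad \dot{\Sigma}_t = (T - I) \Sigma_0 A_t + A_t \Sigma_0 (T - I)
```

The covariance velocity follows from differentiating $\Sigma_t = A_t \Sigma_0 A_t$ with $\dot{A}_t = T - I$.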