Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation

Cai Zhou, Zijie Chen, Zian Li, Jike Wang, Kaiyi Jiang, Pan Li, Rose Yu, Muhan Zhang, Stephen Bates, Tommi Jaakkola

TL;DR

This work reframes diffusion and flow-based generative modeling for symmetric data by canonicalizing inputs to a slice that breaks symmetry, training with non-equivariant backbones, and restoring invariance via Haar randomization at generation. The authors establish a factorization of invariant measures through the canonical slice, prove universality and enhanced expressivity of canonical parameterizations, and show that canonicalization reduces both score mixture and conditional flow variance, speeding up training. For molecular graph generation under $S_N \times SE(3)$, they implement a geometric-spectra–driven canonicalization (via the Fiedler vector) and a novel Canon architecture (CanonFlow) that ingests canonical rank as an extra state, achieving state-of-the-art results on QM9 and GEOM-DRUG with efficient few-step generation. The combination of canonical training, aligned priors, and near-Monge couplings provides a practical, scalable approach that improves generation quality while preserving invariance, offering a principled alternative to fully equivariant models. Overall, canonical diffusion with orbit-space thinking yields faster convergence, better sample quality, and strong empirical gains in 3D molecular generation.

Abstract

Many generative tasks in chemistry and science involve distributions invariant to group symmetries (e.g., permutation and rotation). A common strategy enforces invariance and equivariance through architectural constraints such as equivariant denoisers and invariant priors. In this paper, we challenge this tradition through the alternative canonicalization perspective: first map each sample to an orbit representative with a canonical pose or order, train an unconstrained (non-equivariant) diffusion or flow model on the canonical slice, and finally recover the invariant distribution by sampling a random symmetry transform at generation time. Building on a formal quotient-space perspective, our work provides a comprehensive theory of canonical diffusion by proving: (i) the correctness, universality and superior expressivity of canonical generative models over invariant targets; (ii) canonicalization accelerates training by removing diffusion score complexity induced by group mixtures and reducing conditional variance in flow matching. We then show that aligned priors and optimal transport act complementarily with canonicalization and further improve training efficiency. We instantiate the framework for molecular graph generation under $S_N \times SE(3)$ symmetries. By leveraging geometric spectra-based canonicalization and mild positional encodings, canonical diffusion significantly outperforms equivariant baselines in 3D molecule generation tasks, with similar or even less computation. Moreover, with a novel architecture Canon, CanonFlow achieves state-of-the-art performance on the challenging GEOM-DRUG dataset, and the advantage remains large in few-step generation.
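The last stage of the pipeline described in the abstract, applying a random symmetry transform to a sample generated on the canonical slice, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `random_rotation` and `haar_randomize` are hypothetical, coordinates are assumed to be already centered so that translations need no separate handling, and the rotation is drawn with the standard QR-of-a-Gaussian construction.

```python
import numpy as np

def random_rotation(rng):
    """Draw a rotation from the Haar measure on SO(3) via QR of a Gaussian matrix."""
    A = rng.standard_normal((3, 3))
    Q, R = np.linalg.qr(A)
    # Multiply each column by the sign of the matching diagonal entry of R,
    # which makes Q Haar-distributed on O(3).
    Q = Q * np.sign(np.diag(R))
    # Flip one column if needed so that det(Q) = +1 (a proper rotation).
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

def haar_randomize(coords, atom_types, rng):
    """Apply a random rotation and a random atom permutation to a canonical sample.

    coords: (N, 3) array of centered atom positions (assumption: already centered).
    atom_types: (N,) array of atom labels.
    """
    rotation = random_rotation(rng)
    perm = rng.permutation(coords.shape[0])
    return coords[perm] @ rotation.T, atom_types[perm]

# Usage: randomize a toy 5-atom "canonical" sample.
rng = np.random.default_rng(0)
canonical_coords = rng.standard_normal((5, 3))
canonical_types = np.array([6, 1, 1, 8, 7])
coords, types = haar_randomize(canonical_coords, canonical_types, rng)
```

The transform only rotates and relabels atoms, so pairwise distances and the multiset of atom types are unchanged.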

Paper Structure

This paper contains 77 sections, 20 theorems, 81 equations, 7 figures, 7 tables, 5 algorithms.

Key Result

Theorem 3.1

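Suppose Assumptions ass:free and ass:center hold. Let $\mu$ be any $\mathcal{G}$-invariant probability measure on $\mathcal{M}$. Let $\Psi$ be an orbit representative map defined $\mu$-a.s., and let $\nu=\Psi_\#\mu$ be the slice distribution on $S=\Psi(\mathcal{M})$. Then $\mu$ is the pushforward of $\lambda\otimes\nu$ under the action map $(g,\tilde{\mathbf{Z}})\mapsto g\cdot\tilde{\mathbf{Z}}$, where $\lambda$ denotes the Haar measure on $\mathcal{G}$. Equivalently, if $\tilde{\mathbf{Z}}\sim \nu$ and $g\sim\lambda$ are independent, then $g\cdot \tilde{\mathbf{Z}}\sim \mu$.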
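The statement of Theorem 3.1 can be checked exactly on a toy finite case. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: the group is $S_3$ acting on 3-tuples with distinct entries (so the action is free), the uniform distribution on $S_3$ plays the role of the Haar measure $\lambda$, sorting plays the role of $\Psi$, and the names `act`, `canonicalize`, `pushforward`, and `orbit_mass` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations

def act(g, z):
    """Action of a permutation g on a tuple z: output position i takes z[g[i]]."""
    return tuple(z[i] for i in g)

def canonicalize(z):
    """Orbit representative map Psi: sort the entries of the tuple."""
    return tuple(sorted(z))

def pushforward(measure, fn):
    """Push a finite measure {point: mass} forward through fn."""
    out = defaultdict(Fraction)
    for point, mass in measure.items():
        out[fn(point)] += mass
    return dict(out)

group = list(permutations(range(3)))   # all 6 elements of S_3
haar = Fraction(1, len(group))         # uniform (Haar) mass per group element

# Build an S_3-invariant measure mu by spreading each orbit's mass uniformly.
orbit_mass = {(1, 2, 3): Fraction(1, 4), (1, 2, 5): Fraction(3, 4)}
mu = {act(g, rep): mass * haar for rep, mass in orbit_mass.items() for g in group}

# Slice distribution nu = Psi_# mu.
nu = pushforward(mu, canonicalize)

# Rebuild mu as the law of g . Z with Z ~ nu and g ~ Haar, drawn independently.
rebuilt = defaultdict(Fraction)
for rep, mass in nu.items():
    for g in group:
        rebuilt[act(g, rep)] += haar * mass
mu_rebuilt = dict(rebuilt)

print(mu_rebuilt == mu)
```

Exact rational arithmetic is used so that the comparison between `mu` and `mu_rebuilt` involves no floating-point tolerance.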

Figures (7)

  • Figure 1: Motivation of Canonical Diffusion: Position-wise supervision induces mismatch loss under symmetries for equivalent molecules in diffusion models.
  • Figure 2: Overview of our canonicalized generation pipeline. (a) Canonicalization: map a molecule $Z$ to a slice representative $\tilde{Z}=\Psi_\phi(Z)$ under $\mathcal{G}=SO(3)\times S_N$, inducing $q_0=(\Psi_\phi)_\# p_0$. (b) Training: learn a diffusion/flow model on $\tilde{Z}_0\sim q_0$ with slice prior $q_1$ (optionally using an OT coupling for aligned pairings). (c) Sampling: generate on the slice (optionally with projected canonical sampling) and then apply $g\sim\mathrm{Haar}(\mathcal{G})$ to obtain an invariant sample $Z=g\cdot \hat{Z}_0$.
  • Figure 3: Training trajectories of our canonicalized model (visualization of the learned transport dynamics).
  • Figure 4: OT and canonicalization act complementarily. OT can reduce conditional variances with or without canonicalization, but OT solutions are generally non-unique in the presence of symmetry (if $\gamma$ is optimal then $(g,g)_\#\gamma$ is also optimal); averaging yields a diagonal-invariant optimum but may destroy Monge-ness. Canonicalization fixes a gauge (collapsing each orbit to a unique slice representative), eliminating group ambiguity and stabilizing OT on the slice.
  • Figure 5: Spectral canonicalization via the (signed) Fiedler vector produces stable, locality-preserving atom orderings for both highly symmetric and weakly symmetric molecules (blue: start, red: end), demonstrating robust performance across diverse symmetry regimes.
  • ...and 2 more figures
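
The spectral ordering mentioned in the Figure 5 caption can be sketched generically: build a graph Laplacian, take the eigenvector of its second-smallest eigenvalue (the Fiedler vector), fix its sign, and sort atoms by their entries. The code below is only a sketch under assumptions. The function name `fiedler_order` is hypothetical, the input is taken to be any symmetric nonnegative weight matrix of a connected graph, and the sign rule (make the largest-magnitude entry positive) is a placeholder, since the listing above does not state which sign convention or which Laplacian the paper uses.

```python
import numpy as np

def fiedler_order(weights):
    """Return an atom ordering obtained by sorting the entries of the Fiedler vector.

    weights: (N, N) symmetric nonnegative matrix of a connected graph
             (assumption: e.g. a bond adjacency matrix).
    """
    W = np.asarray(weights, dtype=float)
    laplacian = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order; column 1 is the Fiedler vector.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    # Placeholder sign convention: make the largest-magnitude entry positive.
    if fiedler[np.argmax(np.abs(fiedler))] < 0:
        fiedler = -fiedler
    return np.argsort(fiedler, kind="stable")

# Usage: a 4-node path graph whose nodes are labeled out of order (2 - 0 - 3 - 1).
path = [2, 0, 3, 1]
W = np.zeros((4, 4))
for a, b in zip(path[:-1], path[1:]):
    W[a, b] = W[b, a] = 1.0
order = fiedler_order(W).tolist()
```

For a path graph the Fiedler vector is monotone along the path, so the recovered ordering walks the path from one end to the other. Which end comes first depends on the sign rule, and that rule is itself ambiguous for graphs with a mirror symmetry such as this one.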

Theorems & Definitions (45)

  • Definition 2.1: Orbit and quotient
  • Definition 2.2: Invariance
  • Definition 2.3: Canonicalization map and slice
  • Remark 2.4: Stabilizers and non-uniqueness
  • Theorem 3.1: Factorization of invariant measures; known
  • Corollary 3.2: Sufficiency of slice modeling
  • Proposition 3.3: Universality of canonicalized parameterizations over invariant and equivariant targets
  • Remark 3.4
  • Theorem 3.5: Variance decomposition under group-aligned lift
  • Remark 3.6: Only non-equivariant models benefit from canonicalization
  • ...and 35 more