Efficient Diffusion Models for Symmetric Manifolds
Oren Mangoubi, Neil He, Nisheeth K. Vishnoi
TL;DR
This work tackles the high computational cost of diffusion models on non-Euclidean symmetric manifolds by introducing a projection-based forward diffusion with a spatially varying covariance that bypasses the heat kernel. By applying Itô's lemma, the authors derive a tractable training objective that requires only $O(1)$ gradient evaluations per step, with per-iteration arithmetic that scales near-linearly in the manifold dimension $d$: $O(d^{1.19})$ for $\text{SO}(n)$ and $\text{U}(n)$, and $O(d)$ for the torus and sphere. They prove an average-case Lipschitz condition under manifold symmetries to obtain sampling guarantees with polynomial-in-$d$ accuracy and runtime, using an optimal transport-based coupling rather than Girsanov's transformation. Empirically, the method delivers substantial speedups and improved sample quality on synthetic datasets over $\text{SO}(n)$, $\text{U}(n)$, the torus, and the sphere, narrowing the gap to Euclidean diffusion models in both efficiency and performance.
Abstract
We introduce a framework for designing efficient diffusion models for $d$-dimensional symmetric-space Riemannian manifolds, including the torus, sphere, special orthogonal group, and unitary group. Existing manifold diffusion models often depend on heat kernels, which lack closed-form expressions and require either $d$ gradient evaluations or exponential-in-$d$ arithmetic operations per training step. We introduce a new diffusion model for symmetric manifolds with a spatially varying covariance, allowing us to leverage a projection of Euclidean Brownian motion to bypass heat kernel computations. Our training algorithm minimizes a novel efficient objective derived via Itô's lemma, allowing each step to run in $O(1)$ gradient evaluations and nearly-linear-in-$d$ ($O(d^{1.19})$) arithmetic operations, reducing the gap between diffusions on symmetric manifolds and Euclidean space. Manifold symmetries ensure the diffusion satisfies an "average-case" Lipschitz condition, enabling accurate and efficient sample generation. Empirically, our model outperforms prior methods in training speed and improves sample quality on synthetic datasets on the torus, special orthogonal group, and unitary group.
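To make the projection idea concrete, the following is a minimal NumPy sketch of forward noising on the sphere and the torus: Euclidean Brownian motion is sampled in closed form as a Gaussian, then mapped onto the manifold, so no heat kernel is evaluated. The projection maps used here (normalization for the sphere, reduction modulo $2\pi$ for the torus) and the names `project_sphere`, `project_torus`, and `forward_sample` are illustrative assumptions, not the authors' exact construction; the sketch also omits the training objective and the treatment of $\text{SO}(n)$ and $\text{U}(n)$.

```python
import numpy as np

def project_sphere(x):
    """Map nonzero ambient points in R^(d+1) onto the unit sphere S^d (assumed projection)."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def project_torus(x):
    """Map points in R^d to angles on the torus by reducing modulo 2*pi (assumed projection)."""
    return np.mod(x, 2.0 * np.pi)

def forward_sample(x0, t, project, rng):
    """Hypothetical helper: run Euclidean Brownian motion for time t from x0, then project.

    Brownian motion at time t is Gaussian with variance t per coordinate,
    so it can be sampled directly without simulating a path.
    """
    z = x0 + np.sqrt(t) * rng.standard_normal(x0.shape)
    return project(z)

rng = np.random.default_rng(0)

# Four starting points on S^2 (embedded in R^3), noised for time t = 0.5.
x0_sphere = project_sphere(rng.standard_normal((4, 3)))
xt_sphere = forward_sample(x0_sphere, 0.5, project_sphere, rng)

# Four starting points on the 2-torus (angle coordinates), noised for time t = 0.5.
x0_torus = project_torus(rng.standard_normal((4, 2)))
xt_torus = forward_sample(x0_torus, 0.5, project_torus, rng)
```

Each forward sample costs one Gaussian draw plus one projection, which is linear in the number of coordinates for these two manifolds and is consistent with the $O(d)$ per-iteration arithmetic quoted for the torus and sphere.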
