Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups
Yuchen Zhu, Tianrong Chen, Lingkai Kong, Evangelos A. Theodorou, Molei Tao
TL;DR
This work tackles diffusion-based generative modeling for data on Lie groups by introducing the Trivialized Diffusion Model (TDM), which uses left-trivialization to keep the momentum in the fixed Lie algebra $\mathfrak{g}$, so that the score is learned in a Euclidean space. The forward process on a Lie group $G$ follows $\dot{g}_t = T_eL_{g_t}\xi_t$ with $\mathrm{d}\xi_t = -\gamma(t)\xi_t\,\mathrm{d}t + \sqrt{2\gamma(t)}\,\mathrm{d}W_t^{\mathfrak{g}}$, and the time-reversed backward process uses the score $\nabla_{\xi}\log p_{T-t}(g_t,\xi_t)$, enabling efficient, projection-free sampling via an Operator Splitting Integrator (OSI). Training uses denoising score matching (DSM) or implicit score matching (ISM), with explicit conditional transitions available for Abelian groups such as $\mathsf{SO}(2)$ or the torus $\mathbb{T}$, and a Euclidean score network $s_{\theta}$ whose output lives in the fixed space $\mathfrak{g}$. Empirically, TDM achieves state-of-the-art results on protein/RNA torsion angles and challenging torus datasets (Pacman, checkerboard), and it scales to high-dimensional $\mathsf{SO}(n)$ and $\mathsf{U}(n)$ data, including quantum time-evolution operators. Code is publicly available. The approach advances scalable, high-fidelity generative modeling on manifolds by avoiding the projection errors (onto the tangent space and onto the manifold) that affect prior approaches.
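To make the forward dynamics concrete, the following is a minimal sketch on $\mathsf{SO}(3)$, not the authors' implementation. It assumes a constant $\gamma$, identifies $\mathfrak{so}(3)$ with $\mathbb{R}^3$ through the usual hat map, and uses a simple two-stage splitting: an exact group-exponential update for $g$ with $\xi$ frozen, followed by the exact Ornstein-Uhlenbeck transition for $\xi$. The function names (`hat`, `exp_so3`, `forward_step`, `simulate_forward`) and the default step size are hypothetical choices made for illustration.

```python
import numpy as np

def hat(xi):
    """Map a vector in R^3 to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = xi
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def exp_so3(xi):
    """Matrix exponential of hat(xi) via Rodrigues' formula."""
    theta = np.linalg.norm(xi)
    K = hat(xi)
    if theta < 1e-8:
        # Second-order Taylor expansion avoids division by a tiny angle.
        return np.eye(3) + K + 0.5 * K @ K
    a = np.sin(theta) / theta
    b = (1.0 - np.cos(theta)) / theta**2
    return np.eye(3) + a * K + b * K @ K

def forward_step(g, xi, h, gamma, rng):
    """One splitting step of the forward process (constant gamma is an assumption)."""
    # Position: exact flow of dg/dt = g hat(xi) with xi held fixed.
    # Right-multiplying by a rotation keeps g on SO(3) without any projection.
    g = g @ exp_so3(h * xi)
    # Momentum: exact transition of d xi = -gamma xi dt + sqrt(2 gamma) dW,
    # an Ornstein-Uhlenbeck process in the fixed vector space R^3 ~ so(3).
    decay = np.exp(-gamma * h)
    xi = decay * xi + np.sqrt(1.0 - decay**2) * rng.standard_normal(3)
    return g, xi

def simulate_forward(g0, n_steps=200, h=0.05, gamma=1.0, seed=0):
    """Run the forward process from g0 with a standard-normal initial momentum."""
    rng = np.random.default_rng(seed)
    g, xi = g0.copy(), rng.standard_normal(3)
    for _ in range(n_steps):
        g, xi = forward_step(g, xi, h, gamma, rng)
    return g, xi

g_T, xi_T = simulate_forward(np.eye(3))
print(np.linalg.norm(g_T.T @ g_T - np.eye(3)))  # orthogonality error of the final state
```

Because the momentum never leaves $\mathbb{R}^3$, the noise is injected with ordinary Gaussian samples, and the only manifold-specific operation is the group exponential. The backward process would add a learned score term to the momentum update, which this sketch omits.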
Abstract
The generative modeling of data on manifolds is an important task, for which diffusion models in flat spaces typically need nontrivial adaptations. This article demonstrates how a technique called 'trivialization' can transfer the effectiveness of diffusion models in Euclidean spaces to Lie groups. In particular, an auxiliary momentum variable is algorithmically introduced to help transport the position variable between the data distribution and a fixed, easy-to-sample distribution. Normally, this would create further difficulty for manifold data, because momentum lives in a space that changes with the position. Our trivialization technique, however, creates a new momentum variable that stays in a simple, fixed vector space. This design, together with a manifold-preserving integrator, simplifies implementation and avoids the inaccuracies introduced by approximations such as projections onto the tangent space and onto the manifold, which were typically used in prior work, and thereby facilitates generation with high fidelity and efficiency. The resulting method achieves state-of-the-art performance on protein and RNA torsion angle generation and on sophisticated torus datasets. We also, arguably for the first time, tackle the generation of data on high-dimensional Special Orthogonal and Unitary groups, the latter being essential for quantum problems. Code is available at https://github.com/yuchen-zhu-zyc/TDM.
