Stiefel Flow Matching for Moment-Constrained Structure Elucidation
Austin Cheng, Alston Lo, Kin Long Kelvin Lee, Santiago Miret, Alán Aspuru-Guzik
TL;DR
Stiefel Flow Matching addresses the problem of structure elucidation from molecular formula and exact moments of inertia by embedding the moment-constrained feasible space into the Stiefel manifold $\mathrm{St}(n,4)$ and learning a Riemannian flow that exactly preserves these constraints. The method leverages reflection and permutation equivariance and introduces an equivariant optimal-transport (OT) objective to yield shorter generation paths. Empirically, Stiefel Flow Matching achieves higher success rates and lower sampling costs than Euclidean diffusion baselines on QM9 and GEOM, with the OT objective offering additional gains in trajectory efficiency. The work demonstrates the practical potential of exact manifold constraints for robust 3D molecular structure generation and points to future directions in alternative Riemannian paths, discrete conditioning, and real-world conditioning data.
Abstract
Molecular structure elucidation is a fundamental step in understanding chemical phenomena, with applications in identifying molecules in natural products, lab syntheses, forensic samples, and the interstellar medium. We consider the task of predicting a molecule's all-atom 3D structure given only its molecular formula and moments of inertia, motivated by the ability of rotational spectroscopy to measure these moments. While existing generative models can conditionally sample 3D structures with approximately correct moments, this soft conditioning fails to leverage the many digits of precision afforded by experimental rotational spectroscopy. To address this, we first show that the space of $n$-atom point clouds with a fixed set of moments of inertia is embedded in the Stiefel manifold $\mathrm{St}(n, 4)$. We then propose Stiefel Flow Matching as a generative model for elucidating 3D structure under exact moment constraints. Additionally, we learn simpler and shorter flows by finding approximate solutions for equivariant optimal transport on the Stiefel manifold. Empirically, enforcing exact moment constraints allows Stiefel Flow Matching to achieve higher success rates and faster sampling than Euclidean diffusion models, even on high-dimensional manifolds corresponding to large molecules in the GEOM dataset.
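The embedding claim can be checked numerically. The sketch below is not the authors' implementation. It assumes the standard mass-weighted construction: for a structure centered at its center of mass and aligned to its principal axes, the columns of $\sqrt{m_i}\,x_i / \sqrt{P}$ are orthonormal and orthogonal to $\sqrt{m_i / M}$, where $P$ are the planar moments and $M$ is the total mass. All function names, the masses, and the moment values are hypothetical, and the column ordering of the $n \times 4$ matrix is an arbitrary choice made here.

```python
import numpy as np

def planar_from_inertia(moments):
    """Convert principal moments (I_x, I_y, I_z) to planar moments (P_x, P_y, P_z).

    Uses I_x = P_y + P_z (and cyclic permutations), so P = sum(I)/2 - I.
    """
    moments = np.asarray(moments, dtype=float)
    return moments.sum() / 2.0 - moments

def random_feasible_point(masses, rng):
    """Sample an n x 4 matrix with orthonormal columns whose first column is sqrt(m/M).

    Assumption of this sketch: the fixed column is placed first.
    """
    n = len(masses)
    a = np.sqrt(masses / masses.sum())  # unit vector determined by the masses
    # Reduced QR of [a | random] yields orthonormal columns; the first is +a or -a.
    Q, _ = np.linalg.qr(np.column_stack([a, rng.standard_normal((n, 3))]))
    Q[:, 0] = a  # fix the sign; the other columns are already orthogonal to a
    return Q

def stiefel_to_coords(U, masses, planar_moments):
    """Map the three free columns of U back to Cartesian coordinates (n x 3)."""
    return U[:, 1:] * np.sqrt(planar_moments) / np.sqrt(masses)[:, None]

def inertia_tensor(X, masses):
    """Inertia tensor sum_i m_i (|x_i|^2 I - x_i x_i^T) about the origin."""
    r2 = (X ** 2).sum(axis=1)
    return (masses @ r2) * np.eye(3) - (X * masses[:, None]).T @ X

rng = np.random.default_rng(0)
# Hypothetical 7-atom example: masses in amu, principal moments in amu * Angstrom^2.
masses = np.array([12.0, 12.0, 16.0, 1.008, 1.008, 1.008, 1.008])
moments = np.array([20.0, 50.0, 60.0])

P = planar_from_inertia(moments)
U = random_feasible_point(masses, rng)
X = stiefel_to_coords(U, masses, P)

# Largest deviations from the three constraints; each should be near machine precision.
orthonormality_error = np.abs(U.T @ U - np.eye(4)).max()
center_of_mass_error = np.abs(masses @ X).max()
moment_error = np.abs(inertia_tensor(X, masses) - np.diag(moments)).max()
```

Under these assumptions, any such matrix maps to a point cloud whose center of mass is at the origin and whose inertia tensor is diagonal with exactly the prescribed moments, up to floating-point error. This is the sense in which a flow confined to the manifold preserves the moment constraints by construction. A random feasible point is not a chemically plausible structure; selecting plausible structures is the job of the learned flow.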
