Frame-based Equivariant Diffusion Models for 3D Molecular Generation
Mohan Guo, Cong Liu, Patrick Forré
TL;DR
Frame-based diffusion addresses the challenge of enforcing E(3) symmetry in molecular generation by decoupling symmetry handling from the backbone. It introduces three variants, Global Frame Diffusion (GFD), Local Frame Diffusion (LFD), and Invariant Frame Diffusion (IFD), and combines them with EdgeDiT backbones to achieve deterministic equivariance while preserving global geometry. On QM9, GFD with EdgeDiT achieves state-of-the-art negative log-likelihood and high atom and molecular stability, and samples nearly 2x faster than EDM. LFD benefits from a frame-alignment constraint, while IFD sacrifices some diversity but gains efficiency. The work establishes frame-based diffusion as scalable and physically grounded, and highlights the importance of global structure preservation for effective molecular learning.
Abstract
Recent methods for molecular generation face a trade-off: they either enforce strict equivariance with costly architectures or relax it to gain scalability and flexibility. We propose a frame-based diffusion paradigm that achieves deterministic E(3)-equivariance while decoupling symmetry handling from the backbone. Building on this paradigm, we investigate three variants: Global Frame Diffusion (GFD), which assigns a shared molecular frame; Local Frame Diffusion (LFD), which constructs node-specific frames and benefits from additional alignment constraints; and Invariant Frame Diffusion (IFD), which relies on pre-canonicalized invariant representations. To enhance expressivity, we further utilize EdgeDiT, a Diffusion Transformer with edge-aware attention. On the QM9 dataset, GFD with EdgeDiT achieves state-of-the-art performance, with a test negative log-likelihood (NLL) of -137.97 at standard scale and -141.85 at double scale, alongside atom stability of 98.98% and molecular stability of 90.51%. These results surpass all equivariant baselines while maintaining high validity and uniqueness, and sampling is nearly 2x faster than with EDM. Altogether, our study establishes frame-based diffusion as a scalable, flexible, and physically grounded paradigm for molecular generation, highlighting the critical role of global structure preservation.
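To make the idea of decoupling symmetry handling from the backbone concrete, the following is a minimal sketch of a global-frame wrapper. It is an illustration under stated assumptions, not the construction used in this work: the frame is built here from the principal axes of the centered coordinates, with signs fixed by the third moment of the projections, and the names `global_frame` and `frame_wrapped` are hypothetical. The sketch also assumes distinct principal moments and non-zero skewness along each axis, which holds for generic point clouds but not for highly symmetric molecules.

```python
import numpy as np

def global_frame(x):
    """Build one shared frame for a point cloud x of shape (N, 3).

    Assumption: PCA axes with skewness-based sign fixing. This is an
    illustrative choice, not necessarily the frame used in the paper.
    """
    c = x.mean(axis=0)                 # centroid removes translations
    xc = x - c
    _, v = np.linalg.eigh(xc.T @ xc)   # columns of v are principal axes
    proj = xc @ v                      # coordinates along each axis
    s = np.sign((proj ** 3).sum(axis=0))
    s[s == 0] = 1.0                    # fallback for a degenerate sign
    return c, v * s                    # flip each axis to positive skewness

def frame_wrapped(backbone, x):
    """Run an arbitrary (non-equivariant) backbone in the global frame.

    The backbone sees invariant coordinates; its vector-valued output
    (e.g. a predicted noise) is rotated back to the input orientation.
    """
    c, R = global_frame(x)
    x_inv = (x - c) @ R                # invariant under rotations, reflections, translations
    out_inv = backbone(x_inv)          # any network, no symmetry constraints
    return out_inv @ R.T               # map the output back: equivariant
```

With this wrapper, applying an orthogonal matrix `Q` and a translation `t` to the input (`x @ Q.T + t`) changes the output only by the same `Q`, regardless of how the backbone is built. This is the sense in which the equivariance is deterministic rather than learned or obtained by averaging over sampled frames.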
