SemlaFlow -- Efficient 3D Molecular Generation with Latent Attention and Equivariant Flow Matching
Ross Irwin, Alessandro Tibo, Jon Paul Janet, Simon Olsson
TL;DR
The paper addresses two bottlenecks in unconditional 3D molecular generation: slow sampling and limited chemical validity. It introduces Semla, a scalable $E(3)$-equivariant message passing architecture that uses latent attention, and SemlaFlow, a flow-matching model that jointly generates molecular topology (atom types, bond types and formal charges) and 3D coordinates. SemlaFlow achieves state-of-the-art results on benchmark datasets with as few as 20 sampling steps, a two-orders-of-magnitude speedup over prior state-of-the-art models. The authors also highlight gaps in current evaluation methods for 3D generation and propose energy-based benchmark metrics to better assess conformer quality. Overall, the work shows that combining latent attention with equivariant flow matching yields a practical, scalable molecular generator with potential relevance to downstream drug design tasks.
Abstract
Methods for jointly generating molecular graphs along with their 3D conformations have gained prominence recently due to their potential impact on structure-based drug design. Current approaches, however, often suffer from very slow sampling times or generate molecules with poor chemical validity. Addressing these limitations, we propose Semla, a scalable $E(3)$-equivariant message passing architecture. We further introduce an unconditional 3D molecular generation model, SemlaFlow, which is trained using equivariant flow matching to generate a joint distribution over atom types, coordinates, bond types and formal charges. Our model produces state-of-the-art results on benchmark datasets with as few as 20 sampling steps, corresponding to a two-orders-of-magnitude speedup compared to the state of the art. Furthermore, we highlight limitations of current evaluation methods for 3D generation and propose new benchmark metrics for unconditional molecular generators. Finally, using these new metrics, we compare our model's ability to generate high-quality samples against current approaches and further demonstrate SemlaFlow's strong performance.
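To make the "20 sampling steps" claim concrete, the following is a minimal sketch of fixed-step Euler sampling for a flow-matching model on atom coordinates. It is not the authors' implementation: it assumes a linear interpolation path between noise and data, covers continuous coordinates only (atom types, bond types and formal charges are omitted), and replaces the trained network with a toy oracle. The names `euler_sample`, `predict_x1` and `oracle` are hypothetical.

```python
import numpy as np


def euler_sample(predict_x1, x0, num_steps=20):
    """Integrate dx/dt = (x1_hat - x) / (1 - t) from t=0 to t=1 with Euler steps.

    predict_x1: callable (x, t) -> predicted clean coordinates (stand-in for a network).
    x0:         noise coordinates, shape (num_atoms, 3).
    """
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt                             # t stays strictly below 1
        x1_hat = predict_x1(x, t)              # model's guess of the final coordinates
        velocity = (x1_hat - x) / (1.0 - t)    # vector field of a linear noise-to-data path
        x = x + dt * velocity
    return x


rng = np.random.default_rng(0)

# Toy "molecule" and noise, both shifted to zero centre of mass (an assumption
# commonly made so that the process is translation invariant).
target = rng.normal(size=(5, 3))
target -= target.mean(axis=0)
noise = rng.normal(size=(5, 3))
noise -= noise.mean(axis=0)

# Oracle that always predicts the true coordinates; a trained model would go here.
oracle = lambda x, t: target

sample = euler_sample(oracle, noise, num_steps=20)
```

With a perfect predictor, the final Euler step lands on the target, so the number of steps matters only to the extent that the learned predictor is imperfect. Each step costs one network evaluation, which is why a low step count translates directly into faster sampling.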
