Sampling 3D Molecular Conformers with Diffusion Transformers
J. Thorben Frank, Winfried Ripken, Gregor Lied, Klaus-Robert Müller, Oliver T. Unke, Stefan Chmiela
TL;DR
This work introduces DiTMC, a diffusion-transformer framework for sampling molecular conformers conditioned on molecular graphs. It couples graph-aware conditioning tokens with multiple positional embedding schemes, including an SO(3)-equivariant PE(3), to model a velocity field $v^\theta(\boldsymbol{x},t,\boldsymbol{G})$ that guides SE(3)-invariant conformer generation. Empirical results on GEOM-QM9, GEOM-DRUGS, and GEOM-XL show state-of-the-art precision and physical validity, with equivariant variants offering higher fidelity at increased compute cost. Ablations reveal the importance of geodesic-based, all-pair conditioning and the impact of symmetry priors on sample quality and efficiency, suggesting scalable, symmetry-aware pathways for large-scale molecular conformer generation. The work highlights promising directions for fast equivariant attention and broader de-novo molecular design within diffusion-based frameworks.
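To make the role of the velocity field concrete, the sketch below integrates an ODE of the form $\mathrm{d}\boldsymbol{x}/\mathrm{d}t = v(\boldsymbol{x},t)$ from $t=0$ to $t=1$ with fixed-step Euler updates. It is a minimal illustration under stated assumptions, not the authors' sampler: `toy_velocity` is a hypothetical stand-in for the learned $v^\theta(\boldsymbol{x},t,\boldsymbol{G})$, and the target geometry, step count, and choice of Euler integration are all chosen here for illustration only.

```python
import numpy as np

def toy_velocity(x, t, target):
    # Hypothetical stand-in for a learned velocity field: the straight-line
    # velocity that moves x toward a fixed target geometry as t goes to 1.
    return (target - x) / (1.0 - t)

def euler_sample(velocity_fn, x0, n_steps=50):
    # Fixed-step Euler integration of dx/dt = velocity_fn(x, t) on [0, 1].
    x = x0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        x = x + dt * velocity_fn(x, t)
    return x

# Toy "conformer": three atoms with made-up 3D coordinates.
target = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [1.5, 0.9, 0.0]])

rng = np.random.default_rng(0)
x0 = rng.normal(size=target.shape)  # random starting coordinates
x1 = euler_sample(lambda x, t: toy_velocity(x, t, target), x0)
```

With this toy field the final Euler step lands on `target`, because the velocity always points straight at it. A trained network would take the place of `toy_velocity` and would additionally receive the molecular graph as conditioning.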
Abstract
Diffusion Transformers (DiTs) have demonstrated strong performance in generative modeling, particularly in image synthesis, making them a compelling choice for molecular conformer generation. However, applying DiTs to molecules introduces novel challenges, such as integrating discrete molecular graph information with continuous 3D geometry, handling Euclidean symmetries, and designing conditioning mechanisms that generalize across molecules of varying sizes and structures. We propose DiTMC, a framework that adapts DiTs to address these challenges through a modular architecture that separates the processing of 3D coordinates from conditioning on atomic connectivity. To this end, we introduce two complementary graph-based conditioning strategies that integrate seamlessly with the DiT architecture. These are combined with different attention mechanisms, including both standard non-equivariant and SO(3)-equivariant formulations, enabling flexible control over the trade-off between accuracy and computational efficiency. Experiments on standard conformer generation benchmarks (GEOM-QM9, -DRUGS, -XL) demonstrate that DiTMC achieves state-of-the-art precision and physical validity. Our results highlight how architectural choices and symmetry priors affect sample quality and efficiency, suggesting promising directions for large-scale generative modeling of molecular structures. Code is available at https://github.com/ML4MolSim/dit_mc.
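As an illustration of what an all-pair, graph-based conditioning signal could look like, the sketch below computes the geodesic distance between every pair of atoms, taken here to mean the number of bonds on the shortest path through the molecular graph. This reading of "geodesic", the function name `all_pair_geodesics`, and the use of breadth-first search are assumptions made for the example. How DiTMC turns such distances into conditioning tokens is not described in this section.

```python
from collections import deque

def all_pair_geodesics(n_atoms, bonds):
    # Build an undirected adjacency list from the bond list.
    adj = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)

    # dist[s][v] = number of bonds on the shortest path from s to v,
    # or -1 if v is not reachable from s.
    dist = [[-1] * n_atoms for _ in range(n_atoms)]
    for s in range(n_atoms):
        dist[s][s] = 0
        queue = deque([s])
        while queue:  # breadth-first search from atom s
            u = queue.popleft()
            for v in adj[u]:
                if dist[s][v] < 0:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist

# Heavy-atom graph of n-butane: C0-C1-C2-C3.
geodesics = all_pair_geodesics(4, [(0, 1), (1, 2), (2, 3)])
```

Here `geodesics[0][3]` is 3, since the two terminal carbons are three bonds apart. The resulting matrix is symmetric and depends only on connectivity, not on 3D coordinates, which is what makes it usable as conditioning that is independent of the conformer being generated.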
