Diffusion-Driven Generative Framework for Molecular Conformation Prediction
Bobin Yang, Jie Deng, Zhenghan Chen, Ruoxue Wu
TL;DR
This work tackles 3D molecular conformation generation from 2D graph representations by introducing DDGF, a diffusion-driven, end-to-end framework that jointly learns distance prediction and coordinate reconstruction under a bilevel optimization scheme. The method employs a conditional variational autoencoder (CVAE) with a diffusion forward process and a learnable reverse process conditioned on the molecular graph, and it ensures roto-translational invariance through an equivariant architecture and distance-geometry-based reconstruction. Empirical results on GEOM-QM9 and GEOM-Drugs show that DDGF achieves state-of-the-art coverage and accuracy in conformation generation and delivers competitive property predictions, with notable gains when combined with force-field refinement. Overall, DDGF unifies diffusion, equivariant graph learning, and bilevel optimization to produce diverse, physically plausible conformations at scale, with direct implications for drug discovery and materials design.
Abstract
Deducing three-dimensional molecular conformations from two-dimensional graph representations is a central problem in computational chemistry and pharmaceutical development. Advances in machine learning, particularly deep generative networks, have substantially improved the accuracy of predictive modeling for this task. Traditional approaches often adopt a two-step strategy: first estimating interatomic distances, then recovering the spatial structure by solving a distance geometry problem. However, this sequential approach can fail to capture local atomic arrangements accurately, which compromises the fidelity of the resulting structures. To address these limitations, this work introduces a generative framework named DDGF, grounded in the principles of diffusion from classical non-equilibrium thermodynamics. DDGF treats atoms as discrete entities and learns to reverse the diffusion process, transforming a distribution of stochastic noise back into coherent molecular structures through a process akin to a Markov chain. The transformation starts from a representation of the molecular graph in a latent space and ends with three-dimensional structures obtained via a bilevel optimization scheme tailored to the task. A key challenge is preserving roto-translational invariance, so that generated conformations remain physically consistent under rotation and translation. Extensive experiments confirm the efficacy of DDGF in comparison with state-of-the-art methods.
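The two ideas the abstract leans on, roto-translational invariance of interatomic distances and coordinate recovery from distances, can be illustrated with a minimal NumPy sketch. This is not the DDGF implementation: the reconstruction step uses classical multidimensional scaling as a simple stand-in for a distance geometry solver, and the function names `pairwise_distances`, `random_rotation`, and `coords_from_distances` are hypothetical names chosen for illustration.

```python
import numpy as np


def pairwise_distances(coords):
    """Return the (n, n) matrix of Euclidean distances between atoms."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def random_rotation(rng):
    """Draw a 3x3 orthogonal matrix with determinant +1 (a proper rotation)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def coords_from_distances(dist, dim=3):
    """Recover coordinates from exact distances via classical MDS.

    Assumption: this stands in for the distance geometry step; the
    result is only defined up to rotation, reflection, and translation.
    """
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))  # toy "molecule" with 5 atoms

# Apply a random rigid motion: rotate, then translate.
moved = coords @ random_rotation(rng).T + rng.normal(size=(1, 3))

dist = pairwise_distances(coords)
invariant = np.allclose(dist, pairwise_distances(moved))

# Rebuild coordinates from distances alone and compare distance matrices.
recon = coords_from_distances(dist)
consistent = np.allclose(dist, pairwise_distances(recon), atol=1e-6)
```

Because the distance matrix is unchanged by the rigid motion, a model that predicts distances is invariant by construction. The reconstructed coordinates reproduce the distances but not the original orientation, which is why reconstructions are compared through distances or after alignment.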
