Torsional Diffusion for Molecular Conformer Generation

Bowen Jing, Gabriele Corso, Jeffrey Chang, Regina Barzilay, Tommi Jaakkola

TL;DR

The paper introduces torsional diffusion, a diffusion model restricted to torsion angles on a hypertorus, paired with an extrinsic-to-intrinsic SE(3)-equivariant score network to generate molecular conformers. By operating on the torsional degrees of freedom and preserving fixed non-torsional structure, it achieves state-of-the-art ensemble quality with far fewer denoising steps than Euclidean diffusion methods, and provides exact likelihoods enabling Boltzmann sampling across unseen molecules. The approach yields a generalizable torsional Boltzmann generator and demonstrates strong performance on GEOM-DRUGS and related datasets, highlighting practical impact for fast, accurate conformer generation and energy-aware sampling. Limitations include reliance on local-structure priors for rings and challenges around E/Z isomerism, with future work aimed at relaxing rigid local structures and extending to larger systems such as macrocycles and proteins.

Abstract

Molecular conformer generation is a fundamental task in computational chemistry. Several machine learning approaches have been developed, but none have outperformed state-of-the-art cheminformatics methods. We propose torsional diffusion, a novel diffusion framework that operates on the space of torsion angles via a diffusion process on the hypertorus and an extrinsic-to-intrinsic score model. On a standard benchmark of drug-like molecules, torsional diffusion generates superior conformer ensembles compared to machine learning and cheminformatics methods in terms of both RMSD and chemical properties, and is orders of magnitude faster than previous diffusion-based models. Moreover, our model provides exact likelihoods, which we employ to build the first generalizable Boltzmann generator. Code is available at https://github.com/gcorso/torsional-diffusion.
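
As a rough illustration of what a diffusion process on the hypertorus involves, the sketch below perturbs a vector of torsion angles with Gaussian noise and wraps the result back onto the circle. The wrapped-normal kernel, the truncated sum over windings, and all function names are assumptions made for this example; the abstract does not specify them.

```python
import math
import random


def wrap(angle):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def perturb_torsions(torsions, sigma, rng):
    """Add independent Gaussian noise to each torsion angle, then wrap.

    Assumption: each torsion diffuses independently on its own circle,
    so the joint state lives on a product of circles (a hypertorus).
    """
    return [wrap(t + rng.gauss(0.0, sigma)) for t in torsions]


def wrapped_normal_logpdf(delta, sigma, n_terms=10):
    """Log-density of a wrapped normal at angular offset `delta`.

    The infinite sum over windings 2*pi*k is truncated to |k| <= n_terms.
    """
    total = 0.0
    for k in range(-n_terms, n_terms + 1):
        shifted = delta + 2 * math.pi * k
        total += math.exp(-(shifted ** 2) / (2 * sigma ** 2))
    return math.log(total) - math.log(sigma * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    rng = random.Random(0)
    noisy = perturb_torsions([0.0, 1.0, -2.5], sigma=0.5, rng=rng)
    print(noisy)
```

Because the density is periodic, an offset and the same offset shifted by a full turn receive the same value, which is the property that distinguishes this kernel from an ordinary Gaussian on the real line.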

Paper Structure (61 sections, 7 theorems, 36 equations, 10 figures, 9 tables, 4 algorithms)


Key Result

Proposition 1

Let $(b_i, c_i)$ be a rotatable bond, let $\mathbf{x}_{\mathcal{V}(b_i)}$ be the positions of atoms on the $b_i$ side of the molecule, and let $R(\boldsymbol{\theta}, x_{c_i}) \in SE(3)$ be the rotation by Euler vector $\boldsymbol{\theta}$ about $x_{c_i}$. Then for $C, C' \in \mathcal{C}_G$, if [...] where $\mathbf{\hat{r}}_{b_ic_i} = (x_{c_i} - x_{b_i})/\|x_{c_i}-x_{b_i}\|$.
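
The objects defined in Proposition 1 (a rotation about $x_{c_i}$ whose axis is the unit bond vector $\mathbf{\hat{r}}_{b_ic_i}$) suggest that a torsion change can be realized by rigidly rotating one side of the molecule about the bond. The following is a minimal sketch of that operation using Rodrigues' rotation formula. Holding the $b_i$ side fixed while rotating the $c_i$ side, the function names, and the four-atom test geometry are assumptions of this example, not the paper's implementation.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def rotate_about(point, center, axis, angle):
    """Rotate `point` by `angle` about the line through `center` along unit `axis`."""
    v = sub(point, center)
    c, s = math.cos(angle), math.sin(angle)
    kxv = cross(axis, v)
    kdv = dot(axis, v)
    # Rodrigues' formula: v*cos + (k x v)*sin + k*(k.v)*(1 - cos)
    rotated = tuple(v[i] * c + kxv[i] * s + axis[i] * kdv * (1 - c) for i in range(3))
    return add(center, rotated)


def apply_torsion_update(coords, b, c, c_side, delta):
    """Rotate the atoms listed in `c_side` by `delta` about the bond b -> c.

    Atoms not listed in `c_side` (the b side) keep their positions.
    """
    bond = sub(coords[c], coords[b])
    axis = tuple(x / norm(bond) for x in bond)  # unit vector r_hat
    new_coords = list(coords)
    for i in c_side:
        new_coords[i] = rotate_about(coords[i], coords[c], axis, delta)
    return new_coords


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four points."""
    b0, b1, b2 = sub(p0, p1), sub(p2, p1), sub(p3, p2)
    b1 = tuple(x / norm(b1) for x in b1)
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))  # b0 projected off the bond
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))  # b2 projected off the bond
    return math.atan2(dot(cross(b1, v), w), dot(v, w))


if __name__ == "__main__":
    # Hypothetical four-atom chain; atoms 1 and 2 form the rotatable bond.
    coords = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.5), (1.0, 0.0, 1.5)]
    updated = apply_torsion_update(coords, b=1, c=2, c_side=[2, 3], delta=0.7)
    print(dihedral(*coords), dihedral(*updated))
```

The update leaves all bond lengths and bond angles unchanged, since each side of the molecule moves as a rigid body.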

Figures (10)

  • Figure 1: Overview of torsional diffusion. Left: Extrinsic and intrinsic views of torsional diffusion (only 2 dimensions/bonds shown). Right: In a step of reverse diffusion (A), the current conformer is provided as a 3D structure (B) to the score model, which predicts intrinsic torsional updates (C). The final layer of the score model is constructed to resemble a torque computation around each bond (D). $Y$ refers to the spherical harmonics and $V_b$ the learned atomic embeddings.
  • Figure 2: A: The torsion $\tau$ around a bond depends on a choice of neighbors. B: The change $\Delta\tau$ caused by a relative rotation is the same for all choices. C: The sign of $\Delta\tau$ is unambiguous because given the same neighbors, $\tau$ does not depend on bond direction.
  • Figure 3: Mean coverage for recall (left) and precision (right) when varying the threshold value $\delta$ on GEOM-DRUGS.
  • Figure 4: Overview of the architecture and visual intuition of the pseudotorque layer.
  • Figure 5: Histogram of the errors in 15000 predicted bond lengths and angles from randomly sampled molecules in GEOM-DRUGS and GEOM-QM9.
  • ...and 5 more figures
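
The coverage curves in Figure 3 depend on an RMSD threshold $\delta$. The sketch below shows one way such recall and precision coverage values could be computed from a precomputed RMSD matrix. The definition used here (a conformer counts as covered if its nearest counterpart in the other ensemble lies within $\delta$), the function name, and the example matrix are assumptions for illustration and are not taken from this page.

```python
def coverage(rmsd, delta):
    """Return (recall, precision) coverage at RMSD threshold `delta`.

    rmsd[i][j] is the RMSD between reference conformer i and
    generated conformer j (hypothetical precomputed input).
    """
    n_ref = len(rmsd)
    n_gen = len(rmsd[0])
    # Recall: fraction of reference conformers matched by some generated one.
    covered_refs = sum(1 for i in range(n_ref) if min(rmsd[i]) < delta)
    # Precision: fraction of generated conformers close to some reference one.
    covered_gens = sum(
        1 for j in range(n_gen)
        if min(rmsd[i][j] for i in range(n_ref)) < delta
    )
    return covered_refs / n_ref, covered_gens / n_gen


if __name__ == "__main__":
    # Made-up RMSD values (2 reference x 3 generated conformers).
    example = [[0.2, 1.5, 0.9],
               [1.1, 1.4, 2.0]]
    for delta in (0.75, 1.2):
        print(delta, coverage(example, delta))
```

Sweeping `delta` over a range of values and averaging over molecules would produce curves of the kind the caption describes.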

Theorems & Definitions (10)

  • Proposition 1
  • Proposition 2
  • Proposition 2
  • Proposition 3
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof