Fractional Denoising for 3D Molecular Pre-training

Shikun Feng, Yuyan Ni, Yanyan Lan, Zhi-Ming Ma, Wei-Ying Ma

TL;DR

The paper addresses two limitations of coordinate denoising for 3D molecular pre-training: low sampling coverage and an isotropic force field. It introduces a hybrid noise strategy that perturbs both dihedral angles and coordinates, coupled with a fractional denoising objective (Frad) that denoises only the coordinate component, thereby enabling the model to learn an anisotropic force field. Theoretical analysis proves that Frad is equivalent to learning an anisotropic force field, and empirical results on QM9 and MD17 establish state-of-the-art performance, with ablations validating the contributions of chemical constraints and task decoupling. The approach yields more robust molecular representations for downstream tasks and opens avenues for broader applications in force-field learning and hybrid denoising methods.

Abstract

Coordinate denoising is a promising 3D molecular pre-training method that has achieved remarkable performance in various downstream drug discovery tasks. Theoretically, the objective is equivalent to learning the force field, which has been shown to help downstream tasks. Nevertheless, there are two challenges for coordinate denoising to learn an effective force field, namely low sample coverage and an isotropic force field. The underlying reason is that the molecular distributions assumed by existing denoising methods fail to capture the anisotropic characteristics of molecules. To tackle these challenges, we propose a novel hybrid noise strategy that adds noise to both dihedral angles and coordinates. However, denoising such hybrid noise in the traditional way is no longer equivalent to learning the force field. Through theoretical deductions, we find that the problem is caused by the dependence of the covariance on the input conformation. To this end, we propose to decouple the two types of noise and design a novel fractional denoising method (Frad), which denoises only the latter, coordinate part. In this way, Frad enjoys both the merit of sampling more low-energy structures and the force field equivalence. Extensive experiments show the effectiveness of Frad in molecular representation, with a new state-of-the-art on 9 out of 12 tasks of QM9 and on 7 out of 8 targets of MD17.
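
The hybrid noise and the fractional target described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the bond description format `(j, k, moving_atoms)`, the function names, the toy four-atom chain, and the noise scales `sigma` and `tau` are all hypothetical choices made for the example. Dihedral noise is applied first by rotating one side of each rotatable bond, coordinate noise is added afterwards, and only the coordinate noise is kept as the regression target.

```python
import numpy as np

def rotate_about_axis(points, origin, axis, angle):
    """Rotate points about the line through `origin` along `axis` (Rodrigues' formula)."""
    k = axis / np.linalg.norm(axis)
    p = points - origin
    rotated = (p * np.cos(angle)
               + np.cross(k, p) * np.sin(angle)
               + np.outer(p @ k, k) * (1.0 - np.cos(angle)))
    return rotated + origin

def hybrid_noise_sample(x_eq, bonds, sigma, tau, rng):
    """Return (noisy input, coordinate-noise target) for one conformation.

    bonds: list of (j, k, moving_atoms); atoms in `moving_atoms` rotate about
    the bond j-k. This format is an assumption made for the sketch.
    """
    x_a = x_eq.copy()
    for j, k, moving in bonds:
        dpsi = rng.normal(0.0, sigma)              # dihedral angle noise
        x_a[moving] = rotate_about_axis(x_a[moving], x_a[j], x_a[k] - x_a[j], dpsi)
    coord_noise = rng.normal(0.0, tau, size=x_eq.shape)  # coordinate noise
    return x_a + coord_noise, coord_noise

def frad_loss(pred, coord_noise):
    """Mean squared error against the coordinate noise only (the 'fraction')."""
    return float(np.mean((pred - coord_noise) ** 2))

# Toy chain of four atoms with one rotatable bond between atoms 1 and 2.
rng = np.random.default_rng(0)
x_eq = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0], [3.5, 1.4, 0.0]])
bonds = [(1, 2, [3])]
x_tilde, target = hybrid_noise_sample(x_eq, bonds, sigma=0.5, tau=0.04, rng=rng)
print(frad_loss(np.zeros_like(target), target))  # loss of a model that predicts zero noise
```

In a real pipeline the prediction would come from the GNN applied to `x_tilde`; the dihedral perturbation changes the input but never appears in the target.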

Paper Structure

This paper contains 33 sections, 12 theorems, 25 equations, 4 figures, 11 tables, 3 algorithms.

Key Result

Proposition 3.1

Consider adding dihedral angle noise $\Delta\psi\in [0,2\pi)^m$ on the input structure $x_i$. The corresponding coordinate change $\Delta x=x_a-x_i\in \mathbb{R}^{3N}$ is approximately linear with respect to the dihedral angle noise, when the scale of the dihedral angle noise is small. where $C$ is a $3N\times m$ matrix that is dependent on the input conformation, $\{D_j, j=1\cdots m\}$ are const
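
The approximate linearity can be checked numerically for the simplest case of a single rotatable bond. The sketch below is an illustration under assumptions, not the paper's construction of $C$: the toy coordinates are made up, and the single column used here, the cross product of the unit bond axis with the atom's offset from the axis, is the standard first-order term of a rotation. If the coordinate change is linear to first order, the residual should shrink roughly quadratically as the angle noise shrinks.

```python
import numpy as np

def rotate_about_axis(point, origin, axis, angle):
    """Rotate one point about the line through `origin` along `axis`."""
    k = axis / np.linalg.norm(axis)
    p = point - origin
    return (origin
            + p * np.cos(angle)
            + np.cross(k, p) * np.sin(angle)
            + k * np.dot(k, p) * (1.0 - np.cos(angle)))

def linear_column(point, origin, axis):
    """First-order displacement per unit angle: unit axis cross offset."""
    k = axis / np.linalg.norm(axis)
    return np.cross(k, point - origin)

# Hypothetical toy geometry: one atom rotating about one bond.
origin = np.array([1.5, 0.0, 0.0])   # atom at one end of the bond
axis = np.array([0.5, 1.4, 0.0])     # bond direction
atom = np.array([3.5, 1.4, 0.0])     # atom moved by the dihedral change
c = linear_column(atom, origin, axis)

def approx_error(dpsi):
    """Norm of (exact coordinate change) minus (linear prediction c * dpsi)."""
    exact = rotate_about_axis(atom, origin, axis, dpsi) - atom
    return float(np.linalg.norm(exact - c * dpsi))

for dpsi in (0.01, 0.02, 0.04):
    print(dpsi, approx_error(dpsi))
```

Doubling the angle should multiply the printed residual by about four, which is the behaviour expected when the leading error term is second order in the angle noise.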

Figures (4)

  • Figure 1: An illustration of the anisotropy of molecular structures. In low-energy conformations of aspirin, the benzene ring and the carbon-oxygen double bonds are almost fixed, while some single bonds can rotate flexibly.
  • Figure 2: An overview of our method Frad. a: During pre-training, the hybrid noise, combining dihedral angle noise and coordinate noise, is applied to the equilibrium conformation. b: The GNN is trained to predict the coordinate noise, which is a fraction of the hybrid noise. This process is named Frad (Fractional Denoising) and is proved to be equivalent to learning an approximate force field. c: We apply Frad during fine-tuning on the MD17 dataset. Specifically, fractional denoising is added as an auxiliary task, which is optimized simultaneously with the primary property prediction task.
  • Figure 3: Illustrations to aid the proof of Proposition 1.1 (Noise type transformation). Left: Three rotatable bonds in aspirin. Middle: When changing the dihedral angle $\psi_1$, the atoms move along a circular arc, e.g. $A\rightarrow A'$ and $B \rightarrow B'$. Right: When considering the change of all dihedral angles, we can define a breadth-first order to traverse all rotatable bonds in the tree structure of aspirin, e.g. $(\psi_1,\psi_3,\psi_2)$, consider their effects on the coordinates one by one, and then add the effects together.
  • Figure 4: Illustrations to aid the proof of the lemma on the linear approximation error.

Theorems & Definitions (20)

  • Proposition 3.1: Noise Type Transformation
  • Proposition 3.2: The Conformation Distribution Corresponding to Dihedral Angle Noise
  • Proposition 3.3: The Conformation Distribution Corresponding to Hybrid Noise
  • Proposition 3.4: Fractional Denoising Score Matching
  • Proposition 1.1: Noise type transformation
  • proof
  • Lemma 1.2
  • proof
  • Lemma 1.3
  • proof
  • ...and 10 more