Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation

Haoran Liu, Youzhi Luo, Tianxiao Li, James Caverlee, Martin Renqiang Min

TL;DR

This work tackles explicit control in 3D drug-like molecule generation by introducing E3WAE, an E(3)-equivariant Wasserstein autoencoder that factorizes the latent space into a property latent and a structure-context latent. A novel coordinate-alignment-based loss enables autoregressive, fragment-based 3D generation without external references, while a Wasserstein regularization enforces disentanglement via $\mathcal{L}_{Dis}$ and an auxiliary property predictor ensures $\mathbf{z}_p$ encodes target properties. The authors demonstrate property-targeting and context-preserving generation on GEOM-Drugs and CrossDocked2020 datasets, achieving state-of-the-art or comparable performance on multiple properties (e.g., asphericity, QED, SAS, logP) and showing improved structure fidelity over baselines like EDM and HierDiff. They also provide a theoretical framework for disentanglement guarantees, extensive ablations, and practical demonstrations on structure-based drug design tasks, highlighting the method’s potential for precise, design-driven molecular discovery. Overall, E3WAE offers a principled pathway to controllable 3D molecule generation with explicit latent-space manipulation, enabling targeted property optimization while preserving structural context for real-world drug design applications.

Abstract

We consider the conditional generation of 3D drug-like molecules with explicit control over molecular properties, such as drug-likeness measures (e.g., Quantitative Estimate of Druglikeness or Synthetic Accessibility score) and effective binding to specific protein sites. To tackle this problem, we propose an E(3)-equivariant Wasserstein autoencoder and factorize the latent space of our generative model into two disentangled aspects: molecular properties and the remaining structural context of 3D molecules. Our model ensures explicit control over these molecular attributes while maintaining equivariance of coordinate representation and invariance of data likelihood. Furthermore, we introduce a novel alignment-based coordinate loss to adapt equivariant networks for auto-regressive de novo 3D molecule generation. Extensive experiments validate our model's effectiveness on property-guided and context-guided molecule generation, both for de novo 3D molecule design and structure-based drug discovery against protein targets.
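
The abstract does not spell out the alignment-based coordinate loss, so the following is only a minimal sketch of the general idea of aligning predicted coordinates to a reference before measuring error. It assumes a standard Kabsch alignment (rotation and translation only, no reflections) followed by a mean squared distance; the function names `kabsch_align` and `aligned_coordinate_loss` are hypothetical, and the paper's actual loss may differ.

```python
import numpy as np

def kabsch_align(pred, target):
    """Rigidly align pred (N, 3) onto target (N, 3) with the Kabsch algorithm.

    Assumption: this is a generic alignment, not the paper's exact procedure.
    """
    pred_c = pred - pred.mean(axis=0)        # center both point sets
    target_c = target - target.mean(axis=0)
    h = pred_c.T @ target_c                  # 3x3 cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Force a proper rotation (determinant +1) so no reflection is applied.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return pred_c @ rot.T + target.mean(axis=0)

def aligned_coordinate_loss(pred, target):
    """Mean squared atom-wise distance after rigid alignment."""
    aligned = kabsch_align(pred, target)
    return float(np.mean(np.sum((aligned - target) ** 2, axis=1)))

def plain_coordinate_loss(pred, target):
    """Mean squared atom-wise distance without any alignment."""
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(8, 3))
    angle = 0.7
    rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                      [np.sin(angle),  np.cos(angle), 0.0],
                      [0.0, 0.0, 1.0]])
    # The same structure in a different pose: rotated and translated.
    pred = target @ rot_z.T + np.array([1.0, -2.0, 0.5])
    print("plain:  ", plain_coordinate_loss(pred, target))
    print("aligned:", aligned_coordinate_loss(pred, target))
```

In this sketch, a prediction that differs from the reference only by a rigid motion incurs a large plain loss but a near-zero aligned loss, which is the kind of pose mismatch an alignment step is meant to remove.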

Paper Structure

This paper contains 38 sections, 20 equations, 7 figures, 9 tables, 1 algorithm.

Figures (7)

  • Figure 1: An illustration of the proposed E3WAE framework. A 3D molecule is encoded into two disentangled latent variables: the property variable $\mathbf{z}_p$ and the structural context variable $\mathbf{z}_s$ with two E(3)-equivariant encoders, respectively. The latent variables are then combined to reconstruct the molecule in an auto-regressive manner. A prediction head with the supervision of property labels is attached to $\mathbf{z}_p$ to ensure $\mathbf{z}_p$ carries information related to the property. Disentanglement of the property and structure variables is achieved through a Wasserstein autoencoder regularization loss and minimization of the MMD distance against an isotropic Gaussian distribution. The overall training objective combines a reconstruction loss, a property prediction loss, and the Wasserstein loss.
  • Figure 2: t-SNE visualization of the model's disentangled latent spaces, colored by ground-truth property values.
  • Figure 3: An illustration of the reconstruction process.
  • Figure 4: Comparison of training coordinate loss curves: proposed coordinate loss vs. original log-MSE loss without structural alignment.
  • Figure 5: Visualized 3D conformations generated by EDM.
  • ...and 2 more figures
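
The Figure 1 caption describes a training objective that combines a reconstruction loss, a property prediction loss, and a Wasserstein regularizer based on the MMD distance to an isotropic Gaussian. The sketch below illustrates how such an objective could be assembled. It is a hedged illustration, not the authors' implementation: the RBF kernel, the bandwidth, the weights `w_prop` and `w_reg`, the choice to regularize the concatenation of $\mathbf{z}_p$ and $\mathbf{z}_s$, and all function names are assumptions.

```python
import numpy as np

def rbf_kernel(x, y, bandwidth):
    """Gaussian RBF kernel matrix between rows of x (n, d) and y (m, d)."""
    sq = (np.sum(x ** 2, axis=1)[:, None]
          + np.sum(y ** 2, axis=1)[None, :]
          - 2.0 * x @ y.T)
    sq = np.maximum(sq, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd_squared(z, prior, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_zz = rbf_kernel(z, z, bandwidth).mean()
    k_pp = rbf_kernel(prior, prior, bandwidth).mean()
    k_zp = rbf_kernel(z, prior, bandwidth).mean()
    return float(k_zz + k_pp - 2.0 * k_zp)

def total_loss(recon_loss, prop_loss, z_p, z_s, rng, w_prop=1.0, w_reg=1.0):
    """Weighted sum of the three loss terms named in the Figure 1 caption.

    Assumption: the joint latent [z_p, z_s] is pushed toward an isotropic
    Gaussian, whose factorized form is one way to encourage the two parts
    to be independent of each other.
    """
    z = np.concatenate([z_p, z_s], axis=1)
    prior = rng.standard_normal(z.shape)  # samples from N(0, I)
    return recon_loss + w_prop * prop_loss + w_reg * mmd_squared(z, prior)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z_p = rng.standard_normal((200, 2))   # stand-in property latents
    z_s = rng.standard_normal((200, 2))   # stand-in structure latents
    print(total_loss(1.0, 0.5, z_p, z_s, rng))
```

The biased estimator is used here because it is always non-negative, which keeps the regularization term well-behaved in a toy setting; an unbiased estimator is an equally plausible choice.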