UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design

Xiangzhe Kong, Zishen Zhang, Ziting Zhang, Rui Jiao, Jianzhu Ma, Wenbing Huang, Kai Liu, Yang Liu

TL;DR

UniMoMo addresses the fragmentation of binder design across molecular domains by unifying peptides, antibodies, and small molecules as graphs of blocks and applying a geometric latent diffusion model. It combines an iterative full-atom autoencoder with a diffusion process in a compressed latent space to generate 3D binders conditioned on a binding site, enabling cross-domain transfer. Across peptides, antibodies, small molecules, and a GPCR case, UniMoMo with all-domain training outperforms domain-specific baselines and demonstrates transferable interaction patterns. The approach offers a scalable path toward exploring diverse molecular formats for a single target, leveraging larger, more diverse data to improve design quality and generalization.

Abstract

The design of target-specific molecules such as small molecules, peptides, and antibodies is vital for biological research and drug discovery. Existing generative methods are restricted to single-domain molecules, failing to address versatile therapeutic needs or utilize cross-domain transferability to enhance model performance. In this paper, we introduce Unified generative Modeling of 3D Molecules (UniMoMo), the first framework capable of designing binders of multiple molecular domains using a single model. In particular, UniMoMo unifies the representations of different molecules as graphs of blocks, where each block corresponds to either a standard amino acid or a molecular fragment. Subsequently, UniMoMo utilizes a geometric latent diffusion model for 3D molecular generation, featuring an iterative full-atom autoencoder to compress blocks into latent space points, followed by an E(3)-equivariant diffusion process. Extensive benchmarks across peptides, antibodies, and small molecules demonstrate the superiority of our unified framework over existing domain-specific models, highlighting the benefits of multi-domain training.
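As a rough illustration of the "graph of blocks" representation described above, the following Python sketch stores each block (a standard amino acid or a molecular fragment) with its atoms and connects blocks by spatial proximity. The names `Atom`, `Block`, and `knn_block_edges`, as well as the k-nearest-neighbor edge rule on block centers, are assumptions made for this example and are not taken from the paper's implementation.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Tuple

Coord = Tuple[float, float, float]


@dataclass
class Atom:
    element: str   # chemical element symbol, e.g. "C"
    coord: Coord   # 3D position


@dataclass
class Block:
    block_type: str    # hypothetical label: an amino acid name or a fragment id
    atoms: List[Atom]

    def center(self) -> Coord:
        """Mean position of the block's atoms."""
        n = len(self.atoms)
        return tuple(sum(a.coord[i] for a in self.atoms) / n for i in range(3))


def knn_block_edges(blocks: List[Block], k: int = 2) -> List[Tuple[int, int]]:
    """Directed edges from each block to its k nearest blocks (by center distance).

    This edge rule is an assumption for illustration only.
    """
    centers = [b.center() for b in blocks]
    edges = []
    for i, ci in enumerate(centers):
        others = sorted(
            (j for j in range(len(blocks)) if j != i),
            key=lambda j: dist(ci, centers[j]),
        )
        edges.extend((i, j) for j in others[:k])
    return edges


if __name__ == "__main__":
    # Toy input: two amino-acid blocks and one fragment block.
    blocks = [
        Block("GLY", [Atom("N", (0, 0, 0)), Atom("C", (2, 0, 0))]),
        Block("fragment_0", [Atom("C", (10, 0, 0))]),
        Block("ALA", [Atom("C", (3, 0, 0))]),
    ]
    print(knn_block_edges(blocks, k=1))
```

Because peptides, antibodies, and small molecules all reduce to the same `Block` list in this sketch, a single downstream model can consume any of them without domain-specific input handling, which is the point of the unified representation.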

Paper Structure

This paper contains 38 sections, 13 equations, 6 figures, 12 tables, 5 algorithms.

Figures (6)

  • Figure 1: Given the binding site on the target protein, our UniMoMo is capable of generating diverse molecular binders, including peptides, antibodies, and small molecules.
  • Figure 2: Overview of our UniMoMo. (A) Graph of blocks as the unified representation for peptides, antibodies, and small molecules. (B) The proposed unified generative framework for 3D molecular binder design involves an iterative full-atom autoencoder and a diffusion model implemented in the latent space. The autoencoder compresses the atomic details of each block into single latent points and reconstructs them by first predicting block types for the latent points, followed by iterative generation of the full-atom geometries.
  • Figure 3: Different types of binders generated by UniMoMo for the same binding site on a GPCR (PDB ID: 8U4R). The distribution of Rosetta interface energy is calculated for 100 generated peptides and 100 generated antibodies, and the Vina score is used to evaluate the 100 generated small molecules. The orange box on the molecular topology graph highlights a structure similar to the side chain of arginine, while the red boxes denote amide connections.
  • Figure 4: Validation loss curves for the latent diffusion model trained on VAE latent spaces with varying KL weights, where $\lambda_1$ and $\lambda_2$ indicate the KL weights on the invariant and the equivariant latent variables, respectively, as in Eq. \ref{eq:kl}.
  • Figure 5: Designed peptides, antibodies, and small molecules, as well as their in silico binding affinity distributions, for the same binding site on (A) SARS receptor binding domain (PDB ID: 2GHW) and (B) cluster of differentiation 38 (PDB ID: 4CMH).
  • ...and 1 more figure