PharMolixFM: All-Atom Foundation Models for Molecular Modeling and Generation

Yizhen Luo, Jiashuo Wang, Siqi Fan, Zaiqing Nie

TL;DR

PharMolixFM addresses the challenge of generalizing all-atom foundation models across molecular modeling tasks by introducing a unified multi-modal denoising framework that jointly models atom types and coordinates under SE(3) invariance. It implements three variants (multi-modal diffusion, flow matching, and Bayesian flow networks) within a dual-branch SE(3)-equivariant architecture to enable cross-task transfer via mixed denoising priors. The framework achieves competitive performance in protein-small-molecule docking with substantially faster inference and shows favorable trade-offs in structure-based drug design, while revealing an empirical inference scaling law that guides the trade-off between computation and accuracy. This approach has the potential to accelerate discovery in structural biology and pharmacology by enabling robust, scalable all-atom generation across tasks with efficient use of compute.

Abstract

Structural biology relies on accurate three-dimensional biomolecular structures to advance our understanding of biological functions, disease mechanisms, and therapeutics. While recent advances in deep learning have enabled the development of all-atom foundation models for molecular modeling and generation, existing approaches face challenges in generalization due to the multi-modal nature of atomic data and the lack of comprehensive analysis of training and sampling strategies. To address these limitations, we propose PharMolixFM, a unified framework for constructing all-atom foundation models based on multi-modal generative techniques. Our framework includes three variants using state-of-the-art multi-modal generative models. By formulating molecular tasks as a generalized denoising process with task-specific priors, PharMolixFM achieves robust performance across various structural biology applications. Experimental results demonstrate that PharMolixFM-Diff achieves competitive prediction accuracy in protein-small-molecule docking (83.9% vs. 90.2% RMSD < 2Å, given pocket) with significantly improved inference speed. Moreover, we explore the empirical inference scaling law by introducing more sampling repeats or steps. Our code and model are available at https://github.com/PharMolix/OpenBioMed.
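The docking figures quoted above are success rates: the share of predicted ligand poses whose RMSD to the reference pose is below 2 Å. The sketch below shows one way such a metric can be computed. It is an illustration, not the authors' evaluation code: the function names `rmsd` and `success_rate` and the toy coordinates are hypothetical, and it assumes both poses share the same coordinate frame and atom ordering, with no symmetry correction.

```python
import math

def rmsd(pred, ref):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) tuples."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum squared differences over every coordinate of every atom.
    sq = sum((p - r) ** 2 for a, b in zip(pred, ref) for p, r in zip(a, b))
    return math.sqrt(sq / len(pred))

def success_rate(pairs, threshold=2.0):
    """Fraction of (predicted, reference) pose pairs with RMSD below `threshold` (in Å)."""
    hits = sum(1 for pred, ref in pairs if rmsd(pred, ref) < threshold)
    return hits / len(pairs)

# Toy two-atom poses (made-up numbers, for illustration only).
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred_good = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]  # shifted by 0.5 Å
pred_bad = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]   # shifted by 3.0 Å

rate = success_rate([(pred_good, ref), (pred_bad, ref)])
print(f"success rate at 2 Å: {rate:.2f}")
```

Here one of the two toy poses falls under the threshold, so the reported rate is one half.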

Paper Structure

This paper contains 31 sections, 19 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: An overview of the PharMolixFM framework. It decomposes molecules and proteins into atoms and performs a denoising generative process on atom coordinates, atom types, and bond types. The network architecture comprises a protein branch, a molecule branch, and independent prediction heads and confidence heads to reconstruct the original biomolecules.
  • Figure 2: The training tasks of PharMolixFM. We introduce three tasks by applying different noises to each variable.
  • Figure 3: Investigations on the inference scaling law of PharMolixFM. We show the accuracy on the protein-small-molecule docking task for all PharMolixFM models and the fitted scaling curve obtained with the least-squares method.
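
The Figure 3 caption mentions a scaling curve fitted by least squares. As a rough illustration of that kind of fit, the sketch below regresses accuracy on the logarithm of the number of sampling repeats using the closed-form ordinary least-squares solution. The log-linear form `acc = a + b * ln(n)` is an assumption made here for illustration, since the text above does not state the curve's functional form, and the data points are synthetic values generated from that form, not results from the paper.

```python
import math

def fit_log_linear(ns, accs):
    """Ordinary least-squares fit of acc = a + b * ln(n); returns (a, b)."""
    xs = [math.log(n) for n in ns]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(accs) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accs))
    b = sxy / sxx            # slope: accuracy gain per unit increase in ln(n)
    a = mean_y - b * mean_x  # intercept: fitted accuracy at n = 1
    return a, b

# Synthetic, noise-free points generated from the assumed form (not paper data).
repeats = [1, 2, 4, 8, 16]
accuracy = [0.6 + 0.05 * math.log(n) for n in repeats]

a, b = fit_log_linear(repeats, accuracy)
print(f"fitted curve: acc = {a:.3f} + {b:.3f} * ln(n)")
```

Because the synthetic points lie exactly on the assumed curve, the fit recovers the generating coefficients. With real measurements, the residuals would indicate how well a log-linear form describes the trend.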