FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D ligand generation and affinity prediction

Julian Cremer, Tuan Le, Mohammad M. Ghahremanpour, Emilia Sługocka, Filipe Menezes, Djork-Arné Clevert

TL;DR

Flowr.root introduces an $SE(3)$-equivariant flow-matching foundation model that jointly learns protein-pocket structure, ligand geometry, and binding affinity, enabling pocket-conditioned 3D ligand generation with confidence estimation. It uses a three-stage training pipeline (large-scale pretraining, high-fidelity fine-tuning, and project-specific adaptation) and supports de novo, interaction-guided, and fragment-based generation, with inference-time steering via importance sampling to bias generation toward higher potency. The model achieves state-of-the-art performance in unconditional and pocket-conditioned generation across multiple benchmarks and gives robust affinity predictions for pIC$_{50}$, pK$_{i}$, pK$_{d}$, and pEC$_{50}$, while agreeing well with quantum-mechanical validation in case studies. The integrated framework offers a practical, adaptable foundation for structure-based drug design that can be refined continuously with project data. It depends on pocket quality and high-fidelity affinity data, and it benefits from targeted fine-tuning for new SAR landscapes.

Abstract

We present FLOWR:root, an equivariant flow-matching model for pocket-aware 3D ligand generation with joint binding affinity prediction and confidence estimation. The model supports de novo generation, pharmacophore-conditional sampling, fragment elaboration, and multi-endpoint affinity prediction (pIC50, pKi, pKd, pEC50). Training combines large-scale ligand libraries with mixed-fidelity protein-ligand complexes, followed by refinement on curated co-crystal datasets and parameter-efficient finetuning for project-specific adaptation. FLOWR:root achieves state-of-the-art performance in unconditional 3D molecule generation and pocket-conditional ligand design, producing geometrically realistic, low-strain structures. The integrated affinity prediction module demonstrates superior accuracy on the SPINDR test set and outperforms recent models on the Schrödinger FEP+/OpenFE benchmark with substantial speed advantages. As a foundation model, FLOWR:root requires finetuning on project-specific datasets to account for unseen structure-activity landscapes, yielding strong correlation with experimental data. Joint generation and affinity prediction enable inference-time scaling through importance sampling, steering molecular design toward higher-affinity compounds. Case studies validate this: selective CK2$\alpha$ ligand generation against CLK3 shows significant correlation between predicted and quantum-mechanical binding energies, while ER$\alpha$, TYK2 and BACE1 scaffold elaboration demonstrates strong agreement with QM calculations. By integrating structure-aware generation, affinity estimation, and property-guided sampling, FLOWR:root provides a comprehensive foundation for structure-based drug design spanning hit identification through lead optimization.
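
The steering idea in the abstract, using a predicted affinity to reweight generated ligands, can be illustrated with a minimal self-normalised importance resampling sketch. This is not the authors' implementation: the exponential weighting, the `temperature` parameter, the function names, and the ligand identifiers and pIC50 values are all assumptions made for illustration, and the published method may apply the weights during the generative trajectory rather than to finished samples.

```python
import math
import random


def importance_weights(scores, temperature=1.0):
    """Self-normalised weights proportional to exp(score / temperature).

    The exponential form and `temperature` are illustrative assumptions.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(scores)  # subtract the max for numerical stability
    unnorm = [math.exp((s - top) / temperature) for s in scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def resample(candidates, scores, n, temperature=1.0, seed=0):
    """Draw n candidates with replacement, favouring high predicted scores."""
    rng = random.Random(seed)
    weights = importance_weights(scores, temperature)
    return rng.choices(candidates, weights=weights, k=n)


# Hypothetical ligand identifiers with made-up predicted pIC50 values.
ligands = ["lig_a", "lig_b", "lig_c", "lig_d"]
predicted_pic50 = [5.0, 6.0, 7.0, 8.0]

weights = importance_weights(predicted_pic50, temperature=0.5)
steered = resample(ligands, predicted_pic50, n=1000, temperature=0.5)
```

In this sketch a lower `temperature` concentrates the resampled set on the highest-scoring candidates, loosely analogous to moving from mild to strong guidance, while a very large `temperature` approaches uniform, unguided resampling.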

Paper Structure

This paper contains 49 sections, 11 equations, 17 figures, 3 tables.

Figures (17)

  • Figure 1: Overview of the dataset generation pipeline used in this work. Dataset generation workflow comprising data filtering, curation via Schrödinger's LigPrep and PrepWizard, building of metadata-annotated internal representation, and calculation of molecule statistics.
  • Figure 2: Graphical overview of the Flowr.root framework. Flowr.root is a flow matching-based framework for joint prediction of 3D ligand structure, binding affinity, and confidence estimation. The model follows a multi-stage training paradigm: large-scale pre-training on small molecules and mixed-fidelity protein-ligand complexes, followed by high-fidelity dataset training, with optional project-specific domain adaptation. Domain adaptation is enabled through standard or LoRA-based fine-tuning, direct preference alignment, and inference-time scaling via importance sampling with multi-objective guidance. The framework supports flexible conditional generation modes including scaffold-, linker-, core-, interaction-, and functional-group-conditional generation, as well as fragment or custom substructure replacement.
  • Figure 3: Top left: Correlation plot of Flowr.root-predicted pIC$_{50}$ in kcal/mol vs. experimental pIC$_{50}$ binding affinities across protein-ligand complexes on the SPINDR test set, with shaded regions indicating 0.5 and 1 kcal/mol error boundaries, and color denoting density of predictions (darker means denser). Error bars are reported as standard deviations from five seed runs. Top right: Correlation with experimental pK$_{i}$ affinities. Bottom left: Correlation with experimental pK$_{d}$ affinities. Bottom right: Correlation when the median of all predicted affinities is used.
  • Figure 4: Top left: Inference-time steering via importance sampling on the SPINDR test set using the Flowr.root model, comparing the distribution of pIC$_{50}$ predictions of generated ligands across test set targets between unguided, mildly guided, and strongly guided sampling. Top right: PCA and UMAP depiction of chemical space, comparing unguided and strongly guided samples. Bottom rows: Distribution comparison between unguided and strongly guided samples for different chemical properties, namely molecular weight (MW), logP, number of hydrogen-bond donors (HBD) and acceptors (HBA), topological polar surface area (TPSA), number of rotatable bonds (NumRotBonds) and aromatic rings (NumAromaticRings), fraction of $sp^3$ carbons (Fsp3), drug-likeness (QED) and synthesizability (SA Score).
  • Figure 5: Top left: Correlation plot of Flowr.root-predicted pK$_{d}$ in kcal/mol vs. experimental binding affinities across protein-ligand complexes of the Schrödinger FEP+ benchmark dataset, with shaded regions indicating 0.5 and 1 kcal/mol error boundaries, and color denoting density of predictions (darker means denser). Error bars are reported as standard deviations from five seed runs. Top right: Mean values of different correlation metrics, with error bars indicating the 95% confidence interval over the five seed runs. Bottom left: Correlation plot of Flowr.root-predicted binding affinities, taken as the mean over pK$_{d}$, pK$_{i}$, and pIC$_{50}$ in kcal/mol, with the respective mean correlation values. Bottom right: Correlation statistics of the combined prediction for all affinity types.
  • ...and 12 more figures
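
The captions for Figures 3 and 5 report log-scale affinities "in kcal/mol" with 0.5 and 1 kcal/mol error bands. The source does not state which conversion was used. A common convention, assumed here, is $\Delta G = -RT \ln(10)\,\mathrm{p}K$, under which one log unit corresponds to roughly 1.36 kcal/mol at 298.15 K. The sketch below applies that assumed relation; the function names and default temperature are illustrative, and for pIC$_{50}$ the result is only an approximate free energy because IC$_{50}$ is assay-dependent rather than a thermodynamic constant.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pk_to_dg(pk, temperature_k=298.15):
    """Convert a pK-type value to a binding free energy in kcal/mol.

    Assumes dG = -R * T * ln(10) * pK; the source does not confirm
    that this convention was used.
    """
    return -R_KCAL * temperature_k * math.log(10) * pk


def dg_to_pk(dg, temperature_k=298.15):
    """Inverse of pk_to_dg under the same assumed convention."""
    return -dg / (R_KCAL * temperature_k * math.log(10))


# Free-energy equivalent of one log unit at the default temperature.
kcal_per_log_unit = -pk_to_dg(1.0)

# A 1 kcal/mol error band expressed in log units.
one_kcal_in_log_units = dg_to_pk(-1.0)
```

Under this assumption, the 1 kcal/mol band in the figures corresponds to about 0.73 log units and the 0.5 kcal/mol band to about 0.37 log units.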