FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D ligand generation and affinity prediction
Julian Cremer, Tuan Le, Mohammad M. Ghahremanpour, Emilia Sługocka, Filipe Menezes, Djork-Arné Clevert
TL;DR
FLOWR.root is an $SE(3)$-equivariant flow-matching foundation model that jointly learns protein-pocket structure, ligand geometry, and binding affinity, enabling pocket-conditioned 3D ligand generation with confidence estimation. It is trained in three stages (large-scale pretraining, high-fidelity fine-tuning, and project-specific adaptation) and supports de novo, interaction-guided, and fragment-based generation, with inference-time steering via importance sampling to bias generation toward higher potency. The model achieves state-of-the-art performance in unconditional and pocket-conditioned generation across multiple benchmarks, demonstrates robust affinity predictions for $\mathrm{pIC}_{50}$, $\mathrm{p}K_i$, $\mathrm{p}K_d$, and $\mathrm{pEC}_{50}$, and agrees well with quantum-mechanical calculations in case studies. The integrated framework offers a practical, adaptable foundation for structure-based drug design that can be refined continuously with project data. It does rely on pocket quality and high-fidelity affinity data, and it benefits from targeted fine-tuning for new SAR landscapes.
Abstract
We present FLOWR.root, an equivariant flow-matching model for pocket-aware 3D ligand generation with joint binding affinity prediction and confidence estimation. The model supports de novo generation, pharmacophore-conditional sampling, fragment elaboration, and multi-endpoint affinity prediction (pIC50, pKi, pKd, pEC50). Training combines large-scale ligand libraries with mixed-fidelity protein-ligand complexes, followed by refinement on curated co-crystal datasets and parameter-efficient finetuning for project-specific adaptation. FLOWR.root achieves state-of-the-art performance in unconditional 3D molecule generation and pocket-conditional ligand design, producing geometrically realistic, low-strain structures. The integrated affinity prediction module demonstrates superior accuracy on the SPINDR test set and outperforms recent models on the Schrödinger FEP+/OpenFE benchmark with substantial speed advantages. As a foundation model, FLOWR.root requires finetuning on project-specific datasets to account for unseen structure-activity landscapes; once finetuned, its predictions correlate strongly with experimental data. Joint generation and affinity prediction enable inference-time scaling through importance sampling, steering molecular design toward higher-affinity compounds. Case studies validate this: selective CK2α ligand generation against CLK3 shows significant correlation between predicted and quantum-mechanical binding energies, while ERα, TYK2 and BACE1 scaffold elaboration demonstrates strong agreement with QM calculations. By integrating structure-aware generation, affinity estimation, and property-guided sampling, FLOWR.root provides a comprehensive foundation for structure-based drug design spanning hit identification through lead optimization.
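To make the idea of affinity-guided importance sampling concrete, the following is a minimal sketch of generic self-normalized importance resampling over a batch of already generated candidates. It is not the authors' implementation: the abstract does not specify how steering is applied during generation, so this sketch assumes a simplified post hoc setting. The function names, the `temperature` parameter, the ligand identifiers, and the pIC50 values are all hypothetical and chosen only for illustration.

```python
import math
import random


def importance_weights(scores, temperature=1.0):
    """Return self-normalized weights proportional to exp(score / temperature).

    `scores` stands in for predicted affinities (e.g. pIC50); this choice of
    weighting function is an assumption, not taken from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    unnormalized = [math.exp((s - top) / temperature) for s in scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def resample(candidates, scores, k, temperature=1.0, seed=0):
    """Draw k candidates with replacement, favoring higher predicted scores."""
    rng = random.Random(seed)
    weights = importance_weights(scores, temperature)
    return rng.choices(candidates, weights=weights, k=k)


# Hypothetical batch of generated ligands with made-up predicted pIC50 values.
candidates = ["lig_a", "lig_b", "lig_c", "lig_d"]
predicted_pic50 = [5.1, 6.4, 7.9, 6.0]

weights = importance_weights(predicted_pic50, temperature=0.5)
selected = resample(candidates, predicted_pic50, k=3, temperature=0.5)
```

In this sketch a lower `temperature` concentrates the resampled set on the highest-scoring candidates, while a higher value keeps it closer to the original batch, which is one simple way to trade potency bias against diversity.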
