FLOWR: Flow Matching for Structure-Aware De Novo, Interaction- and Fragment-Based Ligand Generation
Julian Cremer, Ross Irwin, Alessandro Tibo, Jon Paul Janet, Simon Olsson, Djork-Arné Clevert
TL;DR
FLOWR introduces a structure-aware flow matching approach for de novo 3D ligand generation conditioned on protein pockets, achieving up to 70-fold faster inference than diffusion-based methods. It combines continuous and discrete flow matching with equivariant optimal transport and a pocket encoder to capture geometric and chemical interactions efficiently. To address data quality and leakage in benchmarks, the authors present SPINDR, a large, refined ligand–pocket dataset with rigorous preprocessing and interaction labeling. FLOWR.multi extends FLOWR to interaction- and fragment-based conditioning, enabling targeted design around predefined interaction profiles and chemical substructures without retraining, and is demonstrated on targets including 5YEA and 4MPE. Together, FLOWR, FLOWR.multi, and SPINDR establish a robust framework and benchmark for scalable, realistic structure-based drug design with strong pose accuracy, interaction recovery, and practical utility for hit expansion and fragment-based design.
Abstract
We introduce FLOWR, a novel structure-based framework for the generation and optimization of three-dimensional ligands. FLOWR integrates continuous and categorical flow matching with equivariant optimal transport, enhanced by efficient protein pocket conditioning. Alongside FLOWR, we present SPINDR, a thoroughly curated dataset of ligand-pocket co-crystal complexes specifically designed to address existing data quality issues. Empirical evaluations demonstrate that FLOWR surpasses current state-of-the-art diffusion- and flow-based methods in terms of PoseBusters validity, pose accuracy, and interaction recovery, while offering a significant inference speedup of up to 70-fold. In addition, we introduce FLOWR.multi, a highly accurate multi-purpose model that allows targeted sampling of novel ligands adhering to predefined interaction profiles and chemical substructures for fragment-based design, without the need for retraining or any re-sampling strategies.
