FLOWR: Flow Matching for Structure-Aware De Novo, Interaction- and Fragment-Based Ligand Generation

Julian Cremer, Ross Irwin, Alessandro Tibo, Jon Paul Janet, Simon Olsson, Djork-Arné Clevert

TL;DR

FLOWR introduces a structure-aware flow-matching approach for de novo 3D ligand generation conditioned on protein pockets, achieving up to 70-fold faster inference than diffusion-based methods. It combines continuous and discrete flow matching with equivariant optimal transport and a pocket encoder to capture geometric and chemical interactions efficiently. To address data quality and leakage in benchmarks, the authors present SPINDR, a large, refined ligand–pocket dataset with rigorous preprocessing and interaction labeling. FLOWR.multi extends FLOWR to interaction- and fragment-based conditioning, enabling targeted design around predefined interaction profiles and chemical substructures without retraining, and is demonstrated on targets including 5YEA and 4MPE. Together, FLOWR, FLOWR.multi, and SPINDR establish a robust framework and benchmark for scalable, realistic structure-based drug design with strong pose accuracy, interaction recovery, and practical utility for hit expansion and fragment-based design.

Abstract

We introduce FLOWR, a novel structure-based framework for the generation and optimization of three-dimensional ligands. FLOWR integrates continuous and categorical flow matching with equivariant optimal transport, enhanced by efficient protein pocket conditioning. Alongside FLOWR, we present SPINDR, a thoroughly curated dataset comprising ligand-pocket co-crystal complexes specifically designed to address existing data quality issues. Empirical evaluations demonstrate that FLOWR surpasses current state-of-the-art diffusion- and flow-based methods in terms of PoseBusters-validity, pose accuracy, and interaction recovery, while offering a significant inference speedup, achieving up to 70-fold faster performance. In addition, we introduce FLOWR.multi, a highly accurate multi-purpose model allowing for the targeted sampling of novel ligands that adhere to predefined interaction profiles and chemical substructures for fragment-based design without the need for re-training or any re-sampling strategies.
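The continuous half of the flow matching scheme mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a linear probability path between noise and data, a fixed-step Euler integrator, and a hypothetical `predict_x1` callable standing in for the trained ligand decoder. The categorical flow for atom and bond types, the equivariant optimal transport alignment, and the pocket conditioning are all omitted.

```python
import random


def interpolate(x0, x1, t):
    """Linear probability path: x_t = (1 - t) * x0 + t * x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def euler_sample(predict_x1, x0, num_steps):
    """Integrate from t=0 (noise) to t=1 (data) with fixed-size Euler steps.

    predict_x1(x_t, t) is a hypothetical stand-in for a trained decoder:
    it returns an estimate of the clean coordinates given the noisy ones.
    """
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x1_hat = predict_x1(x, t)
        # Velocity of the linear path that ends at the predicted endpoint.
        velocity = [(p - c) / (1.0 - t) for p, c in zip(x1_hat, x)]
        x = [c + dt * v for c, v in zip(x, velocity)]
    return x


random.seed(0)
target = [1.0, -2.0, 0.5]  # toy "clean" coordinates (invented)
noise = [random.gauss(0.0, 1.0) for _ in target]


def oracle(x_t, t):
    """Toy predictor that always returns the target (assumption for the demo)."""
    return target


sample = euler_sample(oracle, noise, num_steps=20)
```

With a perfect predictor the sample lands on the target after any number of steps. With a learned predictor, the step count trades sample quality against inference time, which is the setting varied (20, 50, and 100 steps) in the Figure 2 comparison.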

Paper Structure

This paper contains 25 sections, 4 equations, 22 figures, 8 tables.

Figures (22)

  • Figure 1: Overview of Flowr. Schematic overview of the Flowr model for 3D ligand generation. A protein pocket is encoded and passed, along with the noisy ligand $l_t$, into the ligand decoder, which is trained to produce a denoised ligand $\tilde{l}_t$. Optionally, a set of desired pocket-ligand features can be incorporated. A mixed continuous and categorical flow matching integration scheme is then used to push $l_t$ towards the data distribution and generate a sample $\tilde{l}_1$. The Flowr model takes as input pocket coordinates along with atom, bond, and residue types, as well as ligand coordinates (with added noise), atom types, and bond types. Pocket features are processed through $L_{enc}$ sequential blocks consisting of equivariant self-attention and equivariant feed-forward layers, resulting in a pocket encoding. This pocket encoding is subsequently integrated via equivariant cross-attention into $L_{dec}$ blocks of equivariant self-attention that process ligand features. Finally, Flowr outputs denoised ligand coordinates, atom types, bond types, and charges. During inference, the pocket encoding is computed only once and reused for all ligand generation steps.
  • Figure 2: Comparison of Pilot and Flowr on validity and inference speed on Spindr. We compare Pilot and Flowr in terms of RDKit- and PoseBusters-validity (left) and inference speed (right, log scale). Results for Flowr are reported using three different inference step settings: 20, 50, and 100 steps. For each of the 225 targets in the Spindr test set, we generate 100 ligands and compute the average validity scores and inference time per target. Note that both RDKit- and PoseBusters-validity are evaluated on the full set of generated ligands per target. Both models are evaluated using a single NVIDIA H100 GPU.
  • Figure 3: Comparison of Pilot and Flowr on molecular properties on Spindr. We compare Pilot and Flowr in terms of strain energy (kcal/mol) and relaxation energy ($\Delta E_{\text{relax}}^{\text{xTB}}$, kcal/mol), both computed with GFN2-xTB and the ALPB implicit solvation model (top), and on logP, TPSA, number of aromatic rings, and SA score (bottom). For each of the 225 targets in the Spindr test set, we generate 100 ligands and compare the resulting distributions of both models. Red dots/lines highlight the respective mean values.
  • Figure 4: Comparison of Pilot and Flowr on AutoDock-Vina scores on Spindr. We compare Pilot and Flowr in terms of Vina scores. For each of the 225 targets in the Spindr test set, we generate 100 ligands and compare the resulting distributions of both models (left) and the mean Vina scores per target (right). Success rate denotes the mean number of ligands per target that outperform the respective reference ligand in terms of Vina score. Red dots/lines highlight the respective mean values.
  • Figure 5: Comparison of Pilot and Flowr on interaction recovery on Spindr. We compare Pilot and Flowr in terms of interaction recovery rates. Both models are trained either without explicit hydrogens (no-Hs) or with explicit hydrogens (with-Hs). The success rate is the percentage of ligands, out of 100 sampled per test-set target, for which interaction fingerprints could be retrieved. Interaction fingerprints were calculated with ProLIF.
  • ...and 17 more figures
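Two of the per-target metrics named in the captions above, the Vina success rate (Figure 4) and interaction recovery (Figure 5), can be sketched in a few lines. The captions do not give exact formulas, so the definitions below are assumptions: success rate is taken as the fraction of generated ligands with a lower (better) Vina score than the reference, and recovery as the share of reference interactions that reappear in the generated ligand. The function names, the `(residue, interaction type)` tuple encoding, and all values are invented for illustration; no ProLIF or Vina calls are made.

```python
def vina_success_rate(generated_scores, reference_score):
    """Fraction of generated ligands whose Vina score is lower (better) than the reference."""
    if not generated_scores:
        return 0.0
    better = sum(1 for s in generated_scores if s < reference_score)
    return better / len(generated_scores)


def interaction_recovery(reference, generated):
    """Share of reference interactions that also appear for the generated ligand."""
    reference = set(reference)
    if not reference:
        return 0.0
    return len(reference & set(generated)) / len(reference)


# Toy fingerprints as (residue, interaction type) pairs; invented values.
ref_ifp = {("ASP86", "HBDonor"), ("PHE80", "PiStacking"), ("LEU83", "Hydrophobic")}
gen_ifp = {("ASP86", "HBDonor"), ("LEU83", "Hydrophobic"), ("LYS33", "Cationic")}

recovery = interaction_recovery(ref_ifp, gen_ifp)  # two of three reference contacts kept
success = vina_success_rate([-8.1, -6.0, -7.4, -9.2], reference_score=-7.0)
```

Averaging these per-ligand or per-target values over the 225 test targets would give summary numbers of the kind the figures report, under the assumed definitions.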