Geometric-informed GFlowNets for Structure-Based Drug Design

Grayson Lee, Tony Shen, Martin Ester

TL;DR

This work enhances structure-based drug design (SBDD) by integrating geometry-aware embeddings into GFlowNets for pocket-conditioned molecule generation. By adopting a Trioformer-based architecture that fuses protein, ligand, and pairwise geometric information, the approach yields improved docking affinities on CrossDocked2020 in both single- and multi-objective settings. Despite these gains in binding performance, the method shows limited diversity and faces training constraints, motivating future improvements in intra-ligand distance representations and docking-score normalization to better target pocket specificity. Overall, geometry-informed GFlowNets offer a promising path toward more efficient and targeted exploration of vast chemical spaces for SBDD.

Abstract

The rising cost of drug discovery and the current pace at which new drugs are discovered underscore the need for more efficient structure-based drug design (SBDD) methods. We employ Generative Flow Networks (GFlowNets) to effectively explore the vast combinatorial space of drug-like molecules, which traditional virtual screening methods fail to cover. We introduce a novel modification to the GFlowNet framework by incorporating trigonometrically consistent embeddings, previously used in tasks involving protein conformation and protein-ligand interactions, to enhance the model's ability to generate molecules tailored to specific protein pockets. We modify the existing protein conditioning used by GFlowNets, blending geometric information from both protein and ligand embeddings to achieve more geometrically consistent embeddings. Experiments on CrossDocked2020 demonstrate an improvement in the binding affinity between generated molecules and protein pockets for both single- and multi-objective tasks, compared to previous work. Additionally, we propose future work aimed at further increasing the geometric information captured in protein-ligand interactions.

Paper Structure

This paper contains 16 sections, 4 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: We learn nearby attention weights for the pairwise embeddings $h_{ij}^{PL}$ through a two-step update: (1) based on intra-protein distances $d_{jk'}^{P}$, and (2) based on intra-ligand distances $d_{ik}^{L}$. Performing both updates yields attention weights $a_{ijk'}^{(h)}$ for $h_{ij}^{PL}$ that respect both sets of distances, so the pairwise embeddings carry geometrically consistent information.
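
The two-step update described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. It is not the authors' Trioformer implementation. It assumes a single attention head (the caption's superscript $(h)$ suggests several), a simple linear penalty on distance added to the attention logits, and projection matrices shared between the two steps. The names `distance_biased_update`, `Wq`, `Wk`, `Wv`, and `scale_dist` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def distance_biased_update(h, d_prot, d_lig, Wq, Wk, Wv, scale_dist=1.0):
    """Illustrative two-step update of pairwise embeddings.

    h      : (L, P, D) pairwise embeddings h_ij (ligand atom i, protein node j)
    d_prot : (P, P) intra-protein distances d^P_{jk'}
    d_lig  : (L, L) intra-ligand distances d^L_{ik}
    Wq, Wk, Wv : (D, D) projections (assumed shared across both steps)
    """
    D = h.shape[-1]

    # Step 1: for each pair (i, j), attend over protein nodes k',
    # penalising k' that are far from j in the protein.
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    logits = np.einsum("ijd,ikd->ijk", q, k) / np.sqrt(D)   # (L, P, P)
    logits = logits - scale_dist * d_prot[None, :, :]
    a = softmax(logits, axis=-1)                            # a_{ijk'}
    h = h + np.einsum("ijk,ikd->ijd", a, v)

    # Step 2: for each pair (i, j), attend over ligand atoms k,
    # penalising k that are far from i in the ligand.
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    logits = np.einsum("ijd,kjd->ijk", q, k) / np.sqrt(D)   # (L, P, L)
    logits = logits - scale_dist * d_lig[:, None, :]
    b = softmax(logits, axis=-1)
    h = h + np.einsum("ijk,kjd->ijd", b, v)

    return h, a, b
```

In this sketch each attention row is a probability distribution over the third index, and when the query and key projections are zero the weights reduce to a softmax over negative distances, so nearer nodes receive more weight. How the actual model maps distances to attention biases is not specified in the caption and is left as an assumption here.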