Molecular Diffusion Models with Virtual Receptors
Matan Halfon, Eyal Rozenberg, Ehud Rivlin, Daniel Freedman
TL;DR
This work addresses Structure-Based Drug Design (SBDD) with diffusion models by introducing Virtual Receptors to compress large receptor graphs into a smaller, equivariant representation, while incorporating protein language embeddings via ESM to provide residue-level context. The approach enables conditional diffusion over ligand coordinates on a joint ligand–receptor graph, yielding faster training and inference and improved molecular-quality metrics. Experiments on the CrossDocked dataset show the method achieves top performance on multiple drug-likeness and binding metrics, with substantial speedups over standard diffusion and competitive performance relative to autoregressive and normalizing-flow baselines. Overall, the combination of Virtual Receptors and ESM embeddings offers a scalable, high-performance pathway for diffusion-based SBDD molecule generation.
Abstract
Machine learning approaches to Structure-Based Drug Design (SBDD) have proven quite fertile over the last few years. In particular, diffusion-based approaches to SBDD have shown great promise. We present a technique that expands on this diffusion approach in two crucial ways. First, we address the size disparity between the drug molecule and the target/receptor, which makes learning more challenging and inference slower. We do so through the notion of a Virtual Receptor, which is a compressed version of the receptor; it is learned so as to preserve key aspects of the structural information of the original receptor, while respecting the relevant group equivariance. Second, we incorporate a protein language embedding used originally in the context of protein folding. We experimentally demonstrate the contributions of both the virtual receptors and the protein embeddings: in practice, they lead to both better performance and significantly faster computation.
