PharmacoNet: Accelerating Large-Scale Virtual Screening by Deep Pharmacophore Modeling
Seonghwan Seo, Woo Youn Kim
TL;DR
PharmacoNet addresses the need for scalable, structure-based pre-screening by introducing a deep-learning framework that (1) performs automated protein-based pharmacophore modeling via instance segmentation, and (2) uses coarse-grained graph matching to predict ligand binding poses at the pharmacophore level, scored with a distance-likelihood function. Ground-truth pharmacophore labels are derived from the protein-ligand complexes in PDBbind, with non-covalent interactions (NCIs) identified by PLIP and grouped into seven pharmacophore types; this enables automated hotspot and pharmacophore generation. In benchmark and large-scale pre-screening experiments, PharmacoNet achieves orders-of-magnitude speedups over traditional docking methods while maintaining competitive enrichment factor (EF) and AUROC, and it generalizes well when trained on reduced data. Its main limitation is the absence of atomic-level energetics, which suggests that integrating force-field terms or atomistic ML models into the graph matching and scoring steps could further improve accuracy. Overall, PharmacoNet reveals the untapped potential of deep pharmacophore modeling for fast, generalizable, structure-based drug discovery at scale.
Abstract
As the size of accessible compound libraries expands to over 10 billion compounds, the need for more efficient structure-based virtual screening methods is growing. Various pre-screening methods have been developed for rapid screening, but there is still a lack of structure-based methods that are applicable to diverse proteins and that perform protein-ligand binding conformation prediction and scoring in an extremely short time. Here, we describe for the first time a deep-learning framework for structure-based pharmacophore modeling to address this challenge. We frame pharmacophore modeling as an instance segmentation problem, which determines each protein hotspot and the location of the corresponding pharmacophores, and we frame protein-ligand binding pose prediction as a graph-matching problem. PharmacoNet is significantly faster than state-of-the-art structure-based approaches, yet reasonably accurate with a simple scoring function. Furthermore, we show the promising result that PharmacoNet effectively retains hit candidates even at high pre-screening filtration rates. Overall, our study uncovers the hitherto untapped potential of a pharmacophore modeling approach in deep learning-based drug discovery.
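
To make the pre-screening claim concrete, the sketch below shows how hit retention and the enrichment factor could be computed when only the top-scoring fraction of a library is kept. It is a minimal illustration of the standard definitions, not the evaluation protocol used in this work; the function name `prescreen_metrics`, its parameters, and the toy scores are all hypothetical.

```python
from typing import Sequence, Tuple


def prescreen_metrics(
    scores: Sequence[float],
    is_active: Sequence[bool],
    keep_fraction: float,
) -> Tuple[float, float]:
    """Return (hit_retention, enrichment_factor) for a top-fraction filter.

    Hypothetical helper: higher scores are assumed to mean better predicted
    binding, and `keep_fraction` is the share of the library that survives
    pre-screening (0.01 corresponds to a 99% filtration rate).
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")

    n_total = len(scores)
    n_actives = sum(bool(a) for a in is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")

    # Keep at least one compound so the ratios below are well defined.
    n_keep = max(1, round(n_total * keep_fraction))

    # Rank compounds from best to worst score and count surviving actives.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    kept_actives = sum(bool(active) for _, active in ranked[:n_keep])

    # Fraction of all known actives that survive the filter.
    hit_retention = kept_actives / n_actives
    # Active rate in the kept subset relative to the active rate in the library.
    enrichment_factor = (kept_actives / n_keep) / (n_actives / n_total)
    return hit_retention, enrichment_factor


if __name__ == "__main__":
    # Toy library of 10 compounds with 3 actives (synthetic values).
    toy_scores = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
    toy_labels = [True, True, False, False, False, True, False, False, False, False]
    retention, ef = prescreen_metrics(toy_scores, toy_labels, keep_fraction=0.2)
    print(f"retention={retention:.3f}, EF={ef:.3f}")
```

In this toy case, keeping the top 20% retains two of the three actives, and the kept subset is enriched about 3.3-fold relative to random selection. An enrichment factor of 1 would mean the filter does no better than picking compounds at random.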
