Structure-Aware Contrastive Learning with Fine-Grained Binding Representations for Drug Discovery
Jing Lan, Hexiao Ding, Hongzhao Chen, Yufeng Jiang, Nga-Chun Ng, Gwing Kei Yip, Gerald W. Y. Cheng, Yunlin Mao, Jing Cai, Liang-ting Lin, Jung Sun Yoo
TL;DR
The paper tackles scalable drug–target interaction (DTI) prediction by embedding structural priors into sequence-based representations. It introduces a structure-aware protein vocabulary, SELFIES-based drug encoding, a patch-based attention trunk, and a contrastive cross-modal objective paired with a bilinear attention module to model interactions. The approach achieves state-of-the-art results on the Human and BioSNAP DTI benchmarks, remains competitive on BindingDB, and outperforms prior methods in virtual screening on LIT-PCBA (AUROC 68.16% and BEDROC 13.35%). Ablation studies confirm the importance of learned aggregation, bilinear attention, and contrastive alignment, while embedding visualizations show interpretable attention over ligand–residue contacts. Together, these results support the method's robustness and interpretability for scalable, structure-informed DTI pre-screening.
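The summary above does not spell out the contrastive cross-modal objective, so the following is only a minimal sketch of one common choice, a symmetric InfoNCE loss over paired drug and protein embeddings. The function names, the temperature value of 0.07, and the use of in-batch negatives are assumptions made for illustration and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_info_nce(drug_emb, prot_emb, temperature=0.07):
    """Hypothetical contrastive loss: drug_emb[i] and prot_emb[i] are a positive
    pair; every other pairing in the batch is treated as a negative."""
    d = [l2_normalize(v) for v in drug_emb]
    p = [l2_normalize(v) for v in prot_emb]
    n = len(d)
    # Temperature-scaled cosine similarity between every drug and every protein.
    sim = [[sum(a * b for a, b in zip(d[i], p[j])) / temperature
            for j in range(n)] for i in range(n)]
    # Drug-to-protein direction: each row should peak on its diagonal entry.
    d2p = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Protein-to-drug direction: same computation on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    p2d = -sum(log_softmax(sim_t[i])[i] for i in range(n)) / n
    return 0.5 * (d2p + p2d)


# Matched pairs yield a much smaller loss than mismatched pairs.
aligned = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Under these assumptions, minimizing the loss pulls each drug embedding toward its paired protein embedding and pushes it away from the other proteins in the batch, which is the general idea behind contrastive alignment of two modalities.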
Abstract
Accurate identification of drug-target interactions (DTI) remains a central challenge in computational pharmacology, and sequence-based methods are attractive because they scale to large compound libraries. This work introduces a sequence-based DTI framework that integrates structural priors into protein representations while maintaining high-throughput screening capability. Evaluated across multiple benchmarks, the model achieves state-of-the-art performance on the Human and BioSNAP datasets and remains competitive on BindingDB. In virtual screening, it surpasses prior methods on LIT-PCBA, yielding substantial gains in AUROC and BEDROC. Ablation studies confirm the critical role of learned aggregation, bilinear attention, and contrastive alignment in predictive robustness. Embedding visualizations reveal improved spatial correspondence with known binding pockets and highlight interpretable attention patterns over ligand-residue contacts. These results support the framework's utility for scalable, structure-aware DTI prediction.
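For readers less familiar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed from binary labels and predicted scores, using the pairwise-ranking definition. The function name and the toy data are illustrative assumptions; this is not the paper's evaluation code, and BEDROC, which additionally weights early-ranked actives, is not covered here.

```python
def auroc(labels, scores):
    """Fraction of (active, inactive) pairs in which the active is scored
    higher; ties count as half. Equivalent to the area under the ROC curve."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one active and one inactive")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy screening list: two actives (label 1) and two inactives (label 0).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_auroc = auroc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly
```

The quadratic loop keeps the definition explicit; for large screening libraries a rank-based implementation would be preferable.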
