A Generalist Cross-Domain Molecular Learning Framework for Structure-Based Drug Discovery
Yiheng Zhu, Mingyang Li, Junlong Liu, Kun Fu, Jiansheng Wu, Qiuyi Li, Mingze Yin, Jieping Ye, Jian Wu, Zheng Wang
TL;DR
The paper introduces BIT, a generalist cross-domain molecular learning framework that unifies small molecules, proteins, and protein-ligand complexes within a single Transformer backbone. Mixture-of-Domain-Experts (MoDE) and Mixture-of-Structure-Experts (MoSE) give BIT domain-specific encoding while still capturing cross-domain interactions, and the model is pre-trained on 2D and 3D data with unified coordinate and token denoising objectives. Across binding affinity prediction, structure-based virtual screening, and molecular property prediction, the paper reports state-of-the-art performance and favorable inference efficiency, together with a real-world validation in NMDA receptor screening. The work highlights BIT’s potential to accelerate structure-based drug discovery through cross-domain representation learning, scalable pre-training, and versatile fine-tuning for diverse SBDD tasks.
Abstract
Structure-based drug discovery (SBDD) is a systematic scientific process that develops new drugs by leveraging the detailed physical structure of the target protein. Recent pre-trained models for biomolecules have demonstrated remarkable success across various biochemical applications, including drug discovery and protein engineering. However, most of these models focus on the characteristics of either small molecules or proteins, without modeling their binding interactions, the cross-domain relationships that are pivotal to SBDD. To fill this gap, we propose a general-purpose foundation model named BIT (Biomolecular Interaction Transformer), which can encode a range of biochemical entities, including small molecules, proteins, and protein-ligand complexes, in various data formats, encompassing both 2D and 3D structures. Specifically, we introduce Mixture-of-Domain-Experts (MoDE) to handle biomolecules from diverse biochemical domains and Mixture-of-Structure-Experts (MoSE) to capture positional dependencies in molecular structures. This mixture-of-experts approach enables BIT to achieve both deep fusion and domain-specific encoding, effectively capturing fine-grained molecular interactions within protein-ligand complexes. We then perform cross-domain pre-training on the shared Transformer backbone via several unified self-supervised denoising tasks. Experimental results on various benchmarks demonstrate that BIT achieves exceptional performance in downstream tasks, including binding affinity prediction, structure-based virtual screening, and molecular property prediction.
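To make the idea of domain-specific encoding on a shared backbone more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that each token carries a domain label and is routed to that domain's feed-forward expert (hard routing); the abstract does not specify the routing rule. The shared attention layers and MoSE are omitted, and all names here (mode_layer, experts, the "ligand" and "protein" labels, the layer sizes) are hypothetical.

```python
import math
import random

def make_linear(d_in, d_out, rng):
    """Return a random d_in x d_out weight matrix as nested lists."""
    return [[rng.gauss(0.0, 0.1) for _ in range(d_out)] for _ in range(d_in)]

def apply_linear(x, w):
    """Multiply a single token vector x by the weight matrix w."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]

def ffn(x, w1, w2):
    """Two-layer feed-forward network with a tanh nonlinearity."""
    hidden = [math.tanh(h) for h in apply_linear(x, w1)]
    return apply_linear(hidden, w2)

def mode_layer(tokens, domains, experts):
    """Route each token to the expert of its domain, then add a residual.

    Assumption: hard routing by a known domain label per token.
    """
    out = []
    for x, dom in zip(tokens, domains):
        w1, w2 = experts[dom]
        y = ffn(x, w1, w2)
        out.append([xi + yi for xi, yi in zip(x, y)])
    return out

rng = random.Random(0)
d_model, d_hidden = 4, 8  # hypothetical sizes, chosen only for illustration

# One feed-forward expert per biochemical domain (hypothetical labels).
experts = {
    name: (make_linear(d_model, d_hidden, rng), make_linear(d_hidden, d_model, rng))
    for name in ("ligand", "protein")
}

# A toy protein-ligand complex: two ligand tokens followed by three protein tokens.
tokens = [[rng.gauss(0.0, 1.0) for _ in range(d_model)] for _ in range(5)]
domains = ["ligand", "ligand", "protein", "protein", "protein"]

outputs = mode_layer(tokens, domains, experts)
print(len(outputs), len(outputs[0]))
```

In this sketch, the tokens of a complex share one sequence (and, in a full model, would share the attention layers), while the feed-forward parameters differ by domain. This is one simple way to combine fused and domain-specific encoding under the stated assumptions.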
