MSCoD: An Enhanced Bayesian Updating Framework with Multi-Scale Information Bottleneck and Cooperative Attention for Structure-Based Drug Design

Long Xu, Yongcai Chen, Fengshuo Liu, Yuzhong Peng

TL;DR

MSCoD is a Bayesian updating framework for structure-based drug design that models protein–ligand interactions at multiple scales. It introduces a Multi-Scale Information Bottleneck (MSIB) for semantic compression across scales and Multi-Head Cooperative Attention (MHCA) to capture asymmetric protein-to-ligand interactions within SE(3)-equivariant networks. On CrossDocked2020 and the challenging KRAS G12D target, the authors report better binding affinity, drug-likeness, and geometric plausibility than prior methods, and the MSIB and MHCA modules also transfer to drug–target affinity prediction. These results suggest that MSCoD is a robust, transferable framework for structure-based drug design with practical relevance for pharmaceutical discovery.

Abstract

Structure-Based Drug Design (SBDD) is a powerful strategy in computational drug discovery that uses three-dimensional protein structures to guide the design of molecules with improved binding affinity. However, capturing complex protein-ligand interactions across multiple scales remains challenging, as current methods often overlook the hierarchical organization and intrinsic asymmetry of these interactions. To address these limitations, we propose MSCoD, a novel Bayesian updating-based generative framework for structure-based drug design. MSCoD introduces a Multi-Scale Information Bottleneck (MSIB), which performs semantic compression at multiple abstraction levels for efficient hierarchical feature extraction. It also introduces a Multi-Head Cooperative Attention (MHCA) mechanism, which employs asymmetric protein-to-ligand attention to capture diverse interaction types while addressing the dimensionality disparity between proteins and ligands. Empirical studies show that MSCoD outperforms state-of-the-art methods on the benchmark dataset. Its real-world applicability is supported by case studies on difficult targets such as KRAS G12D (7XKJ). Additionally, the MSIB and MHCA modules prove transferable, boosting the performance of GraphDTA on standard drug-target affinity prediction benchmarks (Davis and KIBA). The code and data underlying this article are freely available at https://github.com/xulong0826/MSCoD.
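To make the idea of asymmetric protein-to-ligand attention concrete, the following is a minimal, illustrative sketch and not the authors' implementation. It assumes that ligand atoms supply the queries and protein atoms supply the keys and values, so information flows only from protein to ligand, and that a sigmoid gate controls how much of the attended message is added to each ligand feature. All names (`cross_attention_head`, `gated_multi_head`, the weight matrices, and the toy dimensions) are hypothetical, the weights are random rather than learned, and the sketch ignores 3D coordinates, so it is not SE(3)-equivariant.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Random weights standing in for learned parameters (assumption)."""
    return [[rng.gauss(0.0, 1.0) / math.sqrt(cols) for _ in range(cols)]
            for _ in range(rows)]


def cross_attention_head(ligand, protein, Wq, Wk, Wv):
    """One head: each ligand atom queries all protein atoms."""
    keys = [matvec(Wk, p) for p in protein]
    values = [matvec(Wv, p) for p in protein]
    d = len(keys[0])
    outputs, weights = [], []
    for l in ligand:
        q = matvec(Wq, l)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)  # attention over protein atoms
        ctx = [sum(wj * v[i] for wj, v in zip(w, values)) for i in range(d)]
        outputs.append(ctx)
        weights.append(w)
    return outputs, weights


def gated_multi_head(ligand, protein, heads, Wo, Wg):
    """Concatenate heads, project, then apply a gated residual update."""
    per_head = [cross_attention_head(ligand, protein, *h)[0] for h in heads]
    updated = []
    for i, l in enumerate(ligand):
        concat = [x for head_out in per_head for x in head_out[i]]
        msg = matvec(Wo, concat)
        # The gate sees both the ligand feature and the incoming message.
        gate = [1.0 / (1.0 + math.exp(-g)) for g in matvec(Wg, l + msg)]
        updated.append([li + gi * mi for li, gi, mi in zip(l, gate, msg)])
    return updated


# Toy example: 3 ligand atoms (4 features) and 5 protein atoms (6 features).
rng = random.Random(0)
d_lig, d_prot, d_head, n_heads = 4, 6, 3, 2
ligand = [[rng.gauss(0, 1) for _ in range(d_lig)] for _ in range(3)]
protein = [[rng.gauss(0, 1) for _ in range(d_prot)] for _ in range(5)]
heads = [(rand_matrix(d_head, d_lig, rng),
          rand_matrix(d_head, d_prot, rng),
          rand_matrix(d_head, d_prot, rng)) for _ in range(n_heads)]
Wo = rand_matrix(d_lig, n_heads * d_head, rng)
Wg = rand_matrix(d_lig, 2 * d_lig, rng)
updated = gated_multi_head(ligand, protein, heads, Wo, Wg)
```

Because only the ligand features are updated, the sketch is asymmetric by construction: the protein representation is read but never modified. Separate query and key/value projections also let the ligand and protein features have different widths, which is one simple way to handle a mismatch in representation size between the two.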

Paper Structure

This paper contains 23 sections, 15 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of the MSCoD framework showing the integration of multi-scale information bottleneck and multi-head cooperative attention mechanism for enhanced molecular generation. This method combines Bayesian updating with sophisticated attention mechanisms to model protein-ligand interactions effectively.
  • Figure 2: Detailed architecture of MSCoD modules. (a) Multi-Scale Information Bottleneck (MSIB) modules: illustrates the hierarchical semantic compression and feature fusion process for protein and ligand representations. (b) Multi-Head Cooperative Attention (MHCA) mechanism: shows the asymmetric attention flow from protein to ligand features, the multi-head projection strategy, dynamic gating mechanisms, and the residual processing pipeline. Each attention head specializes in capturing different aspects of protein-ligand interactions, with the final integration producing enhanced ligand representations that better reflect binding site compatibility and molecular optimization requirements.
  • Figure 3: Ten representative ligand molecules generated by MSCoD with excellent QED and SA properties and superior Vina docking scores.
  • Figure 4: Bond length distributions of reference molecules and of molecules generated by autoregressive models (upper row) and non-autoregressive models (lower row) for the five most frequent bond types.
  • Figure 5: Bond angle distributions of generated molecules compared with reference molecules.
  • ...and 2 more figures