MC-GNNAS-Dock: Multi-criteria GNN-based Algorithm Selection for Molecular Docking

Siyuan Cao, Hongxuan Wu, Jiabao Brad Wang, Yiliang Yuan, Mustafa Misir

TL;DR

MC-GNNAS-Dock tackles the problem that no single molecular docking algorithm dominates across contexts by introducing a multi-criteria, ranking-aware graph neural network-based algorithm selector. It extends prior GNNAS-Dock with a composite evaluation that jointly optimizes geometric accuracy via RMSD and chemical validity via PoseBusters, and incorporates ranking losses (Pairwise Logistic and NDCG-Loss2) to emphasize correct algorithm ordering. The approach uses a dual-encoder architecture with residual MLPs to predict performance across a diverse portfolio of docking tools, and is trained on a ~3200-complex PDBBind-derived dataset with 10-fold cross-validation. Results show consistent improvements over the single best solver, with gains up to 5.4% on composite criteria and improvements in robustness and scalability, highlighting the practical value of multi-criteria, rank-aware selection for docking in drug discovery.
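To make the ranking-loss idea concrete, here is a minimal sketch of a generic pairwise logistic loss for a single protein-ligand complex. It is not the authors' implementation: the function name, the plain-Python formulation, and the use of a numeric quality label per algorithm are assumptions made for illustration. The loss penalizes every pair in which an algorithm with a better label receives a score that is not clearly higher than that of an algorithm with a worse label.

```python
import math

def pairwise_logistic_loss(scores, labels):
    """Generic pairwise logistic ranking loss (illustrative, hypothetical helper).

    scores: predicted scores, one per docking algorithm (higher = preferred).
    labels: ground-truth quality, one per algorithm (higher = better).
    Returns the mean of log(1 + exp(-(s_i - s_j))) over all ordered pairs
    (i, j) with labels[i] > labels[j]; 0.0 if no such pair exists.
    """
    total, n_pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                diff = scores[i] - scores[j]
                # Numerically stable softplus(-diff) = log(1 + exp(-diff)).
                total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0

# Assumed toy setup: three algorithms, only the first one succeeds.
labels = [1, 0, 0]
well_ranked = pairwise_logistic_loss([2.0, 0.0, 0.0], labels)
badly_ranked = pairwise_logistic_loss([-2.0, 0.0, 0.0], labels)
print(well_ranked < badly_ranked)
```

Because only score differences enter the loss, it rewards correct ordering of the algorithms rather than accurate absolute performance estimates, which is the property a selector needs when it picks the top-ranked algorithm.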

Abstract

Molecular docking is a core tool in drug discovery for predicting ligand-target interactions. Despite the availability of diverse search-based and machine learning approaches, no single docking algorithm consistently dominates, as performance varies by context. To overcome this challenge, algorithm selection frameworks such as GNNAS-Dock, built on graph neural networks, have been proposed. This study introduces an enhanced system, MC-GNNAS-Dock, with three key advances. First, a multi-criteria evaluation integrates binding-pose accuracy (RMSD) with validity checks from PoseBusters, offering a more rigorous assessment. Second, architectural refinements that add residual connections strengthen predictive robustness. Third, rank-aware loss functions are incorporated to sharpen rank learning. Extensive experiments are performed on a curated dataset of approximately 3200 protein-ligand complexes from PDBBind. MC-GNNAS-Dock demonstrates consistently superior performance, achieving gains of up to 5.4% (3.4%) under the composite criterion of RMSD below 1 Å (2 Å) together with PoseBusters validity, compared to the single best solver (SBS), Uni-Mol Docking V2.
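The composite criterion and the SBS baseline can be illustrated with a small sketch. Everything below is hypothetical: the algorithm names, the toy RMSD values and validity flags, and the helper functions are assumptions for illustration, not data or code from the paper. A pose counts as a success only if its RMSD is within the threshold and it passes the validity check. The SBS is the one algorithm with the best overall success rate, while the virtual best solver (VBS) is the oracle that picks a succeeding algorithm for each complex whenever one exists.

```python
def composite_success(rmsd, pb_valid, threshold=2.0):
    """A pose succeeds only if it is both accurate and valid (assumed <= cutoff)."""
    return rmsd <= threshold and pb_valid

def rate_for_algorithm(results, algo, threshold):
    """Success rate (%) of always using one fixed algorithm."""
    hits = sum(composite_success(*per_algo[algo], threshold=threshold)
               for per_algo in results.values())
    return 100.0 * hits / len(results)

def single_best_solver(results, threshold):
    """Algorithm with the highest overall success rate (the SBS)."""
    algos = next(iter(results.values())).keys()
    return max(algos, key=lambda a: rate_for_algorithm(results, a, threshold))

def virtual_best_rate(results, threshold):
    """Success rate (%) of a per-complex oracle (the VBS)."""
    hits = sum(any(composite_success(r, v, threshold=threshold)
                   for r, v in per_algo.values())
               for per_algo in results.values())
    return 100.0 * hits / len(results)

# Hypothetical toy data: complex -> algorithm -> (RMSD in angstroms, PB-valid flag).
results = {
    "c1": {"algo_a": (0.8, True),  "algo_b": (1.5, True)},
    "c2": {"algo_a": (3.1, True),  "algo_b": (0.9, True)},
    "c3": {"algo_a": (0.7, False), "algo_b": (2.5, True)},   # accurate but invalid
    "c4": {"algo_a": (1.2, True),  "algo_b": (4.0, False)},
    "c5": {"algo_a": (5.0, True),  "algo_b": (1.1, True)},
}

sbs = single_best_solver(results, threshold=2.0)
print(sbs, rate_for_algorithm(results, sbs, 2.0), virtual_best_rate(results, 2.0))
```

In this toy setting the gap between the SBS rate and the VBS rate is the headroom an algorithm selector can try to close. Complex `c3` shows why the composite criterion is stricter than RMSD alone: a pose with low RMSD still fails if it is not valid.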

Paper Structure

This paper contains 32 sections, 9 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Selection frequencies under VBS and residual (BCE) (8 algorithms).
  • Figure 2: Percentage (%) meeting $\text{RMSD} \leq 1\,\text{\AA}$ and PB-valid.