Geometric Multi-color Message Passing Graph Neural Networks for Blood-brain Barrier Permeability Prediction
Trung Nguyen, Md Masud Rana, Farjana Tasnim Mukta, Chang-Guo Zhan, Duc Duy Nguyen
TL;DR
The paper tackles BBBP prediction by addressing the limitations of topology-only GNNs that overlook three-dimensional geometry. It introduces GMC-MPNN, a geometry-aware graph neural network that uses weighted colored subgraphs to encode atom-type–specific spatial interactions and long-range effects, integrated with conventional atomic features. Evaluated on three BBBP benchmarks with scaffold-based splits, GMC-MPNN achieves state-of-the-art AUC-ROC and strong regression metrics, demonstrating robust generalization to diverse chemical scaffolds. An ablation study confirms that both common and rare atom-pair motifs contribute meaningfully to predictions, underscoring the value of geometry-informed representations in drug discovery pipelines.
Abstract
Accurate prediction of blood-brain barrier permeability (BBBP) is essential for central nervous system (CNS) drug development. While graph neural networks (GNNs) have advanced molecular property prediction, they often rely on molecular topology and neglect the three-dimensional geometric information crucial for modeling transport mechanisms. This paper introduces the geometric multi-color message-passing graph neural network (GMC-MPNN), a novel framework that enhances standard message-passing architectures by explicitly incorporating atomic-level geometric features and long-range interactions. Our model constructs weighted colored subgraphs based on atom types to capture the spatial relationships and chemical context that govern BBB permeability. We evaluated GMC-MPNN on three benchmark datasets for both classification and regression tasks, using rigorous scaffold-based splitting to ensure a robust assessment of generalization. The results demonstrate that GMC-MPNN consistently outperforms existing state-of-the-art models, achieving superior performance in both classifying compounds as permeable/non-permeable (AUC-ROC of 0.947 and 0.9212) and in regressing continuous permeability values (RMSE of 0.5628, Pearson correlation of 0.6947). An ablation study further quantified the impact of specific atom-pair interactions, revealing that the model's predictive power derives from its ability to learn from both common and rare, but chemically significant, functional motifs. By integrating spatial geometry into the graph representation, GMC-MPNN sets a new performance benchmark and offers a more accurate and generalizable tool for drug discovery pipelines.
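To make the idea of weighted colored subgraphs more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each atom is "colored" by its element type and that the weight between two atoms is a generalized exponential kernel of their Euclidean distance. The kernel form, the parameters eta and kappa, the function names, and the toy coordinates are all illustrative assumptions that the abstract does not specify.

```python
import math

def kernel(d, eta=3.0, kappa=2.0):
    # Assumed distance weight: generalized exponential decay.
    # eta (length scale) and kappa (sharpness) are hypothetical defaults.
    return math.exp(-((d / eta) ** kappa))

def per_atom_color_features(atoms, colors, eta=3.0, kappa=2.0):
    """For every atom, sum kernel weights to all other atoms of each color.

    atoms  : list of (element_symbol, (x, y, z)) tuples
    colors : element symbols that define the colored subgraphs
    Returns one dict per atom mapping color -> accumulated weight.
    """
    feats = []
    for i, (_, pos_i) in enumerate(atoms):
        row = {c: 0.0 for c in colors}
        for j, (el_j, pos_j) in enumerate(atoms):
            if i != j and el_j in row:
                row[el_j] += kernel(math.dist(pos_i, pos_j), eta, kappa)
        feats.append(row)
    return feats

# Toy three-atom geometry (coordinates in angstroms, made up for illustration).
toy_atoms = [
    ("C", (0.0, 0.0, 0.0)),
    ("N", (3.0, 0.0, 0.0)),
    ("O", (0.0, 4.0, 0.0)),
]
features = per_atom_color_features(toy_atoms, colors=("C", "N", "O"))
for (element, _), row in zip(toy_atoms, features):
    print(element, row)
```

In a sketch like this, each atom's row could be concatenated with its conventional atomic features before message passing. Because the sums run over all atoms of a given color rather than only bonded neighbors, the features can carry long-range spatial information that a purely topological graph would miss.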
