E2CB2former: Effective and Explainable Transformer for CB2 Receptor Ligand Activity Prediction
Jiacheng Xie, Yingrui Ji, Linghuan Zeng, Xi Xiao, Gaofei Chen, Lijing Zhu, Joyanta Jyoti Mondal, Jiansheng Chen
TL;DR
CB2former addresses CB2 receptor ligand activity prediction by fusing a Graph Convolutional Network with a Transformer and injecting receptor-specific knowledge via unlearnable prompt tokens. The model leverages SMILES sequences and Morgan fingerprints, with self-attention revealing receptor-critical substructures, delivering both accuracy ($R^2 = 0.685$, $\mathrm{RMSE} = 0.675$, $\mathrm{AUC} = 0.940$) and interpretability. Across benchmarks and cross-validation, CB2former outperforms a wide range of baselines, demonstrating robust generalization and providing actionable insights for virtual screening and lead optimization. This work highlights the potential of explainable AI in cheminformatics to accelerate CB2 drug discovery while outlining avenues for dataset expansion and pretraining-based enhancements.
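For readers less familiar with the three reported metrics, the following is a minimal sketch of how $R^2$, RMSE, and AUC are conventionally computed. It is not the authors' evaluation code: the function names and the toy activity values are assumptions made for illustration, and the source does not state how ligands are labeled active or inactive for the AUC, so the binary labels here are hypothetical.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted activity."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def roc_auc(labels, scores):
    """AUC as the probability that a random active outscores a random
    inactive (ties count as half), i.e. the Mann-Whitney formulation."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy values (hypothetical, not taken from the paper's dataset).
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]
print(r_squared(measured, predicted), rmse(measured, predicted))

# Hypothetical active (1) / inactive (0) labels with model scores.
print(roc_auc([0, 0, 1, 1], [5.5, 6.8, 6.5, 8.0]))
```

The pairwise AUC above is quadratic in the number of compounds, which is fine for a sketch; a rank-based implementation would be preferable on larger sets.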
Abstract
Accurate prediction of CB2 receptor ligand activity is pivotal for advancing drug discovery targeting this receptor, which is implicated in inflammation, pain management, and neurodegenerative conditions. Although conventional machine learning and deep learning techniques have shown promise, their limited interpretability remains a significant barrier to rational drug design. In this work, we introduce CB2former, a framework that combines a Graph Convolutional Network (GCN) with a Transformer architecture to predict CB2 receptor ligand activity. By leveraging the Transformer's self-attention mechanism alongside the GCN's structural learning capability, CB2former not only enhances predictive performance but also offers insights into the molecular features underlying receptor activity. We benchmark CB2former against diverse baseline models (Random Forest, Support Vector Machine, K-Nearest Neighbors, Gradient Boosting, Extreme Gradient Boosting, Multilayer Perceptron, Convolutional Neural Network, and Recurrent Neural Network) and demonstrate its superior performance, with an $R^2$ of 0.685, an RMSE of 0.675, and an AUC of 0.940. Moreover, attention weight analysis reveals key molecular substructures influencing CB2 receptor activity, underscoring the model's potential as an interpretable AI tool for drug discovery. This ability to pinpoint critical molecular motifs can streamline virtual screening, guide lead optimization, and expedite therapeutic development. Overall, our results show the potential of advanced AI approaches, exemplified by CB2former, to deliver both accurate predictions and actionable molecular insights, fostering interdisciplinary collaboration and innovation in drug discovery.
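To make the idea of attention weight analysis concrete, here is a minimal sketch of scaled dot-product attention over a handful of SMILES fragments, followed by a ranking of fragments by weight. It illustrates the general mechanism only and is not CB2former's implementation: the fragment list, the two-dimensional embeddings, the query vector, and the helper names are all hypothetical toy choices.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over all keys."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    return softmax(scores)

def rank_tokens(tokens, query, keys):
    """Pair each token with its attention weight, highest weight first."""
    weights = attention_weights(query, keys)
    return sorted(zip(tokens, weights), key=lambda tw: tw[1], reverse=True)

# Hypothetical SMILES fragments with toy key embeddings (not model outputs).
tokens = ["C", "N", "C(=O)N", "c1ccccc1"]
keys = [[0.1, 0.3], [0.2, 0.1], [2.0, 0.5], [1.0, 0.2]]
query = [1.0, 0.0]  # toy query vector, e.g. a pooled molecule representation

for token, weight in rank_tokens(tokens, query, keys):
    print(f"{token:10s} {weight:.3f}")
```

In a trained model the queries and keys come from learned projections, and weights are usually aggregated across heads and layers before being mapped back to atoms or substructures; the ranking step itself is the same.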
