Efficient Training of Transformers for Molecule Property Prediction on Small-scale Datasets
Shivesh Prakash
TL;DR
The paper tackles blood-brain barrier (BBB) permeability prediction on small datasets by introducing a GPS Transformer augmented with Self Attention. By tailoring the GPS blocks and choosing standard Self Attention over attention variants that are more prone to overfitting, the approach achieves a ROC-AUC of $78.8\%$ on the BBBP dataset, outperforming prior methods by $5.5\%$. This shows that transformer-based graph models can perform well in low-data regimes for molecule property prediction, with practical implications for streamlining CNS drug discovery. The work emphasizes careful architectural choices and data handling (e.g., stratified sampling and augmentation) to maximize performance on limited data, offering a scalable route for BBB-related cheminformatics tasks.
Abstract
The blood-brain barrier (BBB) is a protective barrier that separates the brain from the circulatory system and regulates the passage of substances into the central nervous system. Assessing the BBB permeability of potential drugs is crucial for effective drug targeting. However, traditional experimental methods for measuring BBB permeability are challenging and impractical for large-scale screening, so computational approaches to predicting BBB permeability are needed. This paper proposes a GPS Transformer architecture augmented with Self Attention, designed to perform well in the low-data regime. The proposed approach achieves state-of-the-art performance on the BBB permeability prediction task on the BBBP dataset: with a ROC-AUC of 78.8%, it surpasses the previous state of the art by 5.5%. We demonstrate that standard Self Attention coupled with the GPS Transformer performs better than other attention variants coupled with the GPS Transformer.
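The headline metric above is ROC-AUC, which can be read as the probability that a randomly chosen permeable molecule receives a higher predicted score than a randomly chosen non-permeable one. The following is a minimal sketch of that definition in plain Python, not the paper's evaluation code; the function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Compute ROC-AUC as the fraction of (positive, negative) pairs
    in which the positive example is scored higher; ties count as 0.5.

    labels: iterable of 0/1 class labels (1 = positive class)
    scores: iterable of predicted scores, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): 3 positives, 2 negatives.
toy_labels = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

This pairwise form is quadratic in the number of examples, which is acceptable for a dataset the size of BBBP; a rank-based implementation would be the usual choice for larger datasets.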
