Advancing Drug Discovery with Enhanced Chemical Understanding via Asymmetric Contrastive Multimodal Learning
Yifei Wang, Yunrui Li, Lin Liu, Pengyu Hong, Hao Xu
TL;DR
This work introduces Asymmetric Contrastive Multimodal Learning (ACML) for molecules, enabling cross-modal knowledge transfer from pre-trained chemical modalities into a shallow graph encoder to improve molecular representations for drug discovery. By freezing unimodal encoders (e.g., SMILES, images, NMR, GCMS/LCMS) and training a 5-layer graph encoder with asymmetric contrastive learning, ACML achieves expressive, interpretable embeddings while maintaining training efficiency. Across cross-modality retrieval, isomer discrimination, and molecular-property prediction on MoleculeNet and TDC, ACML demonstrates superior or competitive performance and reveals chemical semantics embedded in graph representations. The results highlight modality-specific strengths, efficient training, and enhanced interpretability, underscoring ACML’s potential to advance AI-driven chemical research and drug discovery.
Abstract
Multimodal deep learning holds considerable promise for scientific research and practical applications, and cross-modal analysis in particular can open new directions in chemical understanding and drug discovery. We introduce Asymmetric Contrastive Multimodal Learning (ACML), an approach designed to improve molecular understanding and accelerate drug discovery. ACML uses asymmetric contrastive learning to transfer information from various chemical modalities to molecular graph representations. By combining pre-trained chemical unimodal encoders with a shallow, 5-layer graph encoder, ACML assimilates coordinated chemical semantics from different modalities, yielding comprehensive representation learning with efficient training. We demonstrate the effectiveness of this framework through large-scale cross-modality retrieval and isomer discrimination tasks. ACML also enhances interpretability by revealing chemical semantics in graph representations and strengthens the expressive power of graph neural networks, as evidenced by improved performance on molecular property prediction tasks from MoleculeNet and Therapeutics Data Commons (TDC). These results indicate ACML's potential to advance molecular representation learning, offering deeper insight into the chemical semantics of diverse modalities and supporting further progress in chemical research and drug discovery.
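To make the training objective concrete, the following is a minimal sketch of a contrastive loss between graph embeddings and embeddings from a frozen modality encoder. It assumes a CLIP-style InfoNCE objective over cosine similarities, which the abstract does not specify; the function names, the temperature value, and the toy vectors are all hypothetical and are not taken from the ACML implementation. The asymmetry here lies only in how the loss would be used: the modality embeddings are treated as fixed constants, and only the graph encoder would be updated.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(graph_emb, modality_emb, temperature=0.1):
    """InfoNCE averaged over both directions (graph->modality, modality->graph).

    graph_emb:    outputs of the trainable graph encoder, one vector per molecule.
    modality_emb: outputs of a frozen, pre-trained unimodal encoder (constants).
    The temperature default is an illustrative assumption.
    """
    g = [l2_normalize(v) for v in graph_emb]
    m = [l2_normalize(v) for v in modality_emb]
    # Cosine-similarity logits: entry (i, j) compares graph i with modality j.
    logits = [
        [sum(a * b for a, b in zip(gi, mj)) / temperature for mj in m]
        for gi in g
    ]
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy batch of three molecules with 2-D embeddings (made-up numbers).
frozen = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]  # deliberately mismatched pairs

loss_aligned = contrastive_loss(aligned, frozen)
loss_shuffled = contrastive_loss(shuffled, frozen)
print(loss_aligned < loss_shuffled)
```

In this toy batch, graph embeddings that match their paired modality embeddings give a lower loss than mismatched ones. In a full training loop, gradients of this loss would flow only into the graph encoder, since the unimodal encoder is frozen.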
