MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights

Jingjing Hu; Dan Guo; Zhan Si; Deguang Liu; Yunfeng Diao; Jing Zhang; Jinxing Zhou; Meng Wang

TL;DR

MOL-Mamba addresses the omission of electronic information from molecular graph representations with a Mamba-enhanced architecture that fuses structural graphs with electronic descriptors. The framework comprises an Atom & Fragment Mamba-Graph (MG) for fine-grained structural reasoning and a Mamba-Transformer (MT) fuser that integrates molecular structure with electronic correlations. Two training schemes guide it: Structural Distribution Collaborative Training and E-semantic Fusion Training. The problem setup uses a dual-graph representation (an atom-level graph $\mathcal{G}_A$ and a fragment-level graph $\mathcal{G}_F$) together with electronic descriptors $\mathcal{D}_E$, and the model optimizes a total loss $\mathcal{L}_{total}=\lambda_d\mathcal{L}_d+\lambda_s\mathcal{L}_s+\lambda_f\mathcal{L}_f+\lambda_{mask}\mathcal{L}_{mask}$. Across 11 datasets, MOL-Mamba outperforms state-of-the-art baselines on 8 of 11 tasks, with notable results on BBBP (ROC-AUC $=75.0$) and ESOL (MAE $=0.63$). Its efficiency is comparable to that of GNNs and its parameter efficiency is better than that of graph transformers, which suggests potential for drug design and materials discovery.
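The total loss above is a weighted sum of four terms. As a minimal sketch of that arithmetic only, the Python below combines four scalar loss values with their weights. The names `LossWeights` and `total_loss` are hypothetical, and the default weights of 1.0 are placeholders, since the summary does not state the values of $\lambda_d$, $\lambda_s$, $\lambda_f$, or $\lambda_{mask}$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Weights for the four loss terms (hypothetical container).

    The defaults of 1.0 are placeholders, not values from the paper.
    """
    d: float = 1.0     # lambda_d
    s: float = 1.0     # lambda_s
    f: float = 1.0     # lambda_f
    mask: float = 1.0  # lambda_mask


def total_loss(l_d: float, l_s: float, l_f: float, l_mask: float,
               w: LossWeights = LossWeights()) -> float:
    """Return lambda_d*L_d + lambda_s*L_s + lambda_f*L_f + lambda_mask*L_mask."""
    return w.d * l_d + w.s * l_s + w.f * l_f + w.mask * l_mask


if __name__ == "__main__":
    # Example with made-up loss values and made-up weights.
    weights = LossWeights(d=0.5, s=0.0, f=0.0, mask=2.0)
    print(total_loss(1.0, 2.0, 3.0, 4.0, weights))
```

Setting a weight to zero drops the corresponding term, which is one simple way to ablate a single training objective in a setup like this.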

Abstract

Molecular representation learning plays a crucial role in various downstream tasks, such as molecular property prediction and drug design. To accurately represent molecules, Graph Neural Networks (GNNs) and Graph Transformers (GTs) have shown potential in the realm of self-supervised pretraining. However, existing approaches often overlook the relationship between molecular structure and electronic information, as well as the internal semantic reasoning within molecules. Because this fundamental chemical knowledge is absent from the graph semantics, the resulting molecular representations are incomplete and fail to integrate structural and electronic data. To address these issues, we introduce MOL-Mamba, a framework that enhances molecular representation by combining structural and electronic insights. MOL-Mamba consists of an Atom & Fragment Mamba-Graph (MG) for hierarchical structural reasoning and a Mamba-Transformer (MT) fuser for integrating molecular structure and electronic correlation learning. Additionally, we propose two training schemes, Structural Distribution Collaborative Training and E-semantic Fusion Training, to further enhance molecular representation learning. Extensive experiments demonstrate that MOL-Mamba outperforms state-of-the-art baselines across eleven chemical-biological molecular datasets.
