Representational Alignment with Chemical Induced Fit for Molecular Relational Learning

Peiliang Zhang; Jingling Yuan; Qing Xie; Yongjun Zhu; Lin Li

TL;DR

This work targets instability in Molecular Relational Learning caused by attention-based inductive biases that lack chemical-domain guidance. It introduces ReAlignFit, a chemistry-informed framework that combines SRIN for substructure encoding with a Dynamic Representational Alignment Module (DRAM) that incorporates a Bias Correction Function and Subgraph Information Bottleneck to dynamically align core substructures during induced-fit-like interactions. Theoretical analysis links stability to core-confounding substructure separation, and the model optimizes a loss incorporating prediction and calibrated mutual-information terms. Empirical results on nine datasets across MI and DDI tasks show improved predictive performance and, critically, enhanced stability under rule-shifted and scaffold-shifted distributions. This approach demonstrates the practical potential of domain-guided, dynamic representational alignment for robust molecular reasoning.
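To make the "scaffold-shifted" evaluation setting concrete, the following is a minimal sketch of a split in which no scaffold appears in both the training and test sets. It is an illustration only, not the paper's data pipeline: the function name `group_disjoint_split`, the toy records, and the scaffold labels are all hypothetical, and in practice the scaffold key would come from a cheminformatics toolkit (for example, Bemis-Murcko scaffolds) rather than hand-written strings.

```python
from collections import defaultdict


def group_disjoint_split(items, group_of, test_fraction=0.2):
    """Split items so that no group (e.g. a scaffold) occurs in both parts.

    `group_of` maps an item to its group key. This is a hypothetical helper,
    not part of any library or of the ReAlignFit code base.
    """
    groups = defaultdict(list)
    for item in items:
        groups[group_of(item)].append(item)

    # Heuristic: assign the largest groups to train first, so that the
    # rarest groups end up in the test set (a distribution shift).
    ordered = sorted(groups.values(), key=len, reverse=True)
    train_budget = len(items) * (1 - test_fraction)

    train, test = [], []
    for members in ordered:
        if len(train) + len(members) <= train_budget:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


# Toy records: (molecule id, scaffold label). Labels are made up for illustration.
records = [
    ("m1", "benzene"), ("m2", "benzene"), ("m3", "benzene"), ("m4", "pyridine"),
    ("m5", "pyridine"), ("m6", "indole"), ("m7", "furan"), ("m8", "benzene"),
]
train, test = group_disjoint_split(records, group_of=lambda r: r[1], test_fraction=0.25)
train_scaffolds = {scaffold for _, scaffold in train}
test_scaffolds = {scaffold for _, scaffold in test}
print(train_scaffolds.isdisjoint(test_scaffolds))
```

A model evaluated this way is tested on scaffolds it never saw during training, which is the kind of shift under which the TL;DR reports improved stability.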

Abstract

Molecular Relational Learning (MRL) is widely applied in the natural sciences to predict relationships between molecular pairs by extracting structural features. The representational similarity between substructure pairs determines the functional compatibility of molecular binding sites. However, aligning substructure representations with attention mechanisms lacks guidance from chemical knowledge, which leads to unstable model performance on data shifted in chemical space (e.g., functional group or scaffold shifts). With theoretical justification, we propose Representational Alignment with Chemical Induced Fit (ReAlignFit) to enhance the stability of MRL. ReAlignFit dynamically aligns substructure representations in MRL by introducing an inductive bias based on chemical Induced Fit. In the induction process, we design the Bias Correction Function, based on substructure edge reconstruction, to align representations between substructure pairs by simulating chemical conformational changes (the dynamic combination of substructures). In the fit process, ReAlignFit further integrates the Subgraph Information Bottleneck to refine and optimize substructure pairs that exhibit high chemical functional compatibility, and uses them to generate molecular embeddings. Experimental results on nine datasets demonstrate that ReAlignFit outperforms state-of-the-art models on two tasks and significantly enhances the model's stability under both rule-shifted and scaffold-shifted data distributions.
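For orientation, the generic subgraph information bottleneck objective from the literature can be written as follows. This is a sketch of the standard form and not necessarily ReAlignFit's exact loss: the symbols $G$ (input molecular graph), $G_{\mathrm{sub}}$ (selected substructure), $Y$ (prediction target), and the trade-off weight $\beta$ are assumptions introduced here, and the abstract does not state how the paper adapts the terms to molecular pairs.

```latex
\min_{G_{\mathrm{sub}}} \; -\, I\!\left(Y;\, G_{\mathrm{sub}}\right) \;+\; \beta \, I\!\left(G;\, G_{\mathrm{sub}}\right), \qquad \beta > 0
```

The first term rewards substructures that are informative about the target, while the second penalizes retaining information from the full graph, so the selected substructure is pushed toward a compact, predictive core.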
