Generalist Equivariant Transformer Towards 3D Molecular Interaction Learning
Xiangzhe Kong, Wenbing Huang, Yang Liu
TL;DR
The paper tackles cross-domain 3D molecular interaction learning by proposing a unified geometric graph of sets to represent complexes and a Generalist Equivariant Transformer (GET) that processes matrix-form, variable-size block and atom features with $E(3)$-equivariance. GET combines a bilevel attention mechanism, an equivariant feed-forward network, and equivariant layer normalization to preserve fine-grained geometric information across blocks, enabling simultaneous modeling of intra-block and inter-block interactions. The authors demonstrate superior performance over domain-specific and vanilla unified baselines on protein–protein, protein–ligand, and RNA/DNA–ligand affinity tasks, and show robust cross-domain generalization including zero-shot predictions on unseen domains. These results indicate a pathway toward universal molecular representation learning that leverages shared interaction physics across domains, with practical impact for drug discovery and biomolecular engineering.
Abstract
Many processes in biology and drug discovery involve 3D interactions between molecules, for example between two proteins or between a protein and a small molecule. Because different molecules are usually represented at different granularities, existing methods usually encode each type of molecule independently with a separate model, which makes it difficult to learn the various underlying interaction physics. In this paper, we first propose to universally represent an arbitrary 3D complex as a geometric graph of sets, which opens the way to encoding all types of molecules with one model. We then propose a Generalist Equivariant Transformer (GET) to effectively capture both domain-specific hierarchies and domain-agnostic interaction physics. Specifically, GET consists of a bilevel attention module, a feed-forward module, and a layer normalization module, each of which is E(3)-equivariant and designed to handle sets of variable size. Notably, in contrast to conventional pooling-based hierarchical models, GET retains fine-grained information at all levels. Extensive experiments on interactions between proteins, small molecules, and RNA/DNA verify the effectiveness and generalization capability of the proposed method across different domains.
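To make the "geometric graph of sets" idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each block (for instance a residue or a ligand fragment) holds a variable number of atoms, stored as a feature matrix and a coordinate matrix. The names `Block`, `GraphOfSets`, `pairwise_distances`, and `rigid_transform` are hypothetical and introduced only for illustration. The sketch checks two properties that an E(3)-equivariant model can rely on: atom-level distances between blocks are invariant under rotation, reflection, and translation, while block centroids transform along with the coordinates.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Block:
    """One node of the graph: a set of atoms (hypothetical container)."""
    features: np.ndarray  # shape (n_atoms, d), one feature row per atom
    coords: np.ndarray    # shape (n_atoms, 3), one 3D position per atom


@dataclass
class GraphOfSets:
    """Blocks of varying size plus edges between block indices."""
    blocks: list  # list of Block
    edges: list   # list of (i, j) pairs


def pairwise_distances(a: Block, b: Block) -> np.ndarray:
    """Atom-level distance matrix between two blocks, shape (n_a, n_b)."""
    diff = a.coords[:, None, :] - b.coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def rigid_transform(g: GraphOfSets, q: np.ndarray, t: np.ndarray) -> GraphOfSets:
    """Apply x -> x Q^T + t to every atom; features are left unchanged."""
    moved = [Block(b.features, b.coords @ q.T + t) for b in g.blocks]
    return GraphOfSets(moved, g.edges)


rng = np.random.default_rng(0)

# Two blocks with different numbers of atoms (4 and 6) and 8-dim features.
residue = Block(rng.normal(size=(4, 8)), rng.normal(size=(4, 3)))
fragment = Block(rng.normal(size=(6, 8)), rng.normal(size=(6, 3)))
graph = GraphOfSets([residue, fragment], [(0, 1)])

# Random orthogonal matrix via QR (may be a rotation or a reflection,
# both of which belong to E(3)) and a random translation.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
t = rng.normal(size=3)
moved = rigid_transform(graph, q, t)

# Invariant quantity: inter-block atom distances do not change.
d_before = pairwise_distances(graph.blocks[0], graph.blocks[1])
d_after = pairwise_distances(moved.blocks[0], moved.blocks[1])

# Equivariant quantity: the centroid moves exactly like the atoms do.
centroid_before = graph.blocks[0].coords.mean(axis=0)
centroid_after = moved.blocks[0].coords.mean(axis=0)
```

Keeping every atom inside its block, rather than pooling a block into a single vector, is what allows atom-level quantities such as `d_before` to remain available to later layers.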
