Large Language-Geometry Model: When LLM meets Equivariance
Zongzhao Li, Jiacheng Cen, Bing Su, Wenbing Huang, Tingyang Xu, Yu Rong, Deli Zhao
TL;DR
This work tackles predicting 3D structures and dynamics while preserving $\mathrm{E}(3)$-equivariance by fusing Large Language Models with geometry-aware, equivariant graph processing in EquiLLM. The framework uses geometry-aware prompting to guide a frozen LLM as an invariant feature processor, while a lightweight Equivariant Encoder and Adaptor handle all directional 3D reasoning, ensuring $\mathrm{E}(3)$-equivariance throughout. Key contributions include a three-component architecture, task-specific geometry-aware prompts (task description, object features, statistics), and comprehensive validation on molecular dynamics (MD17), human motion, and antibody design (RAbD), achieving state-of-the-art results across several metrics. The approach demonstrates strong knowledge integration and generalizability for 3D physical tasks, reducing training costs by avoiding fine-tuning of the LLM and enabling broader scientific applications through modular design.
Abstract
Accurately predicting the 3D structures and dynamics of physical systems is crucial in scientific applications. Existing approaches that rely on geometric Graph Neural Networks (GNNs) effectively enforce $\mathrm{E}(3)$-equivariance, but they often fall short in leveraging broader external information. Large Language Models (LLMs), when applied directly, can incorporate external knowledge, but they lack the capability for spatial reasoning with guaranteed equivariance. In this paper, we propose EquiLLM, a novel framework for representing 3D physical systems that seamlessly integrates $\mathrm{E}(3)$-equivariance with LLM capabilities. Specifically, EquiLLM comprises four key components: geometry-aware prompting, an equivariant encoder, an LLM, and an equivariant adaptor. In essence, the LLM, guided by the instructive prompt, serves as a sophisticated invariant feature processor, while 3D directional information is handled exclusively by the equivariant encoder and adaptor modules. Experimental results demonstrate that EquiLLM delivers significant improvements over previous methods across molecular dynamics simulation, human motion simulation, and antibody design, highlighting its promising generalizability.
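The division of labor described in the abstract can be illustrated with a toy numerical check. The sketch below is not the EquiLLM implementation: `invariant_processor` is a hypothetical stand-in for the frozen LLM (here just a `tanh`), `equivariant_update` is an assumed, simplified adaptor-like step, and all function names and point coordinates are made up for illustration. It only shows why routing invariant quantities (pairwise distances) through an arbitrary processor, and keeping directional information in relative vectors, yields an output that commutes with any rotation, reflection, and translation.

```python
import math

def pairwise_distances(x):
    """Invariant features: distances do not change under rotation, reflection, or translation."""
    return [[math.dist(p, q) for q in x] for p in x]

def invariant_processor(d):
    """Hypothetical stand-in for the frozen LLM: it only ever sees invariant inputs."""
    return [[math.tanh(v) for v in row] for row in d]

def equivariant_update(x, w):
    """Assumed adaptor-like step: shift each point along relative vectors,
    scaled by the invariant weights w[i][j]."""
    out = []
    for i, p in enumerate(x):
        shift = [sum(w[i][j] * (p[k] - q[k]) for j, q in enumerate(x)) for k in range(3)]
        out.append([p[k] + shift[k] for k in range(3)])
    return out

def model(x):
    return equivariant_update(x, invariant_processor(pairwise_distances(x)))

def transform(x, R, t):
    """Apply g = (R, t) to every point: p -> R p + t."""
    return [[sum(R[k][m] * p[m] for m in range(3)) + t[k] for k in range(3)] for p in x]

# Orthogonal matrix: rotation by 0.7 rad about z combined with a flip of z (det = -1).
c, s = math.cos(0.7), math.sin(0.7)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, -1.0]]
t = [0.5, -1.0, 2.0]
x = [[0.0, 0.0, 0.0], [1.0, 0.2, -0.3], [-0.4, 0.9, 0.5], [0.3, -0.7, 1.1]]

predict_then_transform = transform(model(x), R, t)
transform_then_predict = model(transform(x, R, t))

# Largest coordinate-wise gap between the two orders of operation.
max_gap = max(
    abs(u - v)
    for pa, pb in zip(predict_then_transform, transform_then_predict)
    for u, v in zip(pa, pb)
)
equivariance_holds = max_gap < 1e-9
```

Because the processor never receives raw coordinates, it can be replaced by any function, including a large pretrained model, without affecting the equivariance check; only the encoder- and adaptor-like steps need to be built from relative vectors.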
