UniIF: Unified Molecule Inverse Folding

Zhangyang Gao, Jue Wang, Cheng Tan, Lirong Wu, Yufei Huang, Siyuan Li, Zhirui Ye, Stan Z. Li

TL;DR

UniIF addresses the general molecule inverse folding problem by unifying representations across proteins, RNAs, and materials into a frame-based block graph. It introduces a geometric featurizer and a Block Graph Attention Module that leverages long-range dependencies via virtual blocks while preserving efficiency with sparse connections. Across protein design, RNA design, and material design, UniIF achieves state-of-the-art or competitive results and demonstrates strong generalization in time-split setups. This unified approach offers a versatile framework for rapid design in drug discovery and materials science, reducing redundant efforts across molecule types.

Abstract

Molecule inverse folding has been a long-standing challenge in chemistry and biology, with the potential to revolutionize drug discovery and material science. Although specialized models have been proposed for different small molecules and macromolecules, few attempts have been made to unify the learning process, resulting in redundant effort. Complementary to recent advances in molecular structure prediction, such as RoseTTAFold All-Atom and AlphaFold3, we propose the unified model UniIF for the inverse folding of all molecules. We unify at two levels: 1) Data-Level: We propose a unified block graph data form for all molecules, including local frame building and geometric feature initialization. 2) Model-Level: We introduce a geometric block attention network, comprising geometric interaction, interactive attention, and virtual long-term dependency modules, to capture the 3D interactions of all molecules. Through comprehensive evaluations across tasks such as protein design, RNA design, and material design, we demonstrate that our proposed method surpasses state-of-the-art methods on all tasks. UniIF offers a versatile and effective solution for general molecule inverse folding.
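As a rough illustration of the data-level "local frame building" step, the sketch below builds an orthonormal frame for one amino-acid block from its N, CA, and C backbone atoms using Gram-Schmidt orthogonalization. This is a common construction in structure models and is assumed here for illustration; the abstract does not specify the exact frame definition UniIF uses, and the function names `local_frame` and `to_local` are hypothetical.

```python
import numpy as np

def local_frame(n, ca, c):
    """Return (R, t) for one block: R has orthonormal axes as columns, t is the origin.

    Assumption (not from the paper): origin at CA, first axis along CA->C,
    second axis from CA->N after Gram-Schmidt, third axis by cross product.
    """
    v1 = c - ca
    v2 = n - ca
    e1 = v1 / np.linalg.norm(v1)
    u2 = v2 - np.dot(v2, e1) * e1      # remove the component along e1
    e2 = u2 / np.linalg.norm(u2)
    e3 = np.cross(e1, e2)              # completes a right-handed frame
    R = np.stack([e1, e2, e3], axis=1)
    return R, ca

def to_local(R, t, x):
    """Express a global point x in the block's local coordinates."""
    return R.T @ (x - t)
```

Coordinates expressed this way do not change when the whole molecule is rotated or translated, which is why frame-relative features are a natural fit for a geometric featurizer.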

Paper Structure (40 sections, 11 equations, 7 figures, 3 tables)


Figures (7)

  • Figure 1: Unified molecule inverse folding.
  • Figure 2: The overall framework. (1) The model treats all types of molecules as block graphs. For macromolecules, we use predefined frames based on amino acids and nucleotides; for small molecules, we learn the local frame of each block with a one-layer GNN. (2) A geometric featurizer initializes the geometric node and edge features. (3) We propose the block graph attention layer, on which we build the block graph neural network to learn expressive block representations. (4) Finally, we show that UniIF achieves competitive results on diverse tasks, including protein design, RNA design, and material design.
  • Figure 3: Blocks of different molecules. The basic building blocks include amino acids, nucleotides and atoms.
  • Figure 4: Unified molecule inverse folding.
  • Figure 5: Block Graph Attention Module. (a) Virtual Block for Long-term Dependencies. (b) Geometric Interaction Extractor for learning pairwise features. (c) Gated Edge Attention for updating node features.
  • ...and 2 more figures
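
To make the idea of sparse connections plus a virtual block (Figure 5a) more concrete, here is a minimal sketch. It assumes a k-nearest-neighbour block graph and a single virtual block linked to every real block in both directions; the captions above do not state how many virtual blocks UniIF uses or how they are wired, and the names `knn_edges` and `add_virtual_block` are hypothetical.

```python
import numpy as np

def knn_edges(coords, k):
    """Directed edges (src, dst): each block receives from its k nearest blocks."""
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)           # exclude self-loops
    k = min(k, n - 1)
    nbrs = np.argsort(dist, axis=1)[:, :k]   # shape (n, k)
    dst = np.repeat(np.arange(n), k)
    src = nbrs.reshape(-1)
    return np.stack([src, dst])

def add_virtual_block(edges, n):
    """Append one virtual block (index n) connected to all n real blocks.

    Assumption: a single global virtual block; any two real blocks are then
    at most two hops apart even though the real graph stays sparse.
    """
    real = np.arange(n)
    virt = np.full(n, n)
    to_virtual = np.stack([real, virt])
    from_virtual = np.stack([virt, real])
    return np.concatenate([edges, to_virtual, from_virtual], axis=1)
```

Under these assumptions the edge count is n*k + 2n rather than the n*(n-1) of a fully connected graph, which is the efficiency argument for pairing sparse local edges with a virtual block.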