RIGA-Fold: A General Framework for Protein Inverse Folding via Recurrent Interaction and Geometric Awareness
Sisi Yuan, Jiehuang Chen, Junchuang Cai, Dong Xu, Xueliang Li, Zexuan Zhu, Junkai Ji
TL;DR
RIGA-Fold tackles protein inverse folding by fusing geometry-aware learning with evolutionary priors to overcome two limits of existing methods: local receptive fields and single-pass inference. The framework introduces a Geometric Attention Update in which edge features serve as attention keys (Edge-as-Key) and a Global Context Bridge that captures long-range dependencies. An enhanced variant, RIGA-Fold*, adds dual-stream priors from ESM-2 and ESM-IF in a cascaded recycling loop. Empirical results on CATH 4.2, TS50, and TS500 demonstrate strong sequence recovery and structural consistency, with RIGA-Fold* achieving state-of-the-art performance. The approach offers a practical path toward robust de novo protein design by integrating geometric, semantic, and iterative refinement components, albeit with higher inference latency due to recycling.
Abstract
Protein inverse folding, the task of predicting amino acid sequences for desired structures, is pivotal for de novo protein design. However, existing GNN-based methods typically suffer from restricted receptive fields that miss long-range dependencies and a "single-pass" inference paradigm that leads to error accumulation. To address these bottlenecks, we propose RIGA-Fold, a framework that synergizes Recurrent Interaction with Geometric Awareness. At the micro-level, we introduce a Geometric Attention Update (GAU) module where edge features explicitly serve as attention keys, ensuring strictly SE(3)-invariant local encoding. At the macro-level, we design an attention-based Global Context Bridge that acts as a soft gating mechanism to dynamically inject global topological information. Furthermore, to bridge the gap between structural and sequence modalities, we introduce an enhanced variant, RIGA-Fold*, which integrates trainable geometric features with frozen evolutionary priors from ESM-2 and ESM-IF via a dual-stream architecture. Finally, a biologically inspired "predict-recycle-refine" strategy is implemented to iteratively denoise sequence distributions. Extensive experiments on CATH 4.2, TS50, and TS500 benchmarks demonstrate that our geometric framework is highly competitive, while RIGA-Fold* significantly outperforms state-of-the-art baselines in both sequence recovery and structural consistency.
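To make the edge-as-key idea concrete, the following is a minimal NumPy sketch, not the GAU module itself. It assumes that queries come from node features, keys from edge features built only from inter-residue distances, and values from neighbor node features. The function names, the k-nearest-neighbor graph, the radial-basis distance encoding, and all dimensions are illustrative assumptions that the abstract does not specify. Because the keys depend only on distances, the output should be unchanged when the backbone coordinates are rotated and translated, which the last lines check numerically.

```python
import numpy as np

def knn_edges(x, k, n_rbf=8, sigma=2.5):
    """Build a k-NN graph and distance-only (hence SE(3)-invariant) edge features.
    The RBF encoding is an assumption made for this sketch."""
    d = np.linalg.norm(x[:, None] - x[None], axis=-1)      # (N, N) pairwise distances
    np.fill_diagonal(d, np.inf)                            # exclude self-edges
    nbr = np.argsort(d, axis=1)[:, :k]                     # (N, k) neighbor indices
    dist = np.take_along_axis(d, nbr, axis=1)              # (N, k) neighbor distances
    centers = np.linspace(0.0, 20.0, n_rbf)
    e = np.exp(-((dist[..., None] - centers) ** 2) / (2 * sigma ** 2))  # (N, k, n_rbf)
    return nbr, e

def edge_as_key_attention(h, e, nbr, Wq, Wk, Wv):
    """Attention over each node's neighbors with edge features as keys.
    h: (N, d) node features, e: (N, k, d_e) edge features, nbr: (N, k) indices."""
    q = h @ Wq                                             # queries from nodes
    key = e @ Wk                                           # keys from edges
    v = h[nbr] @ Wv                                        # values from neighbor nodes (assumed)
    scores = np.einsum("nd,nkd->nk", q, key) / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=1, keepdims=True)    # numerically stable softmax
    w = np.exp(scores)
    w = w / w.sum(axis=1, keepdims=True)
    return np.einsum("nk,nkd->nd", w, v), w

rng = np.random.default_rng(0)
N, k, d, d_e, d_a = 12, 4, 16, 8, 16
x = rng.normal(size=(N, 3)) * 5.0                          # toy C-alpha coordinates
h = rng.normal(size=(N, d))                                # coordinate-independent node features
Wq, Wk, Wv = (rng.normal(size=s) for s in [(d, d_a), (d_e, d_a), (d, d_a)])

nbr, e = knn_edges(x, k)
out, w = edge_as_key_attention(h, e, nbr, Wq, Wk, Wv)

# Apply a random proper rotation plus translation and recompute.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1                                          # force det = +1 (no reflection)
x_rot = x @ Q.T + rng.normal(size=3)
nbr_rot, e_rot = knn_edges(x_rot, k)
out_rot, _ = edge_as_key_attention(h, e_rot, nbr_rot, Wq, Wk, Wv)
print(np.allclose(out, out_rot, atol=1e-8))
```

In this sketch the invariance follows from the choice of inputs rather than from the attention mechanism: any edge feature that depends on absolute coordinates or on an unaligned global frame would break it.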
