SurfPro: Functional Protein Design Based on Continuous Surface
Zhenqiao Song, Tinglin Huang, Lei Li, Wengong Jin
TL;DR
SurfPro is a surface-to-sequence design framework that generates functional proteins from a continuous protein surface, modeling its geometry and biochemistry jointly. A hierarchical encoder (local, then global FAMHA) captures surface shape and biochemical features, and an autoregressive Transformer decoder predicts the amino acid sequence. On CATH 4.2 inverse folding, SurfPro reaches a recovery rate of 57.78%, surpassing previous state-of-the-art inverse folding methods; on two functional design tasks (binder and enzyme design), it achieves higher success rates in terms of protein-protein binding and enzyme-substrate interaction scores. Pretraining on PDB surfaces further improves performance. The approach enables end-to-end functional protein design from surface cues, offering a scalable route to rapid protein engineering and discovery.
Abstract
How can we design proteins with desired functions? We are motivated by the chemical intuition that both geometric structure and biochemical properties are critical to a protein's function. In this paper, we propose SurfPro, a new method that generates functional proteins given a desired surface and its associated biochemical properties. SurfPro comprises a hierarchical encoder, which progressively models the geometric shape and biochemical features of a protein surface, and an autoregressive decoder, which produces an amino acid sequence. We evaluate SurfPro on a standard inverse folding benchmark, CATH 4.2, and on two functional protein design tasks: protein binder design and enzyme design. SurfPro consistently surpasses previous state-of-the-art inverse folding methods, achieving a recovery rate of 57.78% on CATH 4.2 and higher success rates in terms of protein-protein binding and enzyme-substrate interaction scores.
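
In inverse folding, the recovery rate is conventionally the fraction of positions at which the designed residue matches the native residue. The following is a minimal sketch of that computation, assuming the two sequences are already aligned and of equal length. The function name `recovery_rate` and the example sequences are hypothetical, and this is not the authors' evaluation code. How per-protein values are aggregated over the benchmark is not stated in the abstract.

```python
def recovery_rate(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumes both sequences are aligned and of equal, non-zero length
    (an assumption of this sketch, not a detail taken from the paper).
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Count position-wise matches between the two sequences.
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Hypothetical 8-residue example: 6 of 8 positions match.
print(recovery_rate("MKTAYIAK", "MKTGYIAR"))  # 0.75
```

A benchmark-level figure such as the 57.78% reported for CATH 4.2 would come from applying a per-sequence measure like this across the test set.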
