Reaction-conditioned De Novo Enzyme Design with GENzyme

Chenqing Hua, Jiarui Lu, Yong Liu, Odin Zhang, Jian Tang, Rex Ying, Wengong Jin, Guy Wolf, Doina Precup, Shuangjia Zheng

TL;DR

An end-to-end, three-stage model that integrates a catalytic pocket generation and sequence co-design module, a pocket inpainting and enzyme inverse folding module, and a binding and screening module to optimize and predict enzyme-substrate complexes.

Abstract

The introduction of models like RFDiffusionAA, AlphaFold3, AlphaProteo, and Chai1 has revolutionized protein structure modeling and interaction prediction, primarily from a binding perspective that focuses on creating ideal lock-and-key models. However, these methods can fall short for enzyme-substrate interactions, where perfect binding models are rare and induced-fit states are more common. To address this, we shift to a functional perspective for enzyme design, where the enzyme function is defined by the reaction it catalyzes. Here, we introduce GENzyme, a de novo enzyme design model that takes a catalytic reaction as input and generates the catalytic pocket, full enzyme structure, and enzyme-substrate binding complex. GENzyme is an end-to-end, three-stage model that integrates (1) a catalytic pocket generation and sequence co-design module, (2) a pocket inpainting and enzyme inverse folding module, and (3) a binding and screening module to optimize and predict enzyme-substrate complexes. The entire design process is driven by the catalytic reaction being targeted. This reaction-first approach allows for more accurate and biologically relevant enzyme design, potentially surpassing structure-based and binding-focused models in creating enzymes capable of catalyzing specific reactions. We provide GENzyme code at https://github.com/WillHua127/GENzyme.
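As an illustration only, the three-stage data flow described in the abstract can be sketched in Python as below. Every name here (`Reaction`, `Pocket`, `Enzyme`, `Complex`, `generate_pocket`, `inpaint_enzyme`, `bind_substrate`, `design`) is hypothetical and is not taken from the GENzyme repository; the stage bodies are placeholders that only show what each stage consumes and produces. The `>>` separator is the usual reaction-SMILES convention, assumed here as the input format, and the full-enzyme length of 256 is an arbitrary placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    substrate: str  # substrate SMILES
    product: str    # product SMILES


@dataclass(frozen=True)
class Pocket:
    reaction: Reaction
    sequence: str   # placeholder pocket sequence


@dataclass(frozen=True)
class Enzyme:
    pocket: Pocket
    sequence: str   # placeholder full-enzyme sequence


@dataclass(frozen=True)
class Complex:
    enzyme: Enzyme
    substrate: str  # substrate placed in the pocket


def parse_reaction(text: str, arrow: str = ">>") -> Reaction:
    """Split 'substrate>>product' into its two SMILES strings."""
    substrate, sep, product = text.partition(arrow)
    if not sep or not substrate.strip() or not product.strip():
        raise ValueError(f"expected 'substrate{arrow}product', got {text!r}")
    return Reaction(substrate.strip(), product.strip())


def generate_pocket(reaction: Reaction, n_residues: int = 64) -> Pocket:
    # Stage 1 (placeholder): pocket generation and sequence co-design.
    return Pocket(reaction, "X" * n_residues)


def inpaint_enzyme(pocket: Pocket, n_total: int = 256) -> Enzyme:
    # Stage 2 (placeholder): pocket inpainting and inverse folding.
    # n_total is an arbitrary illustrative length, not a value from the paper.
    padding = "X" * max(0, n_total - len(pocket.sequence))
    return Enzyme(pocket, pocket.sequence + padding)


def bind_substrate(enzyme: Enzyme) -> Complex:
    # Stage 3 (placeholder): binding and screening.
    return Complex(enzyme, enzyme.pocket.reaction.substrate)


def design(reaction_text: str) -> Complex:
    """Chain the three stages: reaction -> pocket -> enzyme -> complex."""
    reaction = parse_reaction(reaction_text)
    return bind_substrate(inpaint_enzyme(generate_pocket(reaction)))
```

Calling `design("OCC(=O)C(=O)O>>OC[C@H](C(=O)O)O")` with the reaction from Figure 3 returns a `Complex` whose fields trace back through the enzyme and pocket to the input reaction, which is the only property this sketch is meant to show.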

Paper Structure

This paper contains 34 sections, 21 equations, 15 figures, 3 tables.

Figures (15)

  • Figure 1: (A) Enzyme-substrate interaction mechanism. The substrate molecule binds to the enzyme catalytic pocket, where it undergoes catalysis and is converted into product molecules. (B) The overarching goal of GENzyme for de novo enzyme design. Starting from a catalytic reaction, it generates the catalytic pocket, complete enzyme structure, and the enzyme-substrate complex.
  • Figure 2: De novo Enzyme Design with GENzyme. (A) Pocket-substrate and pocket-product complexes generated by GENzyme, with the substrate molecule shown in cyan and the product molecule shown in gray. The positions of catalytic regions, substrate, and product conformations remain mostly unchanged after catalysis. (B) Reaction-conditioned iterative catalytic pocket generation with GENzyme. The catalytic pocket at $t=1$ (consisting of $64$ residues) is generated progressively from SE(3) noise initialized at $t=0$. Additional visualizations are available in App. \ref{app:alphaenzyme.pocket.design}. (C) Catalytic pocket inpainting with GENzyme, where the generated enzymes achieve high pTM scores, indicating structural quality and alignment with desired functions.
  • Figure 3: (A) Reaction-conditioned enzyme design with GENzyme. In this example, the design process is conditioned on the reaction OCC(=O)C(=O)O $\Rightarrow$ OC[C@H](C(=O)O)O. GENzyme designs enzymes by first ① generating catalytic pockets and ② co-designing the pocket sequence. Next, it ③ inpaints the catalytic pocket to complete the full enzyme structure, with ④ the generation of the enzyme sequence. Finally, ⑤ the substrate conformation binds to the catalytic pocket of the full enzyme, and the ideal lock-and-key enzyme-substrate complex is predicted. (B) GENzyme Pre-training Phase, in which each module is trained individually on catalytic pockets and full enzyme structures. (C) GENzyme Fine-tuning Phase, where the model undergoes end-to-end training with slight perturbations applied to input pockets and enzymes.
  • Figure 4: GENzyme de novo Enzyme Design Example.
  • Figure 5: t-SNE visualization of catalytic pocket embeddings generated by ESM3. Red denotes ground-truth catalytic pockets, yellow denotes RFDiffusionAA-generated pockets, blue denotes GENzyme-generated pockets, and green denotes EnzymeFlow-generated pockets.
  • ...and 10 more figures
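The Figure 2(B) caption mentions that the 64-residue pocket is generated from SE(3) noise at $t=0$. As a minimal sketch of what such an initialization could look like, the Python snippet below samples one rigid frame per residue: a uniformly random rotation (as a unit quaternion) and a Gaussian translation. The frame representation, the function names `random_unit_quaternion` and `sample_se3_noise`, and the `scale` parameter are assumptions made for illustration; the excerpt does not specify how GENzyme parameterizes or samples its noise.

```python
import math
import random


def random_unit_quaternion(rng: random.Random) -> tuple:
    """Sample a rotation uniformly at random, returned as a unit quaternion.

    Normalizing a 4D standard-normal vector gives a uniform point on the
    3-sphere, which corresponds to a uniformly distributed rotation.
    """
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(x * x for x in q))
        if norm > 1e-12:  # guard against a degenerate near-zero sample
            return tuple(x / norm for x in q)


def sample_se3_noise(n_residues: int = 64, scale: float = 1.0, seed=None) -> list:
    """Return n_residues (rotation, translation) pairs as initial noise.

    rotation: unit quaternion (4 floats); translation: 3 Gaussian floats
    with standard deviation `scale` (an assumed, illustrative parameter).
    """
    rng = random.Random(seed)
    frames = []
    for _ in range(n_residues):
        rotation = random_unit_quaternion(rng)
        translation = tuple(rng.gauss(0.0, scale) for _ in range(3))
        frames.append((rotation, translation))
    return frames
```

A generative model of this kind would then move these frames step by step toward a plausible pocket as $t$ goes from 0 to 1; that learned update is not shown here.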