EnzyControl: Adding Functional and Substrate-Specific Control for Enzyme Backbone Generation

Chao Song, Zhiyuan Liu, Han Huang, Liang Wang, Qiong Wang, Jianyu Shi, Hui Yu, Yihang Zhou, Yang Zhang

TL;DR

EnzyControl tackles substrate-specific enzyme backbone design by integrating evolutionarily conserved functional motifs learned via multiple sequence alignment with substrate-aware conditioning. It builds on a motif-scaffolding base network and introduces EnzyAdapter, a lightweight cross-attention module, trained via a two-stage regime using LoRA to achieve stable, efficient learning. A curated EnzyBind dataset of $11{,}100$ enzyme–substrate pairs enables rigorous structural and functional benchmarking, where EnzyControl attains state-of-the-art designability ($0.7160$), EC match rate ($0.504$), and notable $k_{cat}$ gains, plus improvements in binding metrics on EnzyBench. Zero-shot experiments and residue-length analysis demonstrate robust generalization and practical design benefits, highlighting the method’s potential for substrate-tuned enzyme engineering, while acknowledging backbone-only limitations and pointing to docking- and multimeric extensions as future directions.

Abstract

Designing enzyme backbones with substrate-specific functionality is a critical challenge in computational protein engineering. Current generative models excel in protein design but face limitations in binding data, substrate-specific control, and flexibility for de novo enzyme backbone generation. To address this, we introduce EnzyBind, a dataset with 11,100 experimentally validated enzyme-substrate pairs specifically curated from PDBbind. Building on this, we propose EnzyControl, a method that enables functional and substrate-specific control in enzyme backbone generation. Our approach generates enzyme backbones conditioned on MSA-annotated catalytic sites and their corresponding substrates, which are automatically extracted from curated enzyme-substrate data. At the core of EnzyControl is EnzyAdapter, a lightweight, modular component integrated into a pretrained motif-scaffolding model, allowing it to become substrate-aware. A two-stage training paradigm further refines the model's ability to generate accurate and functional enzyme structures. Experiments show that EnzyControl achieves the best performance across structural and functional metrics on the EnzyBind and EnzyBench benchmarks, with particularly notable improvements of 13% in designability and 13% in catalytic efficiency compared to the baseline models. The code is released at https://github.com/Vecteur-libre/EnzyControl.
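
To make the idea of a substrate-aware adapter concrete, here is a minimal NumPy sketch of generic cross-attention in which residue embeddings attend over substrate atom embeddings. It is not the authors' EnzyAdapter: the function name, tensor shapes, single attention head, residual connection, and zero-initialized output projection are all illustrative assumptions.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_adapter(residues, substrate, w_q, w_k, w_v, w_o):
    """Hypothetical adapter: residues (n_res, d) query substrate (n_atoms, d_s)."""
    q = residues @ w_q                       # (n_res, d_h) queries from the backbone
    k = substrate @ w_k                      # (n_atoms, d_h) keys from the substrate
    v = substrate @ w_v                      # (n_atoms, d_h) values from the substrate
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (n_res, n_atoms)
    # Residual update: the base features pass through unchanged when w_o is zero.
    return residues + (attn @ v) @ w_o, attn


# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
n_res, n_atoms, d, d_s, d_h = 6, 4, 8, 5, 8
residues = rng.normal(size=(n_res, d))
substrate = rng.normal(size=(n_atoms, d_s))
w_q = rng.normal(size=(d, d_h))
w_k = rng.normal(size=(d_s, d_h))
w_v = rng.normal(size=(d_s, d_h))
w_o = np.zeros((d_h, d))  # assumed zero init: the adapter starts as an identity map

updated, attn = cross_attention_adapter(residues, substrate, w_q, w_k, w_v, w_o)
```

The zero-initialized output projection is one common way to attach a new module to a pretrained network without disturbing its behavior at the start of training. Whether EnzyControl uses this particular initialization is not stated in the text above.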

Paper Structure

This paper contains 28 sections, 8 equations, 16 figures, 11 tables, 1 algorithm.

Figures (16)

  • Figure 1: Dataset collection and preprocessing.
  • Figure 2: EC distribution.
  • Figure 3: EnzyControl is a flexible approach for the conditional backbone generation of enzymes. (a) Feature Initialization involves obtaining initial node embeddings and edge embeddings, constructing initial frames for enzyme backbones, and initializing pretrained features for substrates. (b) Single-layer structure prediction network with EnzyAdapter.
  • Figure 4: Two-stage training paradigm.
  • Figure 5: Comparison of $k_{cat}$ distribution.
  • ...and 11 more figures