ProteinWeaver: A Divide-and-Assembly Approach for Protein Backbone Design

Yiming Ma, Fei Ye, Yi Zhou, Zaixiang Zheng, Dongyu Xue, Quanquan Gu

TL;DR

ProteinWeaver, a two-stage framework for protein backbone design that first generates individual protein domains and then employs an SE(3) diffusion model to flexibly assemble these domains, advances protein engineering and opens new avenues for functional protein design.

Abstract

Nature creates diverse proteins through a 'divide and assembly' strategy. Inspired by this idea, we introduce ProteinWeaver, a two-stage framework for protein backbone design. Our method first generates individual protein domains and then employs an SE(3) diffusion model to flexibly assemble these domains. A key challenge lies in the assembling step, given the complex and rugged nature of the inter-domain interaction landscape. To address this challenge, we employ preference alignment to discern complex relationships between structure and interaction landscapes through comparative analysis of generated samples. Comprehensive experiments demonstrate that ProteinWeaver: (1) generates high-quality, novel protein backbones through versatile domain assembly; (2) outperforms RFdiffusion, the current state-of-the-art in backbone design, by 13% and 39% for long-chain proteins; (3) shows the potential for cooperative function design through illustrative case studies. To sum up, by introducing a 'divide-and-assembly' paradigm, ProteinWeaver advances protein engineering and opens new avenues for functional protein design.

Paper Structure

This paper contains 60 sections, 17 equations, 12 figures, 8 tables, 1 algorithm.

Figures (12)

  • Figure 1: Overview of ProteinWeaver. (A) An illustration of the 'divide-and-assembly' approach in native protein evolution, which enhances cooperative function design. The pictures are adapted from aziz2021evolution. (B) ProteinWeaver emulates natural strategies to create protein backbones. (C) ProteinWeaver is a backbone diffusion model. (D) The inter-domain structure-interaction landscape is complex and rugged: minor structural modifications can lead to significant changes in interactions. The preference alignment technique helps navigate this landscape effectively. (E) Existing methods struggle with long-chain backbone design, whereas ProteinWeaver demonstrates a considerable advantage. (F) A radar chart illustrates ProteinWeaver's overall performance in long-sequence backbone design. Inter-domain quality is evaluated using interface scTM metrics.
  • Figure 2: ProteinWeaver employs a two-stage 'divide-and-assembly' framework, first generating individual protein domains and then using an SE(3) diffusion model to flexibly assemble these domains. $\bar{\mathbf{S}}$ represents isolated domains undergoing internal structural modifications for assembly into integrated backbones.
  • Figure 3: ProteinWeaver enables high-quality backbone design by assembling domains from diverse sources. (A) Backbone and interface quality estimation of native domain assembly. (B) Backbone and interface quality estimation of synthesized domain assembly. (C) Case studies showing the diverse assembled domains. The designed backbone and the refolded backbones (grey) are aligned with the assembled backbones (green and blue indicate different domains). The evaluation was conducted without employing the best-of-three filter.
  • Figure 4: ProteinWeaver shows strong capacity in designing novel and high-quality backbones with significant improvement, particularly in long-chain structures.
  • Figure 5: Case studies showing ProteinWeaver potentially enables cooperative function design through the assembly of assigned proteins.
  • ...and 7 more figures