ProtPainter: Draw or Drag Protein via Topology-guided Diffusion

Zhengxi Lu, Shizhuo Cheng, Yuru Jiang, Yan Zhang, Min Zhang

TL;DR

ProtPainter addresses the challenge of flexible, precise topology control in protein backbone generation by conditioning diffusion-based synthesis on 3D curves. It introduces a two-stage pipeline: CurveEncoder-based curve sketching that annotates curves with secondary structure element (SSE) labels, followed by sketch-guided backbone sampling using a DDPM with Helix-Gating to modulate fusion strength according to helix content, aided by RoseTTAFold guidance for translational consistency. The work contributes a CurveEncoder, a retraining-free guided sampling strategy, and a topology-focused benchmark including the Protein Restoration Task and the scTF metric, demonstrating superior topology fidelity and designability (scTM > 0.5, scTF > 0.8 in many cases) and enabling drawing/dragging operations for curve-driven design. Across experiments, ProtPainter outperforms unconditional and prior topology-conditioned baselines on topology fidelity and designability, while enabling curve-based editing, hinge designs, and motif scaffolding with practical downstream relevance. The approach promises more natural topology-space navigation for protein design, with implications for binder design and multi-state engineering, albeit with current inference-time limitations to be addressed in future work.

Abstract

Recent advances in protein backbone generation have achieved promising results under structural, functional, or physical constraints. However, existing methods lack the flexibility for precise topology control, limiting navigation of the backbone space. We present ProtPainter, a diffusion-based approach for generating protein backbones conditioned on 3D curves. ProtPainter follows a two-stage process: curve-based sketching and sketch-guided backbone generation. For the first stage, we propose CurveEncoder, which predicts secondary structure annotations from a curve to parametrize sketch generation. For the second stage, the sketch guides the generative process in Denoising Diffusion Probabilistic Modeling (DDPM) to generate backbones. During this process, we further introduce a fusion scheduling scheme, Helix-Gating, to control the scaling factors. For evaluation, we propose the first benchmark for topology-conditioned protein generation, introducing the Protein Restoration Task and a new metric, self-consistency Topology Fitness (scTF). Experiments demonstrate ProtPainter's ability to generate topology-fit (scTF > 0.8) and designable (scTM > 0.5) backbones, with drawing and dragging tasks showcasing its flexibility and versatility.
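
To make the idea of a parametric sketch concrete, the following is a minimal sketch, not the paper's parametrization, which is not given here. It places the Cα atoms of an ideal right-handed α-helix along a straight segment using textbook geometry (1.5 Å rise, 2.3 Å radius, 100° twist per residue). The function name `helix_ca_trace` and the straight-axis simplification are assumptions made for illustration; a real sketch would follow a curved input.

```python
import numpy as np

# Textbook alpha-helix geometry for C-alpha atoms.
RISE_PER_RESIDUE = 1.5      # angstroms advanced along the axis per residue
RADIUS = 2.3                # angstroms from the helix axis to C-alpha
TWIST = np.deg2rad(100.0)   # rotation per residue (3.6 residues per turn)


def helix_ca_trace(start, end):
    """Place C-alpha atoms of an ideal helix along the segment start -> end.

    Hypothetical helper: the residue count is chosen so that the helix
    spans the segment at 1.5 A per residue.
    """
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    axis = end - start
    length = np.linalg.norm(axis)
    if length == 0:
        raise ValueError("start and end must differ")
    axis = axis / length
    n_res = max(int(round(length / RISE_PER_RESIDUE)) + 1, 2)

    # Build two unit vectors perpendicular to the axis; (u, v, axis) is a
    # right-handed frame, so the resulting helix is right-handed.
    helper = np.array([1.0, 0.0, 0.0])
    if abs(axis @ helper) > 0.9:
        helper = np.array([0.0, 1.0, 0.0])
    u = np.cross(axis, helper)
    u = u / np.linalg.norm(u)
    v = np.cross(axis, u)

    i = np.arange(n_res)[:, None]  # residue index as a column vector
    return (start
            + i * RISE_PER_RESIDUE * axis
            + RADIUS * (np.cos(i * TWIST) * u + np.sin(i * TWIST) * v))


coords = helix_ca_trace([0, 0, 0], [0, 0, 15])
print(coords.shape)
```

With these parameters, consecutive Cα atoms end up about 3.8 Å apart, which matches the spacing seen in real backbones and is a quick sanity check on the geometry.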

Paper Structure

This paper contains 55 sections, 23 equations, 16 figures, 9 tables, 1 algorithm.

Figures (16)

  • Figure 1: Architecture. Sketching: given a 3D curve, $\text{SSE}_\text{curve}$ is predicted by CurveEncoder. Then a naive sketch is generated parametrically. Guided Sampling: the sketch is fused into a diffusion sampling process with the guidance of RoseTTAFold and Helix-Gating interpolation.
  • Figure 2: Sketch Fusion Scheduling with Helix-Gating. a. Helix-Gating splits the sampling process into two phases by comparing the helix percentage of $\hat{z}_0^t$ and $y$, enabling the scheduling of fusion. b. The curve space trajectories of different diffusion sampling processes.
  • Figure 3: Examples of ProtPainter on de novo protein design, binder design, and motif scaffolding. From left to right, the original structures are 6s9l, 1tqg, 7f4d_MR, 7f4d_GB, and 103l. Curves are visualized in 3D space. Other examples are shown in the gallery figure.
  • Figure 4: Draw and edit process. (a) Structures are visualized in the MDS topology space, with their colors corresponding to respective operations. Novelty, shown in the upper left corner, is measured as the maximum TM score relative to the PDB (Burley et al., 2023; Berman et al., 2003). (b) Case study for actions of dragging, SSE editing, comprehensive tasks like hinge protein, and jointing.
  • Figure 5: scTF vs scTM and scRMSD. Panels from left to right show the test results on the HHH_ems, med, and GPCR datasets, respectively. The first row shows the relationship between scTM and scTF, and the second row the relationship between scRMSD and scTF. The results have not been filtered by selection. $N_\text{curve}=50,N_\text{bb}=10,N_\text{seq}=8$.
  • ...and 11 more figures
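
The Figure 2 caption describes Helix-Gating as splitting sampling into two phases by comparing the helix percentage of $\hat{z}_0^t$ with that of the sketch $y$. The snippet below is one possible reading of that rule, offered as a rough sketch rather than the authors' implementation. The function names, the direction of the comparison, and the weights 0.8 and 0.2 are all assumptions; only the ideas of comparing helix content and interpolating with the sketch come from the captions.

```python
import numpy as np


def helix_fraction(sse_labels):
    """Fraction of residues labelled 'H' (helix) in an SSE string."""
    return sse_labels.count("H") / len(sse_labels) if sse_labels else 0.0


def gated_weight(pred_sse, sketch_sse, w_early=0.8, w_late=0.2):
    """Pick a fusion weight by comparing helix content.

    ASSUMPTION: fuse strongly while the predicted structure has less helix
    than the sketch, and weakly afterwards. The weights and the direction
    of the comparison are illustrative, not taken from the paper.
    """
    if helix_fraction(pred_sse) < helix_fraction(sketch_sse):
        return w_early
    return w_late


def fuse(z0_hat, y, weight):
    """Linearly interpolate between the predicted backbone and the sketch."""
    return (1.0 - weight) * np.asarray(z0_hat) + weight * np.asarray(y)


# Toy usage: one C-alpha coordinate, prediction not yet helical enough.
w = gated_weight("CCCCCCCC", "HHHHCCCC")
print(w, fuse([[0.0, 0.0, 0.0]], [[10.0, 0.0, 0.0]], w))
```

In this reading, the gate switches once the predicted helix fraction reaches that of the sketch, which gives the two phases mentioned in the caption; other schedules would fit the same description.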