
DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing

Yangtian Zhang, Zuobai Zhang, Bozitao Zhong, Sanchit Misra, Jian Tang

TL;DR

DiffPack tackles protein side-chain packing on a fixed backbone by modeling the joint distribution of the side-chain torsional angles with a diffusion process in torsion space, generating $\chi_1$–$\chi_4$ autoregressively to avoid the cumulative distortions that arise when all four angles are perturbed at once. It uses an SE(3)-invariant GearNet-Edge architecture for score estimation and adds inference-time enhancements (multi-round sampling, annealed temperature, confidence-based selection) to improve sample quality. Empirically, DiffPack achieves state-of-the-art angle accuracy on CASP13 and CASP14 (improvements of $11.9\%$ and $13.5\%$, respectively) with roughly $60\times$ fewer parameters, and it also improves AlphaFold2 side-chain predictions, which supports its use in protein design and refinement. Ablations and case studies further show better chemical validity and the ability to capture complex interactions such as $\pi$ stacking.
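The autoregressive ordering described above can be sketched in a few lines. This is only an illustration of the control flow, not the authors' implementation: `toy_score` is a hypothetical stand-in for the learned score network (it simply pulls each angle toward a known target), the reverse process is simplified to a noise-free update, and every residue is assumed to have all four angles, which is not true of real proteins.

```python
import numpy as np

def wrap_angle(x):
    """Map angles in radians onto the interval [-pi, pi)."""
    return (x + np.pi) % (2 * np.pi) - np.pi

def toy_score(chi_k, fixed_angles, target_k):
    """Hypothetical stand-in for a learned score network.

    A real model would condition on the backbone and on `fixed_angles`
    (the angles already generated); this toy version ignores them and
    returns the wrapped direction from chi_k toward target_k.
    """
    return wrap_angle(target_k - chi_k)

def pack_autoregressively(targets, n_steps=50, step_size=0.2, rng=None):
    """Generate chi_1..chi_4 one angle index at a time for all residues."""
    if rng is None:
        rng = np.random.default_rng(0)
    n_res, n_chi = targets.shape
    chi = np.zeros_like(targets)
    for k in range(n_chi):  # chi_1 first, chi_4 last
        # Start each angle from a uniform sample on the circle.
        chi_k = rng.uniform(-np.pi, np.pi, size=n_res)
        for _ in range(n_steps):
            # Simplified, noise-free denoising step followed by wrapping.
            step = step_size * toy_score(chi_k, chi[:, :k], targets[:, k])
            chi_k = wrap_angle(chi_k + step)
        chi[:, k] = chi_k  # freeze this angle before moving to the next one
    return chi

# Usage: 5 residues, 4 torsion angles each, with arbitrary target values.
targets = np.random.default_rng(1).uniform(-np.pi, np.pi, size=(5, 4))
packed = pack_autoregressively(targets)
```

The point of the outer loop is that $\chi_{k}$ is only sampled once $\chi_1, \dots, \chi_{k-1}$ are fixed, so noise on an angle close to the backbone never displaces atoms whose positions depend on angles generated later.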

Abstract

Proteins play a critical role in carrying out biological functions, and their 3D structures are essential in determining their functions. Accurately predicting the conformation of protein side-chains given their backbones is important for applications in protein structure prediction, design and protein-protein interactions. Traditional methods are computationally intensive and have limited accuracy, while existing machine learning methods treat the problem as a regression task and overlook the restrictions imposed by the constant covalent bond lengths and angles. In this work, we present DiffPack, a torsional diffusion model that learns the joint distribution of side-chain torsional angles, the only degrees of freedom in side-chain packing, by diffusing and denoising on the torsional space. To avoid issues arising from simultaneous perturbation of all four torsional angles, we propose autoregressively generating the four torsional angles from $\chi_1$ to $\chi_4$ and training diffusion models for each torsional angle. We evaluate the method on several benchmarks for protein side-chain packing and show that our method achieves improvements of $11.9\%$ and $13.5\%$ in angle accuracy on CASP13 and CASP14, respectively, with a significantly smaller model size ($60\times$ fewer parameters). Additionally, we show the effectiveness of our method in enhancing side-chain predictions in the AlphaFold2 model. Code is available at https://github.com/DeepGraphLearning/DiffPack.
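To make "diffusing on the torsional space" concrete, the following is a minimal sketch of a forward perturbation on the torus and the corresponding score. It assumes a wrapped-normal perturbation kernel, a common choice for angular data; the function names and the truncation parameter `n_wraps` are illustrative and are not taken from the DiffPack code base.

```python
import numpy as np

def wrap_angle(x):
    """Map angles in radians onto the interval [-pi, pi)."""
    return (x + np.pi) % (2 * np.pi) - np.pi

def perturb_torsions(chi, sigma, rng):
    """Add Gaussian noise of scale sigma, then wrap back onto the torus."""
    return wrap_angle(chi + sigma * rng.standard_normal(chi.shape))

def wrapped_normal_score(delta, sigma, n_wraps=10):
    """Derivative of log p(delta) for a zero-centred wrapped normal.

    p(delta) is proportional to sum_d exp(-(delta + 2*pi*d)^2 / (2*sigma^2));
    the infinite sum over d is truncated to d in [-n_wraps, n_wraps].
    """
    d = np.arange(-n_wraps, n_wraps + 1)
    shifted = np.asarray(delta, dtype=float)[..., None] + 2 * np.pi * d
    log_w = -shifted**2 / (2 * sigma**2)
    # Subtract the maximum before exponentiating for numerical stability.
    w = np.exp(log_w - log_w.max(axis=-1, keepdims=True))
    return (-(shifted / sigma**2) * w).sum(axis=-1) / w.sum(axis=-1)

# Usage: perturb a batch of angles and evaluate the score of the noise.
rng = np.random.default_rng(0)
chi0 = rng.uniform(-np.pi, np.pi, size=(8, 4))
chi_t = perturb_torsions(chi0, sigma=0.5, rng=rng)
score = wrapped_normal_score(wrap_angle(chi_t - chi0), sigma=0.5)
```

Because only the angles are noised, bond lengths and bond angles stay at their fixed values by construction, which is the restriction the abstract says regression-based methods overlook.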

Paper Structure (40 sections, 16 equations, 10 figures, 7 tables, 2 algorithms)

Figures (10)

  • Figure 1: Overview of DiffPack. Given a protein sequence and backbone structure, we aim to model the conditional distribution of side-chain conformations. (A) The distribution of side-chain conformations is modeled through a diffusion process in torsion space $\mathbb{T}^m$. An SE(3)-invariant network is used to learn the torus force field (torsion score). (B) The four torsion angles are generated autoregressively across all residues.
  • Figure 2: Illustration of four torsional angles.
  • Figure 3: Effects of rotating $\chi_1$.
  • Figure 4: Distribution of torsional angles with $\pi$-rotation symmetry (blue) and $2\pi$-rotation symmetry (red).
  • Figure 5: Training loss curves for different diffusion models.
  • ...and 5 more figures