ProtSCAPE: Mapping the landscape of protein conformations in molecular dynamics

Siddharth Viswanath, Dhananjay Bhaskar, David R. Johnson, Joao Felipe Rocha, Egbert Castro, Jackson D. Grady, Alex T. Grigas, Michael A. Perlmutter, Corey S. O'Hern, Smita Krishnaswamy

TL;DR

A novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE), which leverages the geometric scattering transform alongside transformer-based attention mechanisms to capture protein dynamics from molecular dynamics simulations.

Abstract

Understanding the dynamic nature of protein structures is essential for comprehending their biological functions. While significant progress has been made in predicting static folded structures, modeling protein motions on microsecond to millisecond scales remains challenging. To address these challenges, we introduce a novel deep learning architecture, Protein Transformer with Scattering, Attention, and Positional Embedding (ProtSCAPE), which leverages the geometric scattering transform alongside transformer-based attention mechanisms to capture protein dynamics from molecular dynamics (MD) simulations. ProtSCAPE utilizes the multi-scale nature of the geometric scattering transform to extract features from protein structures conceptualized as graphs and integrates these features with dual attention structures that focus on residues and amino acid signals, generating latent representations of protein trajectories. Furthermore, ProtSCAPE incorporates a regression head to enforce temporally coherent latent representations.
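To make the multi-scale feature extraction step more concrete, the following is a minimal sketch of a geometric scattering transform on a small graph. It is not the authors' implementation: the lazy random walk operator, the dyadic wavelet scales, the toy path graph, and the random node signals are all assumptions borrowed from the standard geometric scattering construction, and the function names are hypothetical. The abstract does not specify these details for ProtSCAPE.

```python
import numpy as np

def lazy_random_walk(adj):
    """Lazy random walk matrix P = (I + A D^{-1}) / 2 (assumes no isolated nodes)."""
    deg = adj.sum(axis=0)                      # column sums = node degrees
    return 0.5 * (np.eye(adj.shape[0]) + adj / deg)

def diffusion_wavelets(P, J):
    """Return [Psi_0, ..., Psi_J] with Psi_0 = I - P and Psi_j = P^(2^(j-1)) - P^(2^j)."""
    wavelets = [np.eye(P.shape[0]) - P]
    P_pow = P                                  # holds P^(2^(j-1)) at step j
    for _ in range(J):
        P_next = P_pow @ P_pow                 # squaring doubles the diffusion time
        wavelets.append(P_pow - P_next)
        P_pow = P_next
    return wavelets

def scattering_features(adj, x, J=3):
    """Per-node zeroth-, first-, and second-order scattering coefficients.

    adj: (n, n) adjacency matrix; x: (n, d) node signals.
    Returns an (n, d * (1 + (J + 1) + J * (J + 1) / 2)) feature matrix.
    """
    wavelets = diffusion_wavelets(lazy_random_walk(adj), J)
    first = [np.abs(W @ x) for W in wavelets]  # |Psi_j x|
    second = [np.abs(wavelets[j2] @ first[j1])  # |Psi_j2 |Psi_j1 x||, j1 < j2
              for j1 in range(J + 1) for j2 in range(j1 + 1, J + 1)]
    return np.concatenate([x] + first + second, axis=1)

# Toy example (assumption): a 6-node path graph standing in for a residue chain,
# with 2 random signals per node standing in for amino acid features.
adj = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
x = np.random.default_rng(0).normal(size=(6, 2))
feats = scattering_features(adj, x, J=3)
print(feats.shape)
```

In this sketch each wavelet compares diffusion at two neighboring time scales, so small scales respond to local structure and large scales to more global structure. A model along the lines described in the abstract would pass such per-node features to its attention layers.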

Paper Structure

This paper contains 27 sections, 7 theorems, 45 equations, 7 figures, 4 tables.

Key Result

Theorem B.1

The wavelet transform is stable to small graph perturbations. More specifically, let $G'=(V',E')$ be a perturbed version of $G$ with $|V|=|V'|=n$. Then where $C$ is a constant depending only on the spectral gaps of the graphs $G$ and $G'$ and in particular does not depend on $J$ or on the choice of scales $t_j$.

Figures (7)

  • Figure 1: (A) The ProtSCAPE architecture. (B) PHATE plot of latent representations. (C) Conformations of the protein Kinase C. (D) High attention scores correspond to more flexible residues.
  • Figure 2: Interpolation between open and closed conformations of the MurD protein.
  • Figure 3: Latent space representation (1), conformation samples (2), and attention read-outs (3) for simulations of two proteins: Hirustasin (A) and 50S ribosomal protein L30 (B). Attention is higher on the flexible parts of each protein, i.e., those that change across time points.
  • Figure 4: Plots of ground truth vs. reconstructed residue coordinates at withheld time points. The overlay shows good agreement between the ground truth structure (colored in magenta) and the ProtSCAPE reconstructed structure (colored in green) obtained by decoding the latent representations of the protein structure at withheld timepoints.
  • Figure 5: (A) The GB3 protein changing conformations at different time points in the molecular dynamics simulation. (B) Latent representations captured by ProtSCAPE, visualized using PHATE.
  • ...and 2 more figures

Theorems & Definitions (13)

  • Theorem B.1
  • Theorem B.2
  • Theorem J.2.1
  • proof
  • Lemma J.2.2
  • Theorem J.3.1: Special Case of Theorem 4.3 of perlmutter2023understanding
  • Remark J.3.2
  • Theorem J.3.3
  • proof: Proof of the wavelet stability theorem
  • Lemma J.4.1
  • ...and 3 more