Hierarchical geometric deep learning enables scalable analysis of molecular dynamics

Zihan Pengmei, Spencer C. Guo, Chatipat Lorpaiboon, Aaron R. Dinner

TL;DR

This paper tackles the scalability problem of applying geometric GNNs to long-timescale biomolecular dynamics by introducing a token-merging module (TMM) that compresses local structure into fragment-level tokens. Paired with FlashAttention, the approach enables transformer-based nonlocal interaction learning on systems with thousands of residues on a single GPU. Through VAMP and SPIB analyses on ADK and nsp13 datasets, the method demonstrates improved memory efficiency, faster training, and interpretable attention that aligns with known biomolecular motions. The work broadens the applicability of deep learning to complex biomolecular dynamics and provides a scalable, interpretable pipeline for kinetic analysis and MSM construction.

Abstract

Molecular dynamics simulations can generate atomically detailed trajectories of complex systems, but analyzing these dynamics can be challenging when systems lack well-established quantitative descriptors (features). Graph neural networks (GNNs) in which messages are passed between nodes that represent atoms that are spatial neighbors promise to obviate manual feature engineering, but the use of GNNs with biomolecular systems of more than a few hundred residues has been limited in the context of analyzing dynamics by both difficulties in capturing the details of long-range interactions with message passing and the memory and runtime requirements associated with large graphs. Here, we show how local information can be aggregated to reduce memory and runtime requirements without sacrificing atomic detail. We demonstrate that this approach opens the door to analyzing simulations of protein-nucleic acid complexes with thousands of residues on single GPUs within minutes. For systems with hundreds of residues, for which there are sufficient data to make quantitative comparisons, we show that the approach improves performance and interpretability.

Paper Structure

This paper contains 25 sections, 10 equations, 12 figures, 4 tables, 1 algorithm.

Figures (12)

  • Figure 1: Systems studied in this work. A. The C2A domain of synaptotagmin (PDB: 2R83 [fuson_structure_2007]). B. Apo-adenylate kinase in an open conformation (PDB: 4AKE [muller_adenylate_1996]). The LID, NMP, and CORE domains are shown in blue, yellow, and gray, respectively. C. SARS-CoV-2 helicase nsp13 bound to ADP in an open conformation [chen_ensemble_2022]. The N-terminal zinc-binding domain (ZBD), stalk (S), 1B, RecA1, and RecA2 domains are shown in green, salmon, magenta, tan, and red, respectively. D. SARS-CoV-2 helicase replication-transcription complex in the 1B-open state (PDB: 7RDX [chen_ensemble_2022]). The nsp13 helicase domains are colored as in C. Structures are shown to scale.
  • Figure 2: A schematic illustration of the geom2vec workflow with the proposed token-merging module (shaded). The token-merging module comprises a fragment-graph convolution layer acting on a coarsened graph defined by residue-wise coordinates and a token-merging MLP operating on fixed windows along the sequence. A sinusoidal positional encoding is added to the reduced token set before it enters the transformer token mixer.
  • Figure 3: Runtime (top) and peak GPU memory (bottom) for TMM as functions of window size $w$ for synaptotagmin C2A (128 residues), ADK (214 residues), nsp13-ADP (592 residues), and nsp13-RNA (2645 residues). Numbers reported are for a minibatch size of 1000 samples; for nsp13-ADP and nsp13-RNA, we use a minibatch size of 50 in practice and then scale the measured values to those for a minibatch size of 1000.
  • Figure 4: Maximum training (top) and validation (bottom) VAMP-2 scores for ADK. Each group of bars corresponds to a different graph operator for the TMM graph-convolution layer (see Appendix \ref{app:local-mp}). A window size of 1 is equivalent to no token merging. The training and validation data are held fixed across repeated runs and models. "RGGC" is a GNN baseline consisting of 4 RGGC layers with mean pooling (no token mixer). Error bars show standard deviations over three runs.
  • Figure 5: Maximum training (top) and validation (bottom) VAMP-2 scores for nsp13-ADP. One validation step is taken every 40 training steps. Each group of bars corresponds to a different graph operator for the TMM graph-convolution layer (see Appendix \ref{app:local-mp}). A window size of 1 is equivalent to no token merging. Error bars show standard deviations over three runs.
  • ...and 7 more figures
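
The Figure 2 caption describes merging residue-level tokens over fixed windows along the sequence and then adding a sinusoidal positional encoding. The sketch below illustrates only that windowing step, and it is not the authors' implementation. It makes several assumptions that are not stated in the source: residue features are plain vectors rather than geom2vec's equivariant features, the fragment-graph convolution is omitted, the last window is zero-padded, the MLP has one hidden ReLU layer, and all names (`merge_tokens`, `sinusoidal_encoding`, the weight arrays) are hypothetical.

```python
import numpy as np


def sinusoidal_encoding(n_tokens, dim):
    """Standard sine/cosine positional encoding (dim must be even)."""
    assert dim % 2 == 0, "this sketch assumes an even feature dimension"
    positions = np.arange(n_tokens)[:, None]
    freqs = np.exp(-np.log(10000.0) * np.arange(0, dim, 2) / dim)
    enc = np.zeros((n_tokens, dim))
    enc[:, 0::2] = np.sin(positions * freqs)
    enc[:, 1::2] = np.cos(positions * freqs)
    return enc


def merge_tokens(x, window, w1, b1, w2, b2):
    """Merge consecutive residue tokens in fixed windows with a small MLP.

    x has shape (n_residues, d). Zero-padding the final window is an
    assumption of this sketch, not something taken from the paper.
    """
    n_res, d = x.shape
    pad = (-n_res) % window                     # rows needed to fill the last window
    x = np.pad(x, ((0, pad), (0, 0)))
    merged = x.reshape(-1, window * d)          # one row per window of residues
    hidden = np.maximum(merged @ w1 + b1, 0.0)  # ReLU hidden layer (assumed)
    tokens = hidden @ w2 + b2
    return tokens + sinusoidal_encoding(tokens.shape[0], tokens.shape[1])


# Toy example: 214 residues (the size of ADK) with random stand-in features.
rng = np.random.default_rng(0)
n_res, d, window, hidden_dim = 214, 16, 4, 32
x = rng.normal(size=(n_res, d))
w1 = rng.normal(size=(window * d, hidden_dim)) * 0.1
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, d)) * 0.1
b2 = np.zeros(d)

tokens = merge_tokens(x, window, w1, b1, w2, b2)
print(tokens.shape)  # (54, 16)
```

With a window size of 4, the 214 residue tokens reduce to 54 merged tokens, so the pairwise attention matrix in a downstream transformer shrinks by roughly a factor of 16. In this sketch a window size of 1 leaves the token count unchanged, matching the captions' remark that a window size of 1 is equivalent to no token merging.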