MDIntrinsicDimension: Dimensionality-Based Analysis of Collective Motions in Macromolecules from Molecular Dynamics Trajectories

Irene Cazzaniga, Toni Giorgino

TL;DR

The paper addresses the challenge of quantifying the effective dimensionality of biomolecular conformational space from MD trajectories. It introduces MDIntrinsicDimension, a Python package that uses rotation- and translation-invariant internal-coordinate projections and scikit-dimension estimators (default TwoNN) to estimate intrinsic dimension. Three analysis modes (whole-molecule, sliding-window along sequence, and secondary-structure elements) produce global and time-resolved IDs, enabling detection of transitions and regional heterogeneity. Applied to DESRES folding trajectories of villin HP35 and NTL9, ID complements RMSD and traditional descriptors, revealing localized flexibility and transient intermediates, and offering a data-driven lens for building collective variables and Markov models.

Abstract

Molecular dynamics (MD) simulations provide atomistic insights into the structure, dynamics, and function of biomolecules by generating time-resolved, high-dimensional trajectories. Analyzing such data benefits from estimating the minimal number of variables required to describe the explored conformational manifold, known as the intrinsic dimension (ID). We present MDIntrinsicDimension, an open-source Python package that estimates ID directly from MD trajectories by combining rotation- and translation-invariant molecular projections (e.g., backbone dihedrals and inter-residue distances) with state-of-the-art estimators. The package provides three complementary analysis modes: whole-molecule ID; sliding windows along the sequence; and per-secondary-structure elements. It computes both overall ID (a single summary value) and instantaneous, time-resolved ID that can reveal transitions and heterogeneity over time. We illustrate the approach on fast folding-unfolding trajectories from the DESRES dataset, demonstrating that ID complements conventional geometric descriptors by highlighting spatially localized flexibility and differences across structural segments.
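To make the idea of an ID estimator concrete, here is a minimal sketch of a two-nearest-neighbor (TwoNN-style) estimate in plain NumPy. It is not the package's code, nor scikit-dimension's implementation: it uses the maximum-likelihood form of the estimate rather than the linear fit of the original method, and the function name `twonn_mle` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def twonn_mle(X):
    """Estimate intrinsic dimension from the ratio of 2nd to 1st neighbor distances.

    Hypothetical helper: maximum-likelihood variant of the TwoNN idea,
    d = N / sum(log(r2 / r1)), where r1 and r2 are each point's distances
    to its first and second nearest neighbors.
    """
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distances between all pairs of frames.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)      # clip tiny negative round-off
    np.fill_diagonal(d2, np.inf)     # exclude each point from its own neighbors
    # The two smallest squared distances in each row.
    nearest = np.partition(d2, 1, axis=1)[:, :2]
    r1 = np.sqrt(nearest[:, 0])
    r2 = np.sqrt(nearest[:, 1])
    keep = r1 > 0                    # drop exact duplicates (ratio undefined)
    mu = r2[keep] / r1[keep]
    return keep.sum() / np.sum(np.log(mu))

# Synthetic stand-in for a projected trajectory: 2 latent degrees of freedom
# embedded linearly in 5 observed coordinates (assumed data, not MD output).
rng = np.random.default_rng(0)
latent = rng.uniform(size=(2000, 2))
mixing = rng.normal(size=(2, 5))
frames = latent @ mixing

estimate = twonn_mle(frames)
print(f"estimated ID: {estimate:.2f}")
```

The estimate should come out close to 2, the number of latent variables, even though each frame has 5 coordinates. With real trajectories, the rows of `frames` would instead hold a rotation- and translation-invariant projection such as backbone dihedrals or inter-residue distances.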

Paper Structure

This paper contains 15 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: (A) Structure of the HP35 villin headpiece (PDB 2F4K) used in the case study and (B) topology diagram.
  • Figure 2: Folded-vs-unfolded ID of the villin dynamic manifold, computed with the estimators available in the scikit-dimension package. Each point represents the mean value over the folded states ($x$ axis) and unfolded states ($y$ axis); error bars indicate standard deviations over three replicas. Method abbreviations are listed in Supplementary Table \ref{tab:scikit_estimators}. Projection: Ramachandran angles $\phi$ and $\psi$, whole protein. KNN is not shown because of its high variance.
  • Figure 3: ID shifts between folded (F) and unfolded (U) states of villin under different projections. Dist.: pairwise distances between all carbon--carbon pairs; Dist. 3: pairwise distances between every third carbon; $\phi, \psi$: Ramachandran angles; $\chi$: sidechain dihedrals; Sin/Cos: trigonometric embedding of dihedrals.
  • Figure 4: (A) Instantaneous intrinsic dimension (ID) of villin trajectories over time. (B) Distribution of ID values across the trajectory. (C) Relationship between instantaneous ID and RMSD relative to the folded state. The separation between folded and unfolded ensembles is clearer when using ID than RMSD. States are color-coded as folded (violet) and unfolded (green). Analysis performed on the projection to $\phi$ and $\psi$ Ramachandran angles.
  • Figure 5: ID computed on local segments of villin. (A) Sequence-wise ID computed with section_id() using a window of 15 and stride of 3. (B) Secondary structure element-wise ID from secondary_structure_id(); ranges indicate the first and last residue number of the secondary structure element (simplified DSSP: C, coil; H, helix). In both cases $\phi$ and $\psi$ dihedral angles were used as a projection.
  • ...and 1 more figures
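
Two ingredients named in the captions above, the sin/cos embedding of dihedrals (Figure 3) and the sequence windows with a window of 15 and stride of 3 (Figure 5), can be sketched in a few lines of NumPy. This is an illustrative sketch only: `sincos_embed` and `sliding_windows` are hypothetical helpers, not functions of MDIntrinsicDimension, and the handling of the sequence tail (incomplete windows are simply dropped) is an assumption rather than the documented behavior of section_id().

```python
import numpy as np

def sincos_embed(angles):
    """Map dihedral angles (radians) to (sin, cos) pairs to remove the 2*pi jump.

    Hypothetical helper: input shape (n_frames, n_angles),
    output shape (n_frames, 2 * n_angles).
    """
    angles = np.asarray(angles, dtype=float)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

def sliding_windows(n_residues, window=15, stride=3):
    """Return half-open (start, end) residue index ranges along the sequence.

    Hypothetical helper; windows that would run past the last residue
    are dropped (an assumption made for this sketch).
    """
    return [(start, start + window)
            for start in range(0, n_residues - window + 1, stride)]

# Angles of +pi and -pi describe the same conformation; the embedding agrees.
angles = np.array([[np.pi, 0.0],
                   [-np.pi, 0.0]])
embedded = sincos_embed(angles)

# Windows over a 35-residue chain such as HP35, using the caption's parameters.
windows = sliding_windows(35, window=15, stride=3)
for start, end in windows:
    print(f"residues {start}..{end - 1}")
```

Each window would then be projected and passed to an ID estimator on its own, which is what yields a per-segment profile rather than a single whole-molecule value.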