The Effect of Hydration and Dynamics on the Mass Density of Single Proteins

Cameron C. W. McAllister, Lucas S. P. Rudden, Elizabeth H. C. Bromley, Matteo T. Degiacomi

TL;DR

The paper demonstrates that single-protein mass density in solution is lower than the commonly cited $1.35\ \mathrm{g\,cm^{-3}}$ and is not correlated with molecular weight, but shows systematic correlations with residue composition and overall charge. A voxel-based MD approach explicitly incorporating hydration yields a mean density around $1.294\text{--}1.296\ \mathrm{g\,cm^{-3}}$, while buried interior waters have minimal effect and surface hydration increases the surrounding water density. The authors show that density can be accurately predicted from sequence-derived features via Random Forest regressors, enabling fast sequence-only density estimates, and reveal dynamic density variations within individual proteins (e.g., BPTI and titin) under equilibrium and mechanical perturbations. Additionally, hydration shells raise local water density (first shell ~$1.1$–$1.5\ \mathrm{g\,cm^{-3}}$) and are structurally organized, suggesting that surrounding solvent structure must be considered in experiments and models using protein densities.

Abstract

The density of a protein molecule is a key property within a variety of experimental techniques. We present a computational method for determining protein mass density that explicitly incorporates hydration effects. Our approach uses molecular dynamics simulations to quantify the volume of solvent excluded by a protein. Applied to a dataset of 260 soluble proteins, this yields an average density of $1.296\ \mathrm{g\,cm^{-3}}$, notably lower than the widely cited value of $1.35\ \mathrm{g\,cm^{-3}}$. Contrary to previous suggestions, we find no correlation between protein density and molecular weight. We instead find correlations with residue composition, particularly with hydrophobic amino acid content. Using these correlations, we train a regressor capable of accurately predicting protein density from sequence-derived features alone. Examining the effect of incorporating water molecules on the measured density, we find that water molecules buried in internal cavities have a negligible effect, whereas those at the surface have a profound impact. Furthermore, by calculating the density of a titin domain and of the Bovine Pancreatic Trypsin Inhibitor (BPTI) over molecular dynamics trajectories, we show that individual proteins can occupy states with close but distinguishable densities. Finally, we analyse the density of water in the vicinity of proteins, showing that the first two hydration shells exhibit higher density than bulk water. When included in cumulative density calculations, these hydration layers contribute to a net increase in local solvent density. Overall, we find that proteins are less dense than previously reported, which is offset by their ability to induce a higher density of water in their vicinity.
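As a rough illustration of how a mass density follows from a molecular mass and a solvent-excluded volume, the sketch below converts between daltons, cubic ångströms and $\mathrm{g\,cm^{-3}}$. The function names and the 20 kDa example mass are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
AVOGADRO = 6.02214076e23   # mol^-1 (exact in the SI since 2019)
A3_PER_CM3 = 1e24          # cubic angstroms per cubic centimetre


def mass_density_g_per_cm3(mass_da, volume_a3):
    """Mass density in g/cm^3 from a mass in Da and a volume in A^3."""
    if volume_a3 <= 0:
        raise ValueError("volume must be positive")
    grams = mass_da / AVOGADRO          # 1 Da corresponds to 1 g/mol
    cm3 = volume_a3 / A3_PER_CM3
    return grams / cm3


def volume_a3_from_density(mass_da, density_g_per_cm3):
    """Volume in A^3 implied by a mass in Da and a density in g/cm^3."""
    if density_g_per_cm3 <= 0:
        raise ValueError("density must be positive")
    return mass_da * A3_PER_CM3 / (AVOGADRO * density_g_per_cm3)


if __name__ == "__main__":
    mass = 20_000.0  # hypothetical 20 kDa protein, for illustration only
    for rho in (1.296, 1.35):
        print(rho, volume_a3_from_density(mass, rho))
```

Because volume scales inversely with density at fixed mass, adopting $1.296$ instead of $1.35\ \mathrm{g\,cm^{-3}}$ implies a volume larger by a factor of $1.35/1.296 \approx 1.04$ for the same protein.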

Paper Structure

This paper contains 24 sections, 5 figures.

Figures (5)

  • Figure 1: Schematic representation of our protein volume calculation method. Red and blue circles represent the van der Waals radii of protein and water atoms, respectively. The space surrounding the protein is divided into a fine grid, coloured according to which atom is closest. (Top) The red region represents the volume occupied by the protein. (Bottom) The red region represents the volume occupied by the protein and its solvation shell. This is calculated by treating as part of the protein any water atom within a given cutoff distance of the centre of any protein atom (transparent circles surrounding red circles). This region may be incrementally expanded to characterise changes in water density at different distances from the protein.
  • Figure 2: Comparison between calculated densities and densities predicted by a Random Forest Regressor (struct-RFR). The RFR was trained on a set of 20 sequence- and structure-based features gathered from our 260-protein dataset, and the equilibrated crystal structure of BPTI (PDB: 5PTI) and titin (PDB: 1TIT) in its folded and extended state. The identity line is shown in black for comparison. We find a Pearson correlation coefficient (PCC) of 0.976 for the 260-protein dataset. The RFR accurately predicts the densities of conformations of BPTI (of which one structure is present in the training set) and folded titin (not part of the training set). The densities of mechanically unfolded titin conformations are poorly predicted. Above, a kernel density estimation plot shows the distribution of mean calculated densities in the 260-protein dataset, with the overall mean density of $1.296 \pm 0.001\ \mathrm{g\,cm^{-3}}$ marked by a dashed vertical line.
  • Figure 3: BPTI features states with distinct densities. A 1 ms molecular dynamics simulation of BPTI can be subdivided into five metastable states via Markov State Modelling (each represented by 50 overlaid conformers at the top). The secondary structure is coloured as follows: $\alpha$-helix in red, $3_{10}$-helix in blue, $\beta$-sheet in turquoise, turns in light blue and coil in grey. These conformers differ in their degree of compaction, as captured by the prevalence of aliphatic hydrophobic residues in their interior and by their density. The structures of states 4 and 5 most closely resemble the crystal structure of BPTI (PDB: 5PTI), differing only in a more pronounced turn in the coil region at residues 12 and 13 in state 5. States 1 and 2 are distinguished from the others by the loss of structure of the small N-terminal $3_{10}$-helix, although this feature remains intact in some conformers.
  • Figure 4: Unfolding of titin under mechanical stress reveals conformer sub-populations with distinct densities. For each titin conformation in a steered molecular dynamics simulation (see example conformers at the top), we calculate density and percentage of preserved secondary structure. The distribution of secondary structure content (kernel density estimation in the upper graph) reveals two distinct sub-populations, identified with red and blue colours in the scatter plot. In the right graph, kernel density estimations reveal that these sub-populations feature distinct density distributions (red and blue lines). The density distribution of the whole simulation is shown in black.
  • Figure 5: Relationship between protein and water density, averaged over the whole protein dataset. The two top graphs, in palatinate colour, report the water radial distribution function (RDF) and the water mass density. The bottom two graphs, in blue, report the cumulative effect on the measured mass density of water, or of the combined protein-water system, when accounting for an increasingly large water shell around the protein. The effective protein-water mass density decreases as more water is included, with a non-monotonic trend caused by water in the first two hydration shells being denser than bulk water.
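
The grid-based volume calculation described in the Figure 1 caption can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function name, parameters and toy coordinates are hypothetical, and "closest atom" is assumed here to mean the nearest atom centre, which the caption does not specify. The optional `shell_cutoff` mimics the bottom panel of Figure 1 by counting waters within that distance of any protein atom centre as part of the solute.

```python
import itertools
import math


def solute_voxel_volume(protein_xyz, water_xyz, spacing=0.5, padding=3.0,
                        shell_cutoff=None):
    """Volume (A^3) of grid voxels whose nearest atom centre is a solute atom.

    Assumption: nearest-atom assignment uses distances to atom centres.
    If shell_cutoff is given, waters within that distance of any protein
    atom centre are treated as part of the solute (solvation shell).
    """
    solute = list(protein_xyz)
    solvent = []
    for w in water_xyz:
        in_shell = shell_cutoff is not None and any(
            math.dist(w, p) <= shell_cutoff for p in protein_xyz)
        (solute if in_shell else solvent).append(w)

    # Bounding box around the protein, padded so that it reaches into the solvent.
    lo = [min(p[k] for p in protein_xyz) - padding for k in range(3)]
    hi = [max(p[k] for p in protein_xyz) + padding for k in range(3)]
    counts = [max(1, round((hi[k] - lo[k]) / spacing)) for k in range(3)]

    n_solute = 0
    for idx in itertools.product(*(range(n) for n in counts)):
        # Voxel centre for this grid index.
        point = [lo[k] + (idx[k] + 0.5) * spacing for k in range(3)]
        d_solute = min(math.dist(point, a) for a in solute)
        d_solvent = min((math.dist(point, a) for a in solvent),
                        default=math.inf)
        if d_solute < d_solvent:
            n_solute += 1
    return n_solute * spacing ** 3


# Toy system: one "protein" atom surrounded by six "waters" 2 A away.
toy_protein = [(0.0, 0.0, 0.0)]
toy_water = [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (0.0, 2.0, 0.0),
             (0.0, -2.0, 0.0), (0.0, 0.0, 2.0), (0.0, 0.0, -2.0)]
```

In this toy system the protein atom owns the cube bounded by the six bisecting planes at 1 Å, so `solute_voxel_volume(toy_protein, toy_water)` gives a volume of 8 Å^3. Increasing `shell_cutoff` step by step and recomputing the enclosed mass and volume is one way to obtain cumulative profiles of the kind described for Figure 5. A production version would need a spatial index (for example a k-d tree) rather than this brute-force nearest-atom search.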