Towards Ultimate NMR Resolution with Deep Learning

Amir Jahangiri, Tatiana Agback, Ulrika Brath, Vladislav Orekhov

TL;DR

The paper addresses spectral resolution limits in multidimensional NMR by reframing the problem as the probability of a peak center, formalized as $P^3$. It introduces MR-Ai, a physics-inspired cross-objective neural network that maps spectra to $P^3$ and enables hyper-dimensional co-processing to boost resolution and robustness under nonuniform sampling. It defines a reference-free spectrum quality score (QSP) derived from the integral of $P^3$, and demonstrates hyper-dimensional Targeted Acquisition guided by this metric. The results show improved peak detectability, reduced spectral artifacts, and practical benefits for high-resolution NMR analysis of proteins, including 3D NUS regimes.

Abstract

In multidimensional NMR spectroscopy, practical resolution is defined as the ability to distinguish and accurately determine signal positions against a background of overlapping peaks, thermal noise, and spectral artifacts. In the pursuit of ultimate resolution, we introduce Peak Probability Presentations ($P^3$), a statistical spectral representation that assigns a probability to each spectral point, indicating the likelihood of a peak maximum occurring at that location. The mapping between the spectrum and $P^3$ is achieved using MR-Ai, a physics-inspired deep learning neural network architecture designed to handle multidimensional NMR spectra. Furthermore, we demonstrate that MR-Ai enables co-processing of multiple spectra, facilitating direct information exchange between datasets. This feature significantly enhances spectral quality, particularly in cases of highly sparse sampling. The performance of MR-Ai and the high value of $P^3$ are demonstrated on synthetic data and on spectra of Tau, MALT1, Calmodulin, and several other proteins.

Paper Structure

This paper contains 18 sections, 4 equations, 24 figures, 2 tables.

Figures (24)

  • Figure 1: The $P^3$ of experimental 2D NMR data. The 2D $^{1}$H-$^{15}$N TROSY spectrum of MALT1 protein is shown with black (positive) and orange (negative) contours, while the corresponding $P^3$ spectrum is depicted in green. Inset panels $a_1$ and $a_2$ highlight zoomed-in regions (marked with red rectangles) of the spectrum; $b_1$ and $b_2$ confirm the resolved peaks in $a_1$ and $a_2$ by showing the corresponding $^{1}$H-$^{13}$C slice from the 3D HNCO spectrum.
  • Figure 2: $P^3$ and assignment procedure of 3D NUS spectra reconstructed with CS-IST for MALT1 protein. (a) Superposition of the $^{1}$H-$^{15}$N 2D planes, extracted at the $^{13}$C$^\alpha$ frequency of $671R$: 55.01 ppm from 3D HNCA (green and blue for TIP and $P^3$) and HN(CO)CA (orange and red for TIP and $P^3$) spectra, and cross-peaks of the $^{1}$H-$^{15}$N 2D TROSY spectrum (grey). The sequential $(i)-(i-1)$ peak labeled $672L$, enlarged in inset box 1, has the $^{1}$H and $^{15}$N chemical shifts of residue $672L$ and the $^{13}$C chemical shift of residue $671R$. The target $(i-1)$ cross-peak, with the $^{1}$H, $^{15}$N and $^{13}$C chemical shifts of residue $671R$, is enlarged in box 5. Three additional sequential $(i)-(i-1)$ peaks and two $(i-1)$ peaks appear in the plane because they exhibit $^{13}$C chemical shifts similar to that of $671R$. These peaks correspond to residues $679K-678L$ (peak 2), $457Q-456L$ (peak 3), $534E-533A$ (peak 4), $533A$ (peak 6), and one unidentified peak (7). Compared with $P^3$, due to limited $^{13}$C resolution, the 3D HNCA (green) and HN(CO)CA (orange) planes also include eleven additional cross-peaks, marked with red stars, which make the assignment procedure more complicated. (b) Superposition of four strips of $^{1}$H-$^{13}$C 2D planes, extracted at $^{15}$N: 125.96, 122.04, 124.83 and 131.01 ppm, respectively, from 3D HNCA (green and blue for TIP and $P^3$) and HN(CO)CA (orange and red for TIP and $P^3$) spectra. The dashed line illustrates the flow of assignments through $(n)$ to $(n-1)$ cross-peaks, starting with residue $462D$ and continuing through $461L$, $460L$ and $459F$. (c) Superposition of the $^{1}$H-$^{13}$C 2D planes, extracted at $^{15}$N: 119.14 ppm from 3D HN(CA)CO (black and pink for TIP and $P^3$) spectra. The inset panel displays 1D projections corresponding to the dashed column at $^{1}$H: 8.038 ppm, representing the chemical shifts of residues $545K$ and $544G$, where the vertical axes (on the right and left) are scaled according to Intensity (Int) and Probability (P), respectively.
  • Figure 3: Statistics on Peak Detection of $P^3$ in Synthetic 2D and 3D Spectra. Green and red bars with error bars represent the mean and standard deviation of recall and precision quality metrics across 10 synthetic NMR spectra, each containing 256 peaks, for (a) 2D US, (b) 3D US (dark-colored bars), and 3D 15% NUS reconstructed using CS-IST (light-colored bars). Recall is defined as the ratio of the correctly $detected$ pixels to all $detectable$ pixels, while precision is the ratio of correctly $detected$ pixels to all $detected$ pixels. A pixel is considered as $detected$ when its $P^3$ value is above the probability threshold indicated on the horizontal axis of the chart. A pixel is correctly $detected$ if it is found in the vicinity of a $detectable$ pixel. The $detectable$ pixels are defined as those near the maxima of the ground truth peaks with intensities higher than $2\sigma$-noise for the US spectra. To account for the shorter experiment time in the 15% NUS spectra, a threshold of $5\sigma$-noise from the corresponding US spectra was used.
  • Figure 4: Effect of peak intensity and overlap on peak detection in the $P^3$ for synthetic 2D and 3D spectra. The probability of detecting peak maxima across 2,560 ground truth peaks in 10 synthetic spectra is plotted versus the signal-to-noise ratio (SNR) and overlap conditions (color scale) for (a) 2D US, (b) 3D US, and (c) 3D 15% NUS spectra reconstructed using CS-IST. For each ground truth peak with intensity $I_0$, SNR is calculated as $I_0$ divided by $\sigma$-noise in the US spectra (or the corresponding US spectra for NUS data). The overlap score is defined as $\sum_i \frac{I_i}{I_0 d_i^2}$, where $I_i$ is the intensity of a neighboring peak $i$ residing at a distance $d_i$ of less than 16 pixels. The gray areas indicate the range of signal amplitudes where the ground truth peaks are theoretically undetectable, with intensities below $2\sigma$ in the US spectra; to account for the shortened measurement time in the 15% NUS spectra, the theoretical detection threshold was set at $5\sigma$-noise in the corresponding US spectra.
  • Figure 5: Targeted Acquisition with the Reference-Free Quantitative Spectrum Quality Score with $P^3$ ($QSP$). The number of detectable peaks estimated with the $QSP$ in different 3D spectra reconstructed using CS-IST is plotted versus the NUS fraction build-up during the course of the TA acquisition. (a) Synthetic HNCO-type spectra reconstructed using CS-IST with 128 peaks (magenta) and 256 peaks (black). $QSP$ values were calculated by MR-Ai for standalone spectra (dashed lines) and with hyper-dimensional co-processing using a supporting 2D spectrum (solid lines). (b) Experimental 3D spectra of Calmodulin protein (16.7 kDa). The $QSP$ was calculated using the 2D $^{1}$H-$^{15}$N projection of the most sensitive 3D 50% NUS HNCO spectrum as support.
  • ...and 19 more figures
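
The metrics described in the captions of Figures 3 and 4 can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names (`overlap_score`, `dilate`, `precision_recall`), the square-neighbourhood reading of "vicinity", and the `radius` parameter are assumptions made for illustration only; the captions give neither the vicinity size nor the exact matching rule. In this reading, precision counts detected pixels that lie near a detectable pixel, and recall counts detectable pixels that have a detected pixel nearby.

```python
import itertools
import numpy as np


def overlap_score(positions, intensities, index, max_dist=16.0):
    """Overlap score of peak `index`: sum over neighbours i of I_i / (I_0 * d_i**2).

    Only neighbours closer than `max_dist` pixels contribute (Figure 4 caption).
    """
    pos = np.asarray(positions, dtype=float)
    inten = np.asarray(intensities, dtype=float)
    d = np.linalg.norm(pos - pos[index], axis=1)
    near = (d > 0) & (d < max_dist)  # excludes the peak itself
    return float(np.sum(inten[near] / (inten[index] * d[near] ** 2)))


def dilate(mask, radius):
    """Grow a boolean mask by `radius` pixels along every axis (assumed 'vicinity')."""
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros(mask.shape, dtype=bool)
    for offsets in itertools.product(range(2 * radius + 1), repeat=mask.ndim):
        window = tuple(slice(o, o + n) for o, n in zip(offsets, mask.shape))
        out |= padded[window]
    return out


def precision_recall(p3, detectable, threshold, radius=1):
    """Pixel-wise precision and recall of a P^3 map at one probability threshold.

    `detectable` is a boolean mask of ground-truth pixels; `radius` is hypothetical.
    """
    detected = np.asarray(p3) > threshold
    detectable = np.asarray(detectable, dtype=bool)
    n_detected = detected.sum()
    n_detectable = detectable.sum()
    correct = (detected & dilate(detectable, radius)).sum()
    found = (detectable & dilate(detected, radius)).sum()
    precision = correct / n_detected if n_detected else float("nan")
    recall = found / n_detectable if n_detectable else float("nan")
    return float(precision), float(recall)


# Toy usage: two detected pixels, two detectable pixels, one matching pair.
p3 = np.zeros((5, 5))
p3[2, 2], p3[0, 4] = 0.9, 0.8
truth = np.zeros((5, 5), dtype=bool)
truth[2, 3] = truth[4, 0] = True
prec, rec = precision_recall(p3, truth, threshold=0.5, radius=1)
score = overlap_score([[0, 0], [3, 4], [100, 100]], [2.0, 1.0, 5.0], index=0)
```

Sweeping `threshold` over a range of probabilities would give curves of the kind summarised in Figure 3, although the numbers depend on the assumed vicinity radius.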