DiffNMR2: NMR Guided Sampling Acquisition Through Diffusion Model Uncertainty

Etienne Goffinet, Sen Yan, Fabrizio Gabellieri, Laurence Jennings, Lydia Gkoura, Filippo Castiglione, Ryan Young, Idir Malki, Ankita Singh, Thomas Launey

TL;DR

DiffNMR2 presents a diffusion-model guided sampling framework to accelerate 2D NMR acquisition by leveraging model uncertainty to adaptively select evolution times. Trained on real protein spectra, the method uses a Repaint-style diffusion inpainting pipeline to reconstruct masked frequency-time maps, enabling row-wise guided sampling. Across a real 100-protein dataset, Guided Sampling markedly outperforms baselines, achieving a 52.9% reduction in MSE at 10% acquisition and a 55.6% reduction in hallucinated peaks, with substantial wall-time savings for longer experiments. The approach integrates non-uniform sampling with diffusion-based reconstruction to deliver faster, high-fidelity spectral analyses and suggests broad applicability to higher-dimensional NMR and spectroscopy workflows.

Abstract

Nuclear Magnetic Resonance (NMR) spectrometry uses radio-frequency pulses to probe the resonance of a compound's nuclei, which is then analyzed to determine its structure. The acquisition time of high-resolution NMR spectra remains a significant bottleneck, especially for complex biological samples such as proteins. In this study, we propose a novel and efficient sub-sampling strategy based on a diffusion model trained on protein NMR data. Our method iteratively reconstructs under-sampled spectra while using model uncertainty to guide subsequent sampling, significantly reducing acquisition time. Compared to state-of-the-art strategies, our approach improves reconstruction accuracy by 52.9%, reduces hallucinated peaks by 55.6%, and requires 60% less time in complex NMR experiments. This advancement holds promise for many applications, from drug discovery to materials science, where rapid and high-resolution spectral analysis is critical.
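
The iterative loop described in the abstract (reconstruct, estimate uncertainty, acquire the most uncertain rows next) can be sketched as follows. This is a minimal illustration and not the authors' implementation: `toy_reconstruct` is a hypothetical stand-in for the diffusion model, the use of per-row sample variance as the uncertainty score is an assumption, and `acquire_row`, `budget`, and `step` are illustrative names.

```python
import numpy as np

def toy_reconstruct(observed, mask, rng, n_samples=8):
    """Hypothetical stand-in for the diffusion model (assumption).

    Returns n_samples stochastic reconstructions of the full map. Missing rows
    copy the nearest acquired row plus noise that grows with distance, so rows
    far from any measurement vary more across samples.
    """
    known = np.flatnonzero(mask)
    samples = np.repeat(observed[None, :, :], n_samples, axis=0)
    for r in np.flatnonzero(~mask):
        nearest = known[np.argmin(np.abs(known - r))]
        scale = 0.1 * abs(int(nearest) - int(r))
        noise = rng.normal(0.0, scale, size=(n_samples, observed.shape[1]))
        samples[:, r, :] = observed[nearest] + noise
    return samples

def guided_sampling(acquire_row, n_rows, n_cols, budget, step, seed=0):
    """Acquire `budget` rows, `step` rows per iteration, guided by uncertainty."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n_rows, dtype=bool)      # True where a row has been acquired
    observed = np.zeros((n_rows, n_cols))
    for r in (0, n_rows - 1):                # seed rows (an arbitrary choice)
        observed[r] = acquire_row(r)
        mask[r] = True
    while mask.sum() < budget:
        samples = toy_reconstruct(observed, mask, rng)
        # Assumed uncertainty score: variance across samples, averaged per row.
        row_uncertainty = samples.var(axis=0).mean(axis=1)
        row_uncertainty[mask] = -np.inf      # never re-acquire a known row
        k = min(step, budget - int(mask.sum()))
        for r in np.argsort(row_uncertainty)[-k:]:
            observed[r] = acquire_row(r)
            mask[r] = True
    return mask, observed
```

For example, with a synthetic 64-row map `truth`, calling `guided_sampling(lambda r: truth[r], 64, 16, budget=16, step=2)` acquires 25% of the rows, two at a time. In a real experiment `acquire_row` would trigger a measurement at the corresponding evolution time.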

Paper Structure

This paper contains 28 sections, 4 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: a) Regular NMR experiment. A 2D NMR resonance map in the time-time domain $(t_e,t_a)$ is produced by recording the response of nuclei subjected to two-pulse sequences. Each row is obtained with a different time delay between the two pulses (the evolution time, $t_e$), and $t_a$ represents the acquisition time. This map is first transformed into a frequency-time representation via a 1D Fourier Transform along the acquisition time axis: $(t_e,t_a) \to (t_e,f_a)$. A second 1D Fourier Transform along the evolution time axis produces the final spectrum: $(t_e,f_a) \to (f_e,f_a)$. b) Non-uniform sampling (NUS). Unlike the regular NMR experiment, NUS acquires only a subset of the evolution times (Row-wise Scan). After a 1D Fourier Transform along the acquisition time axis, existing methods such as Compressed Sensing [kazimierczuk2011accelerated] and Low-Rank Approximation [qu2015accelerated] are applied to reconstruct the frequency-time domain (Row Inpainting). c) Guided sampling workflow. Unlike NUS, we propose a novel dynamic sampling strategy that leverages the uncertainty of the diffusion model. This is the first attempt to use a diffusion model during NMR acquisition. The novelties are highlighted in blue text.
  • Figure 2: Repaint pipeline for the reconstruction of NMR data. Unlike the conventional denoising process, at each step we sample the known rows (top) from the input and reconstruct the inpainted rows from the UNet output (bottom).
  • Figure 3: Increasing the number of denoising timesteps (ts) improves the global metrics (MSE, $R^2$) and lowers the hallucination ratio, but increases the number of missed peaks.
  • Figure 4: An iteration step of 2% (pcps 2%) optimizes the global reconstruction metrics and the hallucinated peaks at all masking levels.
  • Figure 5: Comparison from 2% to 50% of acquisition completion. Cross-sampling comparison: Guided Sampling approach (GS) compared with uniform sampling and Poisson-Gap. Cross-model comparison: GS compared with Compressed Sensing and Low-Rank Approximation. GS200 and GS400 correspond to the same Guided Sampling strategy with 200 and 400 inference time steps, respectively.
  • ...and 5 more figures
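
The two-step transform and row-wise masking described in the captions of Figures 1 and 2 can be illustrated with a short NumPy sketch. It is a simplified illustration under stated assumptions: the signal `fid` is synthetic random data, the mask is drawn uniformly at random, `repaint_merge` is a hypothetical helper name, and real NMR processing steps (apodization, zero-filling, phase correction, quadrature detection) are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n_te, n_ta = 32, 64  # number of evolution times (rows) and acquisition points (columns)

# Synthetic complex time-time map (t_e, t_a); a real map would come from the spectrometer.
fid = rng.normal(size=(n_te, n_ta)) + 1j * rng.normal(size=(n_te, n_ta))

# Step 1: 1D Fourier Transform along the acquisition axis, (t_e, t_a) -> (t_e, f_a).
freq_time = np.fft.fft(fid, axis=1)

# Step 2: 1D Fourier Transform along the evolution axis, (t_e, f_a) -> (f_e, f_a).
spectrum = np.fft.fft(freq_time, axis=0)

# Non-uniform sampling: keep only a subset of evolution times (rows).
keep = np.zeros(n_te, dtype=bool)
keep[rng.choice(n_te, size=8, replace=False)] = True
masked_freq_time = np.where(keep[:, None], freq_time, 0)

def repaint_merge(known_rows, model_output, keep):
    """Row-wise merge in the spirit of Figure 2 (hypothetical helper):
    acquired rows come from the input, missing rows from the model output."""
    return np.where(keep[:, None], known_rows, model_output)
```

Because the first transform acts on each row independently, masking rows before or after Step 1 gives the same frequency-time map, which is why reconstruction can be posed as row inpainting in the $(t_e,f_a)$ domain.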