The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA

Zien Ma, S. M. Shermer, Oktay Karakuş, Frank C. Langbein

TL;DR

This work investigates and validates deep learning for quantifying complex, low-SNR, overlapping signals from MEGA-PRESS spectra. It devises a convolutional neural network (CNN) and a Y-shaped autoencoder (YAE) and selects the best models via Bayesian optimisation on 10,000 simulated spectra from slice-profile-aware MEGA-PRESS simulations.

Abstract

Magnetic resonance spectroscopy (MRS) is used to quantify metabolites in vivo and estimate biomarkers for conditions ranging from neurological disorders to cancers. Quantifying low-concentration metabolites such as GABA ($γ$-aminobutyric acid) is challenging due to low signal-to-noise ratio (SNR) and spectral overlap. We investigate and validate deep learning for quantifying complex, low-SNR, overlapping signals from MEGA-PRESS spectra, devise a convolutional neural network (CNN) and a Y-shaped autoencoder (YAE), and select the best models via Bayesian optimisation on 10,000 simulated spectra from slice-profile-aware MEGA-PRESS simulations. The selected models are trained on 100,000 simulated spectra. We validate their performance on 144 spectra from 112 experimental phantoms containing five metabolites of interest (GABA, Glu, Gln, NAA, Cr) with known ground truth concentrations across solution and gel series acquired at 3 T under varied bandwidths and implementations. These models are further assessed against the widely used LCModel quantification tool. On simulations, both models achieve near-perfect agreement (small MAEs; regression slopes $\approx 1.00$, $R^2 \approx 1.00$). On experimental phantom data, errors initially increased substantially. However, modelling variable linewidths in the training data significantly reduced this gap. The best augmented deep learning models achieved a mean MAE for GABA over all phantom spectra of 0.151 (YAE) and 0.160 (FCNN) in max-normalised relative concentrations, outperforming the conventional baseline LCModel (0.220). A sim-to-real gap remains, but physics-informed data augmentation substantially reduced it. Phantom ground truth is needed to judge whether a method will perform reliably on real data.
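
The GABA errors quoted above are mean absolute errors (MAE) in max-normalised relative concentrations. A minimal sketch of how such a metric could be computed follows. It assumes that "max-normalised" means each concentration vector is divided by its largest component; the function names and all numbers are hypothetical illustrations, not the authors' code or data.

```python
import numpy as np

# Metabolite order as listed in the abstract.
METABOLITES = ["GABA", "Glu", "Gln", "NAA", "Cr"]


def max_normalise(conc):
    """Scale each concentration vector so its largest entry equals 1.

    Assumption: this is what "max-normalised relative concentrations" means.
    """
    conc = np.asarray(conc, dtype=float)
    return conc / conc.max(axis=-1, keepdims=True)


def per_metabolite_mae(true_conc, pred_conc):
    """MAE per metabolite, averaged over spectra, after max-normalisation."""
    t = max_normalise(true_conc)
    p = max_normalise(pred_conc)
    return np.abs(t - p).mean(axis=0)


# Hypothetical ground truth and predictions for two spectra (arbitrary units).
true_conc = np.array([[1.0, 4.0, 2.0, 8.0, 4.0],
                      [2.0, 2.0, 1.0, 4.0, 4.0]])
pred_conc = np.array([[1.6, 4.0, 2.0, 8.0, 4.0],
                      [2.0, 2.0, 1.0, 4.0, 4.0]])

mae = per_metabolite_mae(true_conc, pred_conc)
for name, value in zip(METABOLITES, mae):
    print(f"{name}: {value:.4f}")
```

Because both vectors are rescaled before comparison, the metric is insensitive to a global scaling of the predicted concentrations and only measures errors in the relative composition.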

Paper Structure (36 sections, 7 equations, 9 figures, 13 tables)

Figures (9)

  • Figure 1: The sequence diagram (a) shows the RF and gradient pulses with the actual pulse shapes and timings used in the simulations. The initial excitation pulse is modelled as an ideal (instantaneous) slice-selective $90^\circ$ pulse, so the corresponding slice selection gradient $G_z$ is omitted. The excitation pulse excites a slice of thickness $3\,\mathrm{cm}$ perpendicular to the $z$-axis. The refocusing pulses refocus the magnetisation of $3\,\mathrm{cm}$ thick slabs perpendicular to the $x$- and $y$-axes, respectively, to define the localised voxel. What differentiates the MEGA-PRESS sequence from the standard PRESS sequence is the presence of two $20\,\mathrm{ms}$ frequency-selective Gaussian editing pulses (yellow and green) at $1.9\,\mathrm{ppm}$ for the ON acquisition. For the OFF spectra the editing pulses could in principle be omitted, but the simulation follows the experimental implementation, where editing pulses at $7.5\,\mathrm{ppm}$, which have no effect on the metabolites of interest, are applied instead. The readout of the signal starts at $68\,\mathrm{ms}$ as indicated. Experimentally, it can last over $1\,\mathrm{s}$, depending on the dwell time and the number of samples acquired. For a bandwidth of $2000\,\mathrm{Hz}$, the dwell time is $0.5\,\mathrm{ms}$, so acquiring $N=2048$ samples, a typical signal length, requires $2048 \times 0.5\,\mathrm{ms} = 1.024\,\mathrm{s}$. The readout block in the diagram is truncated at $80\,\mathrm{ms}$ for clarity, to show the RF pulses and timings. To account for imperfect slice profiles of the refocusing pulses, the spectra are simulated on a spatial grid (b) and the average over all positions is calculated.
  • Figure 2: Illustration of the Y-shaped autoencoder (YAE) architecture. The input consists of one or more noisy MEGA-PRESS spectra (OFF, ON, DIFF using real, imaginary or magnitude representations). The encoder maps the input to a compressed latent representation. The decoder branch reconstructs denoised versions of the input spectra from this latent space. The quantifier branch predicts the metabolite concentrations from the same latent representation. Key components are highlighted: ① hidden layer activation function, ② dropout layer, ③ decoder output activation function, and ④ quantifier output activation function.
  • Figure 3: Validation and training concentration MAE for the top $25$ of $432$ CNN configurations from the grid search model selection (Table \ref{tab:cnn-simple-all}). Each pair of horizontal bars corresponds to one configuration: blue, validation MAE; red, training MAE. Error bars show standard deviation across five-fold cross-validation. Configurations are ordered by validation MAE (best at top). Dataset: $10{,}000$ simulated spectra, sum normalisation, basis set linewidth $2\,\mathrm{Hz}$, $100$ epochs.
  • Figure 4: Performance of $50$ repetitions of the Bayesian optimisation for the CNN model selection over the Bayesian optimisation iterations. The grey shaded area (right axis) shows the percentage of runs that selected the same best configuration as the full grid search ("converged").
  • Figure 5: Validation and training concentration MAE for YAE configurations from the final joint optimisation (Stage 3). Each pair of horizontal bars corresponds to one configuration: blue, validation MAE; red, training MAE. Error bars show standard deviation across five-fold cross-validation. Configurations are ordered by validation MAE (best at top). Dataset: $10{,}000$ simulated spectra, sum normalisation, $200$ epochs (Table \ref{tab:refined_space}).
  • ...and 4 more figures
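
The readout timing in the Figure 1 caption follows from the dwell time being the reciprocal of the spectral bandwidth. The short sketch below reproduces that arithmetic with the caption's values (bandwidth 2000 Hz, 2048 samples, readout start at 68 ms). The function and variable names are hypothetical and chosen only for illustration.

```python
def dwell_time_s(bandwidth_hz: float) -> float:
    """Sampling interval in seconds: the reciprocal of the spectral bandwidth."""
    return 1.0 / bandwidth_hz


def readout_duration_s(n_samples: int, bandwidth_hz: float) -> float:
    """Total acquisition time for n_samples points at the given bandwidth."""
    return n_samples * dwell_time_s(bandwidth_hz)


# Values taken from the Figure 1 caption.
BANDWIDTH_HZ = 2000.0
N_SAMPLES = 2048
READOUT_START_S = 0.068  # readout begins at 68 ms

dwell = dwell_time_s(BANDWIDTH_HZ)
duration = readout_duration_s(N_SAMPLES, BANDWIDTH_HZ)
readout_end = READOUT_START_S + duration

print(f"dwell time:       {dwell * 1e3:.3f} ms")
print(f"readout duration: {duration:.3f} s")
print(f"readout ends at:  {readout_end:.3f} s after excitation")
```

Halving the bandwidth doubles the dwell time and therefore the readout duration for the same number of samples, which is why the caption notes that the experimental readout length depends on both quantities.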