Gaussian Process Emulators for Few-Shot Segmentation in Cardiac MRI

Bruno Viti, Franz Thaler, Kathrin Lisa Kapper, Martin Urschler, Martin Holler, Elias Karabelas

TL;DR

This work tackles data scarcity and orientation generalization in cardiac MRI segmentation by integrating Gaussian Process Emulators (GPEs) with a U-Net architecture for few-shot, multi-label segmentation. The model encodes query and support images into a latent space, learns a mapping from support features to masks via GPEs with a squared exponential kernel, and incorporates the resulting posterior mean into the decoder alongside query features, using skip-connections at multiple levels. Evaluated on the M&Ms-2 dataset with training on short-axis (SA) slices and testing on long-axis (LA) orientations, the method achieves higher Dice scores than state-of-the-art unsupervised and few-shot baselines, particularly in 1–2 shot settings, with improvements further amplified as the support set grows. The approach reduces labeling needs and enables adaptable segmentation across cardiac orientations, with future work focusing on uncertainty quantification and conditional variance within the GPE component.
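To make the GPE step concrete, the following is a minimal NumPy sketch of a Gaussian process posterior mean under a squared exponential kernel. It is not the authors' implementation: the function names, the zero prior mean, the fixed `length_scale`, `variance`, and `noise` values, and the flattened feature shapes are all assumptions made for illustration.

```python
import numpy as np


def squared_exponential_kernel(a, b, length_scale=1.0, variance=1.0):
    """k(a, b) = variance * exp(-||a - b||^2 / (2 * length_scale^2)).

    a has shape (n, d), b has shape (m, d); the result has shape (n, m).
    """
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return variance * np.exp(-0.5 * sq_dists / length_scale**2)


def gp_posterior_mean(x_support, y_support, x_query, noise=1e-4, **kernel_kwargs):
    """Posterior mean of a zero-mean GP at x_query, conditioned on the support set.

    x_support: (n_support, d) latent features of the support images (assumed flattened).
    y_support: (n_support, p) latent features of the support masks (assumed flattened).
    x_query:   (n_query, d) latent features of the query images.
    """
    k_ss = squared_exponential_kernel(x_support, x_support, **kernel_kwargs)
    k_qs = squared_exponential_kernel(x_query, x_support, **kernel_kwargs)
    # Add observation noise (jitter) on the diagonal for numerical stability.
    chol = np.linalg.cholesky(k_ss + noise * np.eye(len(x_support)))
    # Solve (K_ss + noise * I) alpha = y_support using the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_support))
    return k_qs @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_s = rng.normal(size=(2, 16))   # hypothetical 2-shot support features
    y_s = rng.normal(size=(2, 16))   # hypothetical encoded support masks
    x_q = rng.normal(size=(1, 16))   # hypothetical query feature
    print(gp_posterior_mean(x_s, y_s, x_q).shape)
```

In the method summarized above, a quantity of this kind would be computed from the encoded support images and masks and then passed to the decoder together with the query features.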

Abstract

Segmentation of cardiac magnetic resonance images (MRI) is crucial for the analysis and assessment of cardiac function, helping to diagnose and treat various cardiovascular diseases. Most recent techniques rely on deep learning and usually require a large amount of labeled data. Few-shot learning can reduce this dependency on labeled data. In this work, we introduce a new method that combines few-shot learning with a U-Net architecture and Gaussian Process Emulators (GPEs), improving how information from a support set is integrated into the prediction. GPEs are trained to learn the relation between the support images and the corresponding masks in latent space, facilitating the segmentation of unseen query images given only a small labeled support set at inference. We test our model on the public M&Ms-2 dataset to assess its ability to segment the heart in cardiac MRI acquired in different orientations, and compare it with state-of-the-art unsupervised and few-shot methods. Our architecture achieves higher DICE coefficients than these methods, especially in the more challenging setups where the support set is very small.
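Since the comparison rests on DICE coefficients, here is a short sketch of how a per-label DICE score can be computed for integer-coded multi-label masks. The function name `dice_per_label`, the convention that label 0 is background, and the choice to return NaN for labels absent from both masks are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def dice_per_label(pred, target, num_labels):
    """DICE = 2 * |P ∩ T| / (|P| + |T|), computed separately for each foreground label.

    pred, target: integer arrays of the same shape; label 0 is treated as background.
    Returns an array of length num_labels - 1 (labels 1 .. num_labels - 1).
    """
    scores = []
    for label in range(1, num_labels):
        p = pred == label
        t = target == label
        denom = p.sum() + t.sum()
        if denom == 0:
            # Label absent from both masks: the score is undefined.
            scores.append(np.nan)
        else:
            scores.append(2.0 * np.logical_and(p, t).sum() / denom)
    return np.array(scores)


if __name__ == "__main__":
    pred = np.array([[0, 1, 1], [2, 2, 0]])
    target = np.array([[0, 1, 0], [2, 2, 2]])
    # label 1: 2 * 1 / (2 + 1) = 0.667, label 2: 2 * 2 / (2 + 3) = 0.8
    print(dice_per_label(pred, target, num_labels=3))
```

Averaging these per-label scores over test slices, and ignoring NaN entries, gives summary values of the kind reported as percentages.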

Paper Structure

This paper contains 13 sections, 6 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Overview of a training episode of the proposed method. The encoder $E_{\chi}$ encodes the MRIs of the query and the support set, while $E_{\Upsilon}$ encodes the support masks. We train the GPE with the support's features, $\mathbf{x}_\mathrm{S}^d$ and $\mathbf{y}_\mathrm{S}^d$. Then, given a new point $\mathbf{x}_\mathrm{Q}^d$, we infer the mean $\bm{\mu}_{\mathrm{Q}\vert \mathrm{S}}^d$. This additional information is then passed to the decoder $D_{\zeta}$ similar to a skip-connection, and the mask of the query input image is predicted.
  • Figure 2: Average DICE scores ($\%$) $\pm$ one standard deviation of our model for an increasing support set from 1 to 10 images.
  • Figure 3: Qualitative results of our method with different accuracies. On the left, we present a prediction that satisfies setting 1 but not settings 2 and 3, in comparison to the corresponding ground truth. In the center, a prediction meets setting 2 but not setting 3. On the right, the prediction fulfills setting 3.
  • Figure 4: Qualitative results of our method for an LA slice under 1-Shot (first row) and 2-Shot (second row). From left to right we show: MRI to segment, the support set used in the test phase, nnU-Net prediction, our prediction, ground truth.
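
The inference step described in the Figure 1 caption can be written with the standard Gaussian process conditioning formula. The version below is a sketch under assumptions that the text here does not confirm: a zero prior mean, an observation-noise variance $\sigma^2$, a kernel length scale $\ell$, and the reading of the superscript $d$ as an index over encoder levels. The paper's own equations may differ in these details.

```latex
\bm{\mu}_{\mathrm{Q}\vert\mathrm{S}}^d
  = K\!\left(\mathbf{x}_\mathrm{Q}^d, \mathbf{x}_\mathrm{S}^d\right)
    \left[ K\!\left(\mathbf{x}_\mathrm{S}^d, \mathbf{x}_\mathrm{S}^d\right) + \sigma^2 I \right]^{-1}
    \mathbf{y}_\mathrm{S}^d,
\qquad
k(\mathbf{x}, \mathbf{x}')
  = \exp\!\left( -\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^2}{2\ell^2} \right)
```

Here $K(\cdot,\cdot)$ denotes the matrix of squared exponential kernel values $k$ between the rows of its two arguments, so the query mean is a kernel-weighted combination of the encoded support masks.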