Non-Adversarial Learning: Vector-Quantized Common Latent Space for Multi-Sequence MRI

Luyi Han, Tao Tan, Tianyu Zhang, Xin Wang, Yuan Gao, Chunyao Lu, Xinglong Liang, Haoran Dou, Yunzhi Huang, Ritse Mann

TL;DR

The paper addresses missing MRI sequences and the instability of GAN-based synthesis by introducing a non-adversarial framework built on a vector-quantized common (VQC) latent space for multi-sequence MRI. It uses a VQ-VAE to obtain discrete latents $z_q$, estimates a Gaussian VQC latent space with mean $\mu_q(\mathcal{X})$ and variance $\sigma^2_q(\mathcal{X})$, and samples $z_s$ via $z_s = \mu_q(\mathcal{X}) + \epsilon \cdot \sigma^2_q(\mathcal{X})$ for generation through a dynamic Seq2Seq decoder. The approach combines uncertainty estimation, latent-space consistency enforced with a contrastive loss, and random domain augmentation to enable robust cross-sequence generation without adversarial training. On BraTS2021, the method outperforms GAN-based baselines, is robust to noise and bias fields, and shows promising one-shot segmentation capability. The code is publicly available.
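The quantize, estimate, and sample steps described above can be illustrated with a minimal pure-Python sketch. This is a toy under stated assumptions, not the authors' implementation: the codebook values, the encoder outputs, and the helper names (`quantize`, `gaussian_stats`, `sample_latent`) are hypothetical, and computing the mean and variance element-wise across sequences is an assumption about how $\mu_q(\mathcal{X})$ and $\sigma^2_q(\mathcal{X})$ are obtained. The sampling line follows the formula as written in the summary, which scales the noise by the variance; the more common reparameterization scales by the standard deviation.

```python
import random


def quantize(vector, codebook):
    """Return the codebook entry nearest to `vector` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(codebook, key=lambda code: sq_dist(vector, code))


def gaussian_stats(latents):
    """Element-wise mean and (population) variance across a list of latent vectors.

    Assumption: statistics are taken across the quantized latents of the
    available sequences; the summary does not spell this out.
    """
    n = len(latents)
    dim = len(latents[0])
    mean = [sum(z[d] for z in latents) / n for d in range(dim)]
    var = [sum((z[d] - mean[d]) ** 2 for z in latents) / n for d in range(dim)]
    return mean, var


def sample_latent(mean, var, rng):
    """Draw z_s = mu + eps * sigma^2 with eps ~ N(0, 1), as written in the summary."""
    return [m + rng.gauss(0.0, 1.0) * v for m, v in zip(mean, var)]


# Toy data (hypothetical): a 3-entry codebook with 3-dimensional codes,
# and one encoder output per MRI sequence.
codebook = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 0.0, 1.0]]
encoder_outputs = {
    "T1": [0.1, -0.1, 0.2],
    "T2": [0.9, 1.2, 0.8],
    "Flair": [1.1, 0.9, 1.0],
}

# Discrete latents z_q: snap each encoder output to its nearest code.
z_q = {name: quantize(vec, codebook) for name, vec in encoder_outputs.items()}

# Gaussian estimate of the common latent space, then one sample z_s.
mu_q, var_q = gaussian_stats(list(z_q.values()))
z_s = sample_latent(mu_q, var_q, random.Random(0))
```

In a real model the latents would be spatial feature maps and `z_s` would be passed to the decoder; the sketch only shows how a shared Gaussian can be summarized from per-sequence discrete codes.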

Abstract

Adversarial learning helps generative models translate MRI from a source sequence to a target sequence when paired samples are lacking. However, implementing MRI synthesis with adversarial learning in clinical settings is challenging due to training instability and mode collapse. To address this issue, we leverage intermediate sequences to estimate the common latent space among multi-sequence MRI, enabling the reconstruction of distinct sequences from the common latent space. We propose a generative model that compresses discrete representations of each sequence to estimate the Gaussian distribution of the vector-quantized common (VQC) latent space between multiple sequences. Moreover, we improve latent space consistency with contrastive learning and increase model stability through domain augmentation. Experiments on the BraTS2021 dataset show that our non-adversarial model outperforms GAN-based methods, and that the VQC latent space helps our model achieve (1) anti-interference ability, which can eliminate the effects of noise, bias fields, and artifacts, and (2) solid semantic representation ability, with the potential for one-shot segmentation. Our code is publicly available.

Paper Structure (24 sections, 8 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Overview of the proposed VQ-Seq2Seq framework.
  • Figure 2: Synthesis performance of VQ-Seq2Seq with different latent dimensions ($D$) and embedding dimensions ($K$). (a) Synthesis performance with different latent dimensions ($K=256$); (b) Synthesis performance with different embedding dimensions ($D=3$).
  • Figure 3: Visualization of translating T1 to T1Gd, T2, and Flair with a single step.
  • Figure 4: Visualization of reconstruction from input images with artifacts, noise, and bias field. Because the artifacts are present in the original images, no target image is available for that case.
  • Figure 5: One-shot segmentation performance of VQ-Seq2Seq with different embedding dimensions ($K$).