Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes

Aghiles Kebaili, Jérôme Lapuyade-Lahorgue, Pierre Vera, Su Ruan

TL;DR

This work tackles data scarcity in medical tumor segmentation by introducing a discriminative regularized Hamiltonian VAE (dHVAE) that jointly models image and tumor mask distributions as $p_\theta(x,m|z)$ and synthesizes image–mask pairs in a single pass. It combines a perceptual and pixel-wise feature reconstruction loss with a small-weight adversarial regularization, enabling realistic, diverse samples while preserving mode coverage in limited-data settings. The method employs a slice-by-slice 2D-to-3D augmentation strategy within a four-block encoder–decoder HVAE architecture and demonstrates significant improvements in downstream Dice scores on BRATS (MRI) and HECKTOR (PET) datasets compared to traditional augmentation and several generative baselines. This approach offers a practical, data-efficient path to improve tumor segmentation in clinical scenarios where annotated data are scarce, with potential extensions to quantum-inspired latent-density formulations.
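Since the downstream results are reported as Dice scores, a minimal sketch of that metric may help readers unfamiliar with it. It uses the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_score` and the convention that two empty masks score 1.0 are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    if denom == 0:
        return 1.0
    return float(2.0 * intersection / denom)

# Toy example: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
score = dice_score(pred, target)
```

In this toy case the overlap is 2 pixels and each mask has 3 foreground pixels, so the score is 4/6.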

Abstract

Deep learning has gained significant attention in medical image segmentation. However, the limited availability of annotated training data presents a challenge to achieving accurate results. In efforts to overcome this challenge, data augmentation techniques have been proposed. However, the majority of these approaches primarily focus on image generation. For segmentation tasks, providing both images and their corresponding target masks is crucial, and the generation of diverse and realistic samples remains a complex task, especially when working with limited training datasets. To this end, we propose a new end-to-end hybrid architecture based on Hamiltonian Variational Autoencoders (HVAE) and a discriminative regularization to improve the quality of generated images. Our method provides an accurate estimation of the joint distribution of the images and masks, resulting in the generation of realistic medical images with reduced artifacts and off-distribution instances. As generating 3D volumes requires substantial time and memory, our architecture operates on a slice-by-slice basis to segment 3D volumes, capitalizing on the richly augmented dataset. Experiments conducted on two public datasets, BRATS (MRI modality) and HECKTOR (PET modality), demonstrate the efficacy of our proposed method on different medical imaging modalities with limited data.
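The slice-by-slice idea can be illustrated with a short sketch that turns a 3D volume and its mask into two-channel 2D training pairs. This is only an illustration under stated assumptions: the helper name `volume_to_slice_pairs`, the choice of axis 0 as the slice axis, and the option to drop slices without tumor are hypothetical and are not described in the abstract.

```python
import numpy as np

def volume_to_slice_pairs(volume, mask, keep_empty=False):
    """Split a 3D volume and its mask into 2-channel 2D slice pairs.

    volume, mask: arrays of shape (D, H, W); axis 0 is assumed to be
    the slice axis. Returns an array of shape (N, 2, H, W), where
    channel 0 is the image slice and channel 1 is the mask slice.
    """
    volume = np.asarray(volume, dtype=np.float32)
    mask = np.asarray(mask, dtype=np.float32)
    if volume.shape != mask.shape:
        raise ValueError("volume and mask must have the same shape")
    # Concatenate image and mask channel-wise for every slice.
    pairs = np.stack([volume, mask], axis=1)  # (D, 2, H, W)
    if not keep_empty:
        # Assumed filtering step: keep only slices that contain tumor.
        has_tumor = mask.reshape(mask.shape[0], -1).any(axis=1)
        pairs = pairs[has_tumor]
    return pairs

# Toy volume with 4 slices, of which 2 contain a small "tumor".
rng = np.random.default_rng(0)
toy_volume = rng.random((4, 8, 8))
toy_mask = np.zeros((4, 8, 8))
toy_mask[1, 2:4, 2:4] = 1
toy_mask[2, 3, 3] = 1
slice_pairs = volume_to_slice_pairs(toy_volume, toy_mask)
all_pairs = volume_to_slice_pairs(toy_volume, toy_mask, keep_empty=True)
```

Each resulting pair has the same layout as the concatenated image and mask input that a 2D generative model of this kind would consume.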

Paper Structure (23 sections, 15 equations, 6 figures, 6 tables)

This paper contains 23 sections, 15 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Proposed end-to-end architecture consisting of an encoder $q_\phi$ and decoder $p_\theta$ network, a discriminative regularizer $D_\theta$, and a pre-trained 16-layer VGG-based architecture. It takes the medical image and its corresponding tumor mask, denoted $\{\mathbf{x}, m\}$, as a concatenated multi-channel input and reconstructs them as $\{\mathbf{x}_r, m_r\}$. New image–mask pairs are generated by feeding the decoder a random Gaussian noise vector $z \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$. Further details about the loss computation module can be found in Section 3.
  • Figure 2: Detailed illustration of our proposed Hamiltonian autoencoding architecture for medical image and mask generation. More details are specified in section 3.6.
  • Figure 3: Comparison of generated images and masks using our method on the BRATS dataset with varying hyperparameter $\beta$. To show image quality and blurriness more clearly, we provide a magnified view of the tumor borders. Each column represents an example of a generated MRI image and its corresponding mask.
  • Figure 4: Comparison of images generated by our proposed method and other generative models on the BRATS and HECKTOR datasets. Each column represents an example of a synthetic image produced for each dataset. A zoomed-in rectangle gives a closer view of the texture and fine details of the tumoral regions.
  • Figure 5: Illustration of the augmentation pipeline for a generative-model-based data augmentation. Adapted from kebaili2023deep
  • ...and 1 more figures
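The generation step described in the Figure 1 caption (feeding the decoder a noise vector $z \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$ and obtaining an image together with its mask) can be sketched as follows. This is a toy illustration, not the paper's network: `make_toy_decoder` is a hypothetical stand-in (a fixed random linear map followed by a sigmoid), and the latent size, image size, and 0.5 mask threshold are all assumptions.

```python
import numpy as np

def sample_pairs(decoder, n_samples, latent_dim, rng, mask_threshold=0.5):
    """Draw z ~ N(0, I), decode, and split the output into image and mask."""
    z = rng.standard_normal((n_samples, latent_dim))
    out = decoder(z)  # expected shape: (n_samples, 2, H, W)
    images = out[:, 0]
    # Assumed post-processing: binarize the mask channel at a fixed threshold.
    masks = (out[:, 1] > mask_threshold).astype(np.uint8)
    return images, masks

def make_toy_decoder(latent_dim, height, width, rng):
    """Hypothetical stand-in for a trained decoder p_theta(x, m | z)."""
    weights = rng.standard_normal((latent_dim, 2 * height * width))

    def decoder(z):
        logits = z @ weights
        probs = 1.0 / (1.0 + np.exp(-logits))  # squash outputs into (0, 1)
        return probs.reshape(z.shape[0], 2, height, width)

    return decoder

rng = np.random.default_rng(0)
decoder = make_toy_decoder(latent_dim=16, height=8, width=8, rng=rng)
images, masks = sample_pairs(decoder, n_samples=4, latent_dim=16, rng=rng)
```

Because both channels come from a single decoder pass, every synthetic image is produced together with a matching mask, which is what makes such samples directly usable as segmentation training pairs.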