End-to-end autoencoding architecture for the simultaneous generation of medical images and corresponding segmentation masks

Aghiles Kebaili; Jérôme Lapuyade-Lahorgue; Pierre Vera; Su Ruan

TL;DR

The paper tackles data scarcity in medical image segmentation by proposing an end-to-end Hamiltonian Variational Autoencoder (HVAE) that jointly generates medical images and tumor masks. It pairs a Hamiltonian Monte Carlo–based posterior sampling framework with a joint ELBO over the image–mask pair, enabling realistic paired synthesis. On BRATS and HECKTOR, HVAE-based augmentation outperforms a vanilla VAE and LSGAN in data-scarce regimes, with a higher Dice similarity coefficient (DSC) and better image quality (PSNR/SSIM), particularly with about 300 synthetic samples. The results indicate that end-to-end joint generation can meaningfully improve segmentation performance in limited-data settings, and the authors point to hybridization with adversarial learning and exploration of latent-space geometry as future work.
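
To make the idea of a joint ELBO concrete, the standard VAE bound can be written over the pair $\{\mathbf{x}, m\}$ rather than over the image alone. This is only a sketch of the generic VAE form, assuming a standard normal prior $p(z)$; the paper's HVAE objective additionally involves Hamiltonian dynamics, which is not reproduced here.

$$
\log p_\theta(\mathbf{x}, m) \;\geq\; \mathbb{E}_{q_\phi(z \mid \mathbf{x}, m)}\big[\log p_\theta(\mathbf{x}, m \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid \mathbf{x}, m) \,\|\, p(z)\big)
$$

The first term rewards faithful reconstruction of both the image and its mask from the same latent code $z$, which is what ties the two outputs together at generation time.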

Abstract

Despite the increasing use of deep learning in medical image segmentation, acquiring sufficient training data remains a challenge in the medical field. In response, data augmentation techniques have been proposed; however, the generation of diverse and realistic medical images and their corresponding masks remains a difficult task, especially when working with insufficient training sets. To address these limitations, we present an end-to-end architecture based on the Hamiltonian Variational Autoencoder (HVAE). This approach yields an improved posterior distribution approximation compared to traditional Variational Autoencoders (VAE), resulting in higher image generation quality. Our method outperforms generative adversarial architectures under data-scarce conditions, showcasing enhancements in image quality and precise tumor mask synthesis. We conduct experiments on two publicly available datasets, MICCAI's Brain Tumor Segmentation Challenge (BRATS), and Head and Neck Tumor Segmentation Challenge (HECKTOR), demonstrating the effectiveness of our method on different medical imaging modalities.

Paper Structure (9 sections, 4 equations, 2 figures, 2 tables)

Introduction
Proposed method
The Hamiltonian VAE
Modeling the joint distribution for the simultaneous medical images and masks generation
Experiments
Datasets
Training settings
Evaluation results
Conclusion

Figures (2)

Figure 1: The proposed architecture for medical image segmentation consists of an encoder $q_\phi$ and a decoder $p_\theta$ network. It takes the medical image and its corresponding tumor mask, denoted $\{\mathbf{x}, m\}$, as a concatenated multi-channel input and reconstructs them as $\{\mathbf{x}_r, m_r\}$ during the training phase. New image–mask pairs, denoted $\{\mathbf{x}_g, m_g\}$, are produced at inference by feeding the decoder a random Gaussian noise vector $z \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$.
Figure 2: Comparison of images generated by two state-of-the-art methods and by our proposed methods, with four examples shown in columns (the last row shows HVAE outputs with the generated tumor masks).
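
The data flow described in the Figure 1 caption can be sketched in a few lines of NumPy. This is an illustration only, not the authors' implementation: `toy_decoder` is a hypothetical linear stand-in for $p_\theta$, and the array sizes, latent dimension, and 0.5 mask threshold are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pair_input(image, mask):
    """Stack image channels (C, H, W) and a mask (H, W) into one (C+1, H, W) array."""
    return np.concatenate([image, mask[None, ...]], axis=0)

def toy_decoder(z, weights, out_shape):
    """Hypothetical stand-in for p_theta: a linear map followed by a sigmoid."""
    logits = weights @ z
    return (1.0 / (1.0 + np.exp(-logits))).reshape(out_shape)

def split_pair_output(output, threshold=0.5):
    """Split a decoded array into image channels and a binarized mask.

    The threshold value is an assumption made for this sketch.
    """
    image = output[:-1]
    mask = (output[-1] > threshold).astype(np.uint8)
    return image, mask

# Training-time input {x, m}: a 1-channel 32x32 image and its binary mask.
x = rng.random((1, 32, 32))
m = (rng.random((32, 32)) > 0.9).astype(np.float64)
pair = make_pair_input(x, m)  # shape (2, 32, 32)

# Inference: sample z ~ N(0, I) and decode it into a new pair {x_g, m_g}.
latent_dim = 16  # assumed size, not taken from the paper
weights = rng.normal(size=(pair.size, latent_dim))
z = rng.standard_normal(latent_dim)
x_g, m_g = split_pair_output(toy_decoder(z, weights, pair.shape))
```

Because the image and the mask come out of a single decoder pass over the same `z`, they are generated as a matched pair rather than by two separate models.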
