Unsupervised Domain Transfer with Conditional Invertible Neural Networks

Kris K. Dreher, Leonardo Ayala, Melanie Schellenberg, Marco Hübner, Jan-Hinrich Nölke, Tim J. Adler, Silvia Seidlitz, Jan Sellner, Alexander Studier-Fischer, Janek Gröhl, Felix Nickel, Ullrich Köthe, Alexander Seitel, Lena Maier-Hein

TL;DR

The paper addresses the domain shift between physics-based simulations and real spectral data in medical imaging. It introduces a sim-to-real transfer method based on conditional invertible neural networks (cINNs) that enforces cycle-consistency and enables maximum likelihood training. The method is applied to two modalities, photoacoustic tomography (PAT) and hyperspectral imaging (HSI), demonstrating reduced spectral domain gap and improved downstream classification (artery/vein in PAT and organ in HSI) compared with UNIT and simulated data. The results indicate that cINN-based domain transfer can produce more realistic synthetic spectral data and enhance training for downstream tasks when labeled real data are scarce, with potential applicability beyond spectral imaging.

Abstract

Synthetic medical image generation has evolved as a key technique for neural network training and validation. A core challenge, however, remains in the domain gap between simulations and real data. While deep learning-based domain transfer using Cycle Generative Adversarial Networks and similar architectures has led to substantial progress in the field, there are use cases in which state-of-the-art approaches still fail to generate training images that produce convincing results on relevant downstream tasks. Here, we address this issue with a domain transfer approach based on conditional invertible neural networks (cINNs). As a particular advantage, our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood training. To showcase our method's generic applicability, we apply it to two spectral imaging modalities at different scales, namely hyperspectral imaging (pixel-level) and photoacoustic tomography (image-level). According to comprehensive experiments, our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks (binary and multi-class). cINN-based domain transfer could thus evolve as an important method for realistic synthetic data generation in the field of spectral imaging and beyond.

Paper Structure

This paper contains 6 sections, 4 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Pipeline for data-driven spectral image analysis in the absence of labeled reference data. A physics-based simulation framework generates simulated spectral images with corresponding reference labels (e.g., tissue type or oxygenation (sO$_2$)). Our domain transfer method based on cINNs leverages unlabeled real data to increase their realism. The domain-transferred data can then be used for supervised training of a downstream task (e.g. classification).
  • Figure 2: Proposed architecture based on cINNs. The invertible architecture transfers both simulated and real data into a shared latent space (right). By conditioning on the domain D (bottom), a latent vector can be transferred to either the simulated or the real domain (left), for which the discriminators $\text{Dis}_\text{sim}$ and $\text{Dis}_\text{real}$ calculate the losses for adversarial training.
  • Figure 3: Training data used for the validation experiments. For PAT, 960 real images from 30 volunteers were acquired. For HSI, more than six million spectra corresponding to 460 images and 20 individuals were used. The tissue labels for PAT correspond to 2D semantic segmentations, whereas the tissue labels for HSI represent 10 different organs. For PAT, $\sim$1600 images were simulated, whereas around 210,000 spectra were simulated for HSI.
  • Figure 4: Qualitative results. In comparison to simulated PAT images (left), images generated by the cINN (middle) resemble real PAT images (right) more closely. All images show a human forearm at 800 nm.
  • Figure 5: Our domain transfer approach yields realistic spectra (here: of veins). The PCA plots in a) represent a kernel density estimation of the first and second components of a PCA embedding of the real data, which represent about 67% and 6% of the variance in the real data, respectively. The distributions on top and on the right of the PCA plot correspond to the marginal distributions of each dataset’s first two components. b) Violin plots show that the cINN yields spectra that feature a smaller difference to the real data compared to the simulations and the UNIT-generated data. The dashed lines represent the mean difference value, and each dot represents the difference for one wavelength.
  • ...and 4 more figures
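
The mechanism described in the Figure 2 caption (encode into a shared latent space, then decode under a different domain condition) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: it is a single conditional affine coupling block with random, untrained weights, and the class name `ConditionalAffineCoupling`, the helpers `transfer` and `nll`, the one-hot domain encoding, and all layer sizes are assumptions made for illustration. A practical cINN would stack many such blocks with permutations in between and would be trained. The sketch only shows why invertibility gives cycle consistency by construction, and what a maximum likelihood objective looks like under a standard normal latent.

```python
import numpy as np

SIM, REAL = 0, 1  # hypothetical domain labels used as the condition D


class ConditionalAffineCoupling:
    """One affine coupling block whose scale/shift depend on a domain label."""

    def __init__(self, dim, n_domains=2, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.d1 = dim // 2                  # size of the pass-through half
        d2 = dim - self.d1                  # size of the transformed half
        self.n_domains = n_domains
        in_dim = self.d1 + n_domains        # pass-through half + one-hot domain
        # Random, untrained weights of a tiny subnetwork (illustration only).
        self.W1 = rng.normal(0.0, 0.5, (in_dim, hidden))
        self.W2 = rng.normal(0.0, 0.5, (hidden, 2 * d2))

    def _scale_shift(self, x1, domain):
        cond = np.zeros((x1.shape[0], self.n_domains))
        cond[:, domain] = 1.0
        h = np.tanh(np.concatenate([x1, cond], axis=1) @ self.W1)
        s, t = np.split(h @ self.W2, 2, axis=1)
        return np.tanh(s), t                # bounded log-scale for stability

    def forward(self, x, domain):
        """Data -> latent. Returns the latent and log|det Jacobian|."""
        x1, x2 = x[:, :self.d1], x[:, self.d1:]
        s, t = self._scale_shift(x1, domain)
        z = np.concatenate([x1, x2 * np.exp(s) + t], axis=1)
        return z, s.sum(axis=1)

    def inverse(self, z, domain):
        """Latent -> data, the exact inverse of forward for the same domain."""
        z1, z2 = z[:, :self.d1], z[:, self.d1:]
        s, t = self._scale_shift(z1, domain)
        return np.concatenate([z1, (z2 - t) * np.exp(-s)], axis=1)


def transfer(block, x, src, dst):
    """Encode under the source domain, decode under the target domain."""
    z, _ = block.forward(x, src)
    return block.inverse(z, dst)


def nll(block, x, domain):
    """Negative log-likelihood under a standard normal latent (up to a constant)."""
    z, log_det = block.forward(x, domain)
    return 0.5 * (z ** 2).sum(axis=1) - log_det


block = ConditionalAffineCoupling(dim=8)
x_sim = np.random.default_rng(1).normal(size=(4, 8))   # stand-in "simulated spectra"

x_fake_real = transfer(block, x_sim, SIM, REAL)        # sim -> real
x_cycle = transfer(block, x_fake_real, REAL, SIM)      # real -> sim again
cycle_error = np.abs(x_cycle - x_sim).max()            # only floating-point error remains
```

Because `inverse` undoes `forward` exactly for a given domain label, the round trip sim → real → sim reproduces the input without any explicit cycle-consistency loss. This is the property the abstract refers to as being guaranteed by the invertible architecture.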