MedShift: Implicit Conditional Transport for X-Ray Domain Adaptation

Francisco Caetano, Christiaan Viviers, Peter H. N. de With, Fons van der Sommen

Abstract

Synthetic medical data offers a scalable solution for training robust models, but significant domain gaps limit its generalizability to real-world clinical settings. This paper addresses the challenge of cross-domain translation between synthetic and real X-ray images of the head, focusing on bridging discrepancies in attenuation behavior, noise characteristics, and soft tissue representation. We propose MedShift, a unified class-conditional generative model based on Flow Matching and Schrödinger Bridges, which enables high-fidelity, unpaired image translation across multiple domains. Unlike prior approaches that require domain-specific training or rely on paired data, MedShift learns a shared domain-agnostic latent space and supports seamless translation between any pair of domains seen during training. We introduce X-DigiSkull, a new dataset comprising aligned synthetic and real skull X-rays under varying radiation doses, to benchmark domain translation models. Experimental results demonstrate that, despite being smaller than diffusion-based approaches, MedShift offers strong performance and remains flexible at inference time: it can be tuned to prioritize either perceptual fidelity or structural consistency, making it a scalable and generalizable solution for domain adaptation in medical imaging. The code and dataset are available at https://caetas.github.io/medshift.html

Paper Structure

This paper contains 25 sections, 2 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Overview of MedShift inference. A source image $x_1$ is first encoded into a domain-agnostic latent representation $z_{\tau}$. This latent lies near a shared manifold across all domains. Then, translation is performed by forward-time sampling conditioned on the target domain label to obtain the translated image $\hat{x}_1$.
  • Figure 2: Dataset overview. The synthetic domain contains Low and High dosage samples generated using the Mentice VIST® simulator; the real domain includes Low, Normal, and Exposure dosage categories acquired from a skull phantom using the Philips Azurion IGT system.
  • Figure 3: Trade-off between structural fidelity (SSIM) and realism (CFID) for the evaluated models.
  • Figure 4: UMAP visualization of the latent-space features for different $\tau$ levels.
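
The two-stage inference described in the Figure 1 caption (encode the source image to a latent $z_{\tau}$, then sample forward in time conditioned on the target label) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a deterministic Euler ODE solver, assumes that encoding is conditioned on the source label, and replaces the trained network with a hypothetical toy velocity field (`toy_velocity`, `DOMAIN_OFFSETS`) in which each domain is simply a constant intensity shift. All function and variable names are illustrative.

```python
import numpy as np

# Hypothetical per-domain intensity offsets; a stand-in for a trained model.
DOMAIN_OFFSETS = {"synthetic": 0.0, "real": 0.5}


def toy_velocity(x, t, label):
    """Toy class-conditional velocity field: a constant drift per domain."""
    return np.full_like(x, DOMAIN_OFFSETS[label])


def integrate(x, t_start, t_end, label, velocity=toy_velocity, n_steps=50):
    """Euler integration of dx/dt = velocity(x, t, label) from t_start to t_end."""
    dt = (t_end - t_start) / n_steps
    t = t_start
    for _ in range(n_steps):
        x = x + dt * velocity(x, t, label)
        t += dt
    return x


def translate(x_src, src_label, tgt_label, tau, n_steps=50):
    """Encode x_src back to time tau, then decode toward the target domain."""
    # Assumption: the backward pass is conditioned on the source label.
    z_tau = integrate(x_src, 1.0, tau, src_label, n_steps=n_steps)
    # Forward-time sampling from tau to 1, conditioned on the target label.
    return integrate(z_tau, tau, 1.0, tgt_label, n_steps=n_steps)


if __name__ == "__main__":
    image = np.zeros((4, 4))  # placeholder "synthetic" image
    for tau in (0.0, 0.5, 1.0):
        out = translate(image, "synthetic", "real", tau)
        print(f"tau={tau}: mean intensity {out.mean():.3f}")
```

In this toy setting, `tau = 1.0` leaves the image untouched, while smaller values of `tau` move it further toward the target domain. That loosely mirrors the trade-off between structural consistency and perceptual fidelity that the abstract attributes to inference-time tuning, although the real model's behavior depends on its learned velocity field.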