Flow Matching for Medical Image Synthesis: Bridging the Gap Between Speed and Quality

Milad Yazdani, Yasamin Medghalchi, Pooria Ashrafian, Ilker Hacihaliloglu, Dena Shahriari

TL;DR

This paper tackles the challenge of data scarcity and slow sampling in medical image synthesis by proposing Medical Optimal Transport Flow Matching (MOTFM), which learns a velocity field to transport samples from a noise source to a target image along a near-straight path. By formulating the problem as optimal transport flow matching, MOTFM achieves substantially faster inference than diffusion models while delivering equal or superior image quality, and it supports unconditional, class-conditioned, and mask-conditioned generation across 2D and 3D modalities with end-to-end training. Empirical results on CAMUS echocardiography and 3D MSD MRI Brain data show MOTFM consistently outperforms diffusion baselines in quality metrics and inference speed, with strong gains in downstream classification and segmentation tasks as well as a demonstrated denoising capability. Overall, MOTFM offers a practical, flexible, and scalable alternative to diffusion for medical image generation, with broad potential applications in data augmentation, image-to-image translation, and enhancement.
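The velocity-field idea described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes the common straight-line interpolation between a noise sample and an image, with the constant velocity `x1 - x0` as the regression target, and it replaces the trained network with a hypothetical "oracle" velocity function. The names `interpolate`, `target_velocity`, `flow_matching_loss`, and `euler_sample` are made up for this example.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Point at time t on the straight path from x0 (noise) to x1 (image)."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Velocity of the straight path; it does not depend on t."""
    return x1 - x0

def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(x0, x1, t)
    residual = predict_velocity(x_t, t) - target_velocity(x0, x1)
    return float(np.mean(residual ** 2))

def euler_sample(predict_velocity, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = np.array(x0, dtype=np.float64)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * predict_velocity(x, i * dt)
    return x

# Toy data: a 4x4 "noise" sample and a 4x4 "image" (assumed shapes).
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 4))
x1 = rng.uniform(0.0, 1.0, size=(4, 4))

# Hypothetical oracle standing in for a trained network: it returns the
# exact straight-line velocity for this single (x0, x1) pair.
def oracle(x, t):
    return x1 - x0

one_step = euler_sample(oracle, x0, num_steps=1)
ten_steps = euler_sample(oracle, x0, num_steps=10)
```

With an exactly straight path, one Euler step and ten Euler steps reach the same endpoint, which is the intuition behind needing fewer inference steps. A learned velocity field is only approximately straight, so in practice the step count remains a quality and speed trade-off.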

Abstract

Deep learning models have emerged as powerful tools for various medical applications. However, their success depends on large, high-quality datasets that are challenging to obtain due to privacy concerns and costly annotation. Generative models, such as diffusion models, offer a potential solution by synthesizing medical images, but their practical adoption is hindered by long inference times. In this paper, we propose an optimal transport flow matching approach to accelerate image generation. By introducing a straighter mapping between the source and target distributions, our method significantly reduces inference time while preserving, and even enhancing, the quality of the outputs. Furthermore, this approach is highly adaptable, supporting various medical imaging modalities, conditioning mechanisms (such as class labels and masks), and both 2D and 3D data. Beyond image generation, it can also be applied to related tasks such as image enhancement. Our results demonstrate the efficiency and versatility of this framework, making it a promising advancement for medical imaging applications. Code, checkpoints, and a synthetic dataset (beneficial for classification and segmentation) are available at: https://github.com/milad1378yz/MOTFM.

Paper Structure

This paper contains 6 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: (a) Transitions from source $x_0$ (blue) to target $x_1$ (red). Diffusion models map noisy samples to targets (magenta), while flow matching provides a more efficient path (cyan for training, dashed orange for inference). The contour map represents probability density. (b) MOTFM framework with different conditioning strategies.
  • Figure 2: Comparison of echocardiographic and brain MRI synthesis using DDPM and MOTFM, with SPADE and ControlNet applied only to echocardiography. The first two rows show echocardiographic images, while the last row presents brain MRI synthesis, with numbers in parentheses indicating inference steps.
  • Figure 3: KDE Plot of Pixel Intensity Distributions for Generated and Real Echo Images.
  • Figure 4: Denoising Example: From a Noisy Image to a Denoised Image in 10 Steps.