Leapfrog Latent Consistency Model (LLCM) for Medical Images Generation

Lakshmikar R. Polamreddy, Kalyan Roy, Sheng-Han Yueh, Deepshikha Mahato, Shilpa Kuppili, Jialu Li, Youshan Zhang

TL;DR

We present the Leapfrog Latent Consistency Model (LLCM), a large vision model for medical image generation. It is distilled from a diffusion model retrained on the collected MedImgs dataset, which enables it to generate high-resolution images in real time.

Abstract

The scarcity of accessible medical image data poses a significant obstacle to effectively training deep learning models for medical diagnosis, as hospitals refrain from sharing their data due to privacy concerns. In response, we gathered a diverse dataset named MedImgs, which comprises over 250,127 images spanning 61 disease types and 159 classes of both humans and animals from open-source repositories. We propose a Leapfrog Latent Consistency Model (LLCM) that is distilled from a diffusion model retrained on the collected MedImgs dataset, which enables our model to generate high-resolution images in real time. We formulate the reverse diffusion process as a probability flow ordinary differential equation (PF-ODE) and solve it in latent space using the Leapfrog algorithm. This formulation enables rapid sampling without necessitating additional iterations. Our model demonstrates state-of-the-art performance in generating medical images. Furthermore, our model can be fine-tuned on any custom medical image dataset, facilitating the generation of a vast array of images. In our experiments, the model outperforms existing models on unseen dog cardiac X-ray images. Source code is available at https://github.com/lskdsjy/LeapfrogLCM.
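
To make the idea of solving an ODE with a leapfrog scheme concrete, the following is a minimal sketch of the generic two-step leapfrog (midpoint) rule, $x_{n+1} = x_{n-1} + 2h\,f(x_n, t_n)$, applied to a toy scalar ODE. It is not the authors' implementation: the abstract does not give the exact update used in LLCM, and the names `leapfrog_solve` and `toy_drift` are hypothetical. In the actual model, the right-hand side would presumably be the PF-ODE drift evaluated by a learned network in latent space; here it is replaced by $dx/dt = -x$ so the result can be checked against the closed-form solution.

```python
import math


def leapfrog_solve(f, x0, t0, t1, n_steps):
    """Integrate dx/dt = f(x, t) from t0 to t1 with the two-step leapfrog (midpoint) rule."""
    h = (t1 - t0) / n_steps  # negative when integrating backward in time
    x_prev = x0
    # The two-step rule needs two starting values; bootstrap with one Euler step.
    x_curr = x0 + h * f(x0, t0)
    t = t0 + h
    for _ in range(n_steps - 1):
        # Leapfrog update: jump from x_{n-1} over x_n to x_{n+1}.
        x_next = x_prev + 2.0 * h * f(x_curr, t)
        x_prev, x_curr = x_curr, x_next
        t += h
    return x_curr


def toy_drift(x, t):
    # Hypothetical stand-in for the PF-ODE right-hand side (assumption, not from the paper).
    return -x


approx = leapfrog_solve(toy_drift, x0=1.0, t0=0.0, t1=1.0, n_steps=100)
exact = math.exp(-1.0)
print(abs(approx - exact))  # discrepancy between the leapfrog result and exp(-1)
```

The same function integrates backward in time when `t1 < t0`, which is the direction a reverse diffusion process runs; the step size `h` simply becomes negative.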

Paper Structure

This paper contains 10 sections, 16 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Comparison of LLCM-generated medical images at 512$\times$512 resolution for different numbers of inference steps against the original images.
  • Figure 2: Flowchart of Image Generation with Leapfrog Latent Consistency Model.
  • Figure 3: Comparison of LLCM-generated medical images at 512$\times$512 resolution for the Alzheimer’s mild demented category in humans, for different leapfrog jumping steps with 4-step inference.
  • Figure 4: Comparison of unseen large dog heart X-ray images generated by various models.
  • Figure 5: Images generated by our LLCM at 512$\times$512 resolution with 4-step inference.