MedIL: Implicit Latent Spaces for Generating Heterogeneous Medical Images at Arbitrary Resolutions

Tyler Spears, Shen Zhu, Yinzhu Jin, Aman Shrivastava, P. Thomas Fletcher

TL;DR

MedIL uses implicit neural representations (INRs) to build an implicit latent space that encodes heterogeneous medical images without resampling and decodes them at arbitrary resolutions. The method integrates a Local Texture Estimator (LTE) within a fully convolutional encoder-decoder, producing a latent representation defined on continuous coordinate grids. Evaluations on T1w brain MRIs and LIDC-IDRI lung CTs show that MedIL matches or exceeds fixed-size LDMs in reconstruction quality and can influence downstream diffusion-based generation, preserving clinical details across resolutions. The work provides a path toward more faithful generative modeling of raw clinical acquisitions and releases code for future spatially continuous autoencoders.

Abstract

In this work, we introduce MedIL, a first-of-its-kind autoencoder built for encoding medical images with heterogeneous sizes and resolutions for image generation. Medical images are often large and heterogeneous, and their fine details are of vital clinical importance. Image properties change drastically with acquisition equipment, patient demographics, and pathology, making realistic medical image generation challenging. Recent work in latent diffusion models (LDMs) has shown success in generating images resampled to a fixed size. However, this is a narrow subset of the resolutions native to image acquisition, and resampling discards fine anatomical details. MedIL utilizes implicit neural representations to treat images as continuous signals, where encoding and decoding can be performed at arbitrary resolutions without prior resampling. We quantitatively and qualitatively show how MedIL compresses and preserves clinically relevant features over large multi-site, multi-resolution datasets of both T1w brain MRIs and lung CTs. We further demonstrate how MedIL can influence the quality of images generated with a diffusion model, and discuss how MedIL can enhance generative models to resemble raw clinical acquisitions.
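To make the idea of treating an image as a continuous signal concrete, the following is a minimal sketch of querying a small 2D grid at continuous coordinates, so that the output size is chosen at decode time. It is not MedIL's decoder: plain bilinear interpolation stands in for the learned implicit function, and every name here (`make_coords`, `sample`, `decode`, the toy `latent` grid) is hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: bilinear interpolation stands in for a learned
# implicit decoder. All names are hypothetical, not taken from MedIL's code.
from typing import List

Grid = List[List[float]]


def make_coords(n: int) -> List[float]:
    """Cell-centre coordinates of n samples on the normalized interval [-1, 1]."""
    return [-1.0 + (2.0 * i + 1.0) / n for i in range(n)]


def sample(grid: Grid, y: float, x: float) -> float:
    """Bilinearly interpolate `grid` at continuous normalized coordinates (y, x)."""
    rows, cols = len(grid), len(grid[0])
    # Map normalized coordinates back to fractional indices (cell-centre convention).
    fy = (y + 1.0) * rows / 2.0 - 0.5
    fx = (x + 1.0) * cols / 2.0 - 0.5
    # Clamp so that queries near the border replicate the edge values.
    fy = min(max(fy, 0.0), rows - 1.0)
    fx = min(max(fx, 0.0), cols - 1.0)
    y0, x0 = int(fy), int(fx)
    y1, x1 = min(y0 + 1, rows - 1), min(x0 + 1, cols - 1)
    wy, wx = fy - y0, fx - x0
    top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
    bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def decode(grid: Grid, out_rows: int, out_cols: int) -> Grid:
    """Evaluate the continuous signal on an output grid of any requested size."""
    return [
        [sample(grid, y, x) for x in make_coords(out_cols)]
        for y in make_coords(out_rows)
    ]


latent = [[0.0, 1.0], [2.0, 3.0]]  # toy 2x2 grid standing in for a latent volume
same = decode(latent, 2, 2)        # querying at the native grid size
fine = decode(latent, 3, 5)        # querying at a different, non-square size
```

Because the output grid is just a set of coordinates, the same stored grid can be evaluated at its native size or at any other size without first resampling the stored data. In a learned model, the interpolation step would be replaced by a network evaluated at each coordinate.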

Paper Structure

This paper contains 18 sections, 2 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: MedIL architecture. Input volumes $X$ are encoded into latent volumes $Z$, which can then be decoded to arbitrary output resolutions or orientations.
  • Figure 2: Reconstruction of real, native-space T1w brain MRIs.
  • Figure 3: Reconstruction of real, native-space lung CTs.
  • Figure 4: Example T1w MRIs generated with a DDPM on latent space samples.