Super-resolution of biomedical volumes with 2D supervision

Cheng Jiang; Alexander Gedeon; Yiwei Lyu; Eric Landgraf; Yufeng Zhang; Xinhai Hou; Akhil Kondepudi; Asadur Chowdury; Honglak Lee; Todd Hollon

TL;DR

This work tackles the challenge of obtaining high-resolution 3D biomedical volumes by leveraging abundant 2D high-resolution microscopy data. It introduces masked slice diffusion for super-resolution (MSDSR), a conditional diffusion approach trained on 2D slices and capable of reconstructing isotropic 3D volumes without 3D supervision. A new evaluation metric, SliceFID, assesses volumetric quality by averaging per-axis 2D FID scores. Across experiments on stimulated Raman histology data, MSDSR outperforms interpolation and UNet baselines in both 2D and 3D evaluations, demonstrating improved fidelity and cross-plane consistency. The approach promises to reduce imaging time and data collection burdens while enabling more accurate, scalable 3D analyses in clinical and research settings.

Abstract

Volumetric biomedical microscopy has the potential to increase the diagnostic information extracted from clinical tissue specimens and improve the diagnostic accuracy of both human pathologists and computational pathology models. Unfortunately, barriers to integrating 3-dimensional (3D) volumetric microscopy into clinical medicine include long imaging times, poor depth / z-axis resolution, and an insufficient amount of high-quality volumetric data. Leveraging the abundance of high-resolution 2D microscopy data, we introduce masked slice diffusion for super-resolution (MSDSR), which exploits the inherent equivalence in the data-generating distribution across all spatial dimensions of biological specimens. This intrinsic characteristic allows for super-resolution models trained on high-resolution images from one plane (e.g., XY) to effectively generalize to others (XZ, YZ), overcoming the traditional dependency on orientation. We focus on the application of MSDSR to stimulated Raman histology (SRH), an optical imaging modality for biological specimen analysis and intraoperative diagnosis, characterized by its rapid acquisition of high-resolution 2D images but slow and costly optical z-sectioning. To evaluate MSDSR's efficacy, we introduce a new performance metric, SliceFID, and demonstrate MSDSR's superior performance over baseline models through extensive evaluations. Our findings reveal that MSDSR not only significantly enhances the quality and resolution of 3D volumetric data, but also addresses major obstacles hindering the broader application of 3D volumetric microscopy in clinical diagnostics and biomedical research.
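The TL;DR describes SliceFID as an average of per-axis 2D FID scores. The sketch below illustrates only that averaging structure, under stated assumptions: the function names (`axis_slices`, `slice_fid`, `toy_score`) are hypothetical, the volume is assumed to be a NumPy array with three spatial axes, and `toy_score` is a placeholder that compares mean intensities. It is not FID; a real evaluation would pass in an actual 2D FID implementation.

```python
import numpy as np

def axis_slices(volume, axis):
    """Return every 2D slice of a 3D volume taken perpendicular to `axis`."""
    return [np.take(volume, i, axis=axis) for i in range(volume.shape[axis])]

def toy_score(generated, reference):
    """Placeholder 2D score (NOT FID): gap between mean intensities of two image sets."""
    gen_mean = np.mean([img.mean() for img in generated])
    ref_mean = np.mean([img.mean() for img in reference])
    return abs(gen_mean - ref_mean)

def slice_fid(volume, reference_images, score_2d):
    """Average a 2D score over the three slicing axes of `volume`.

    `score_2d(generated, reference)` is a hypothetical callable standing in
    for a real 2D FID implementation.
    """
    per_axis = [score_2d(axis_slices(volume, axis), reference_images)
                for axis in range(3)]
    return float(np.mean(per_axis)), per_axis

# Toy usage: a constant volume scored against constant reference images.
volume = np.ones((4, 5, 6))
reference = [np.zeros((5, 6)) for _ in range(3)]
score, per_axis = slice_fid(volume, reference, toy_score)
```

Passing the 2D scorer as an argument keeps the per-axis averaging separate from the choice of feature extractor, which the text above does not specify.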

Paper Structure (25 sections, 7 equations, 5 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Denoising diffusion models
Image super-resolution
Super-resolution for biomedical imaging
Deep Learning applications in SRH
Methods
Masked slice diffusion
Volume super-resolution inference
Experimentation
Data description
Implementation details
MSDSR architecture and training.
MS-UNet baseline.
End-to-end (E2E) UNet baseline.
...and 10 more sections

Figures (5)

Figure 1: Super-resolution of biomedical volumes with 2D supervision. Volumetric microscopy images have a data distribution agnostic to tissue orientation and spatial dimension. Here, we present a method for leveraging this intrinsic characteristic by super-resolving low-resolution volumes using a conditional diffusion model trained on 2D high-resolution images.
Figure 2: MSDSR overview. A. MSDSR is trained with a diffusion network conditioned on row-wise masks of the ground-truth high-resolution image. During the reverse diffusion process, a random masking ratio from 1/2 to 1/8 introduces these rows at random locations to give contextual structure when de-noising. The model then learns to interpolate the noised data in between the mask to produce a high-fidelity 2D image. B. During 3D inference, the low-resolution z-stack volume is sliced in both the XZ and YZ dimensions, producing low-resolution 2D images. The rows of these images are then treated as an evenly spaced mask interlacing random noise when individually fed into the model. These mixtures are then up-scaled by the model to produce high-resolution volumes and are then averaged together to generate a restored isotropic z-stack.
Figure 3: Paired 2D evaluation. We compare the images generated by MSDSR and other baselines to the paired ground truth image. # cond rows, number of conditioning rows; NN, nearest neighbor; bilinear, bilinear interpolation.
Figure 4: 3D super-resolution results. We compare 3D volumetric super-resolution inference across three different input scalings. NN, nearest neighbor; bilinear, bilinear interpolation; E2E UNet, end-to-end UNet.
Figure 5: Ablation study on inference direction and Gaussian blur. We compare 3D volumetric super-resolution inference using ablated models across three different input scalings. NN, nearest neighbor; average, averaging XZ and YZ inference.
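The 3D inference procedure in the Figure 2B caption (slice the low-resolution z-stack along XZ and YZ, interlace the rows with noise, restore each slice, average the two results) can be sketched as follows. This is a minimal illustration, not the authors' implementation: all function names are hypothetical, the volume is assumed to be stored in (Z, Y, X) order, and `placeholder_restore` is a linear-interpolation stand-in for the trained diffusion model so that the example runs without one.

```python
import numpy as np

def interlace_with_noise(lowres_slice, scale, rng):
    """Place low-resolution rows at evenly spaced positions in a noise canvas."""
    n_rows, width = lowres_slice.shape
    canvas = rng.standard_normal((n_rows * scale, width))
    mask = np.zeros(n_rows * scale, dtype=bool)
    mask[::scale] = True            # every `scale`-th row is a conditioning row
    canvas[mask] = lowres_slice
    return canvas, mask

def placeholder_restore(canvas, mask):
    """Stand-in for the trained model (assumption): interpolate each column
    linearly between the conditioning rows, ignoring the noise rows."""
    known = np.flatnonzero(mask)
    rows = np.arange(canvas.shape[0])
    out = np.empty_like(canvas)
    for col in range(canvas.shape[1]):
        out[:, col] = np.interp(rows, known, canvas[known, col])
    return out

def super_resolve_volume(volume_zyx, scale, restore_2d=placeholder_restore, seed=0):
    """Restore XZ and YZ slices independently, then average the two volumes."""
    rng = np.random.default_rng(seed)
    n_z, n_y, n_x = volume_zyx.shape
    out_xz = np.empty((n_z * scale, n_y, n_x))
    out_yz = np.empty((n_z * scale, n_y, n_x))
    for y in range(n_y):            # XZ slices: fix y, rows run along z
        canvas, mask = interlace_with_noise(volume_zyx[:, y, :], scale, rng)
        out_xz[:, y, :] = restore_2d(canvas, mask)
    for x in range(n_x):            # YZ slices: fix x, rows run along z
        canvas, mask = interlace_with_noise(volume_zyx[:, :, x], scale, rng)
        out_yz[:, :, x] = restore_2d(canvas, mask)
    return 0.5 * (out_xz + out_yz)

# Toy usage: upsample a 2-slice stack by a factor of 4 along z.
volume = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
restored = super_resolve_volume(volume, scale=4)
```

In this sketch, swapping `placeholder_restore` for a function that runs a trained model on `(canvas, mask)` would leave the slicing and averaging logic unchanged.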
