Pathology Image Compression with Pre-trained Autoencoders

Srikar Yellapragada, Alexandros Graikos, Kostas Triaridis, Zilinghan Li, Tarak Nath Nandi, Ravi K Madduri, Prateek Prasanna, Joel Saltz, Dimitris Samaras

TL;DR

This work tackles the data-storage bottleneck of high-resolution pathology images by repurposing Latent Diffusion Model autoencoders for learned compression. It benchmarks three AEs (SD-1.5, SD-3, DC-AE-f32) across compression levels, introduces decoder fine-tuning with a pathology-specific perceptual metric, and applies K-means-based quantization to the latent representations. The approach preserves performance on downstream tasks such as segmentation, patch classification, and multiple instance learning, while achieving storage savings that surpass JPEG at similar fidelity. Although decompression is slower than JPEG, the method enables scalable data sharing and supports training of pathology foundation models with larger, more diverse datasets.

Abstract

The growing volume of high-resolution Whole Slide Images in digital histopathology poses significant storage, transmission, and computational efficiency challenges. Standard compression methods, such as JPEG, reduce file sizes but often fail to preserve fine-grained phenotypic details critical for downstream tasks. In this work, we repurpose autoencoders (AEs) designed for Latent Diffusion Models as an efficient learned compression framework for pathology images. We systematically benchmark three AE models with varying compression levels and evaluate their reconstruction ability using pathology foundation models. To further enhance reconstruction fidelity, we introduce a fine-tuning strategy that optimizes a pathology-specific learned perceptual metric. We validate our approach on downstream tasks, including segmentation, patch classification, and multiple instance learning, showing that replacing images with AE-compressed reconstructions leads to minimal performance degradation. Additionally, we propose a K-means clustering-based quantization method for AE latents, improving storage efficiency while maintaining reconstruction quality. We provide the weights of the fine-tuned autoencoders at https://huggingface.co/collections/StonyBrook-CVLab/pathology-fine-tuned-aes-67d45f223a659ff2e3402dd0.
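
The abstract does not spell out how the K-means quantization is applied, so the following is only a minimal sketch of the general idea under stated assumptions: the scalar values of one latent tensor are clustered into `k` centroids, and the latent is stored as small integer indices plus a `k`-entry codebook. The function names, the per-scalar clustering, the quantile initialization, and the choice `k=16` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Plain Lloyd's algorithm on scalar values (illustrative, not the paper's code)."""
    # Assumption: initialize centroids at evenly spaced quantiles for determinism.
    centroids = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign every value to its nearest centroid.
        idx = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = values[idx == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids

def quantize_latent(latent, k=16, iters=20):
    """Return (indices, codebook) for a latent array; hypothetical helper."""
    assert k <= 256, "indices are stored as uint8 in this sketch"
    flat = latent.ravel().astype(np.float64)
    codebook = kmeans_1d(flat, k, iters)
    indices = np.abs(flat[:, None] - codebook[None, :]).argmin(axis=1)
    return indices.astype(np.uint8).reshape(latent.shape), codebook.astype(np.float32)

def dequantize_latent(indices, codebook):
    """Look up each index in the codebook to recover an approximate latent."""
    return codebook[indices]
```

In a pipeline of this shape, the dequantized latent would be passed to the AE decoder in place of the original floating-point latent. With `k=16`, each value needs only 4 bits if the indices are bit-packed; the sketch keeps them as `uint8` for simplicity and leaves packing and entropy coding out.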

Paper Structure

This paper contains 10 sections, 2 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: Examples of image reconstruction using JPEG, the vanilla DC-AE [chen2024deep], and the fine-tuned DC-AE. JPEG at quality 10, which has a file size comparable to DC-AE, introduces severe compression artifacts, including deformed nuclei and blocky artifacts (highlighted in yellow and teal). Vanilla DC-AE fails to retain certain cell structures (green), which are largely recovered through our fine-tuning strategy.
  • Figure 2: Left: Pre-trained autoencoders outperform JPEG in reconstruction fidelity, which is further improved by fine-tuning with a pathology-specific perceptual loss. Right: Using fine-tuned AE-compressed reconstructions results in minimal performance degradation. The width of each bar denotes the relative size of the compressed representation.