Towards 1000-fold Electron Microscopy Image Compression for Connectomics via VQ-VAE with Transformer Prior

Fuming Yang; Yicong Li; Hanspeter Pfister; Jeff W. Lichtman; Yaron Meirovitch

TL;DR

The paper tackles the storage and analysis bottlenecks of petascale EM data by introducing a vector-quantized variational autoencoder (VQ-VAE) with a Transformer prior that enables pay-as-you-decode compression from $16\times$ to $1024\times$, while preserving neuronal structures. It presents a two-level VQ-VAE architecture with top and bottom token latents, FiLM-based fusion, and an ROI-driven workflow for selective high-resolution reconstruction, achieving competitive SSIM and robust downstream task performance across datasets. The work demonstrates near-parity with AVIF at moderate compression and strong maintenance of segmentation and synapse detection at extreme ratios, plus a practical mechanism to extract high-resolution regions on demand. This approach lays the groundwork for a foundation-model-like, token-based EM compression framework that can generalize across connectomic datasets and support scalable, on-demand analysis.
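To make the quoted ratios concrete, the arithmetic behind a token-based compression ratio can be sketched as follows. This is an illustration only: it assumes single-channel 8-bit pixels and fixed-length token indices with no entropy coding, and the downsampling factors (4 and 32) and codebook size (256) are hypothetical values chosen to land on the two endpoints. They are not taken from the paper, whose actual configuration may differ.

```python
import math


def compression_ratio(downsample: int, codebook_size: int, bits_per_pixel: int = 8) -> float:
    """Ratio of raw bits to token bits for one latent token.

    Each token covers a downsample x downsample pixel patch and is stored
    as a fixed-length index into a codebook of `codebook_size` entries.
    """
    raw_bits = bits_per_pixel * downsample * downsample
    token_bits = math.log2(codebook_size)
    return raw_bits / token_bits


# Hypothetical settings, chosen only to illustrate the two endpoints.
ratio_low = compression_ratio(downsample=4, codebook_size=256)
ratio_high = compression_ratio(downsample=32, codebook_size=256)
print(ratio_low, ratio_high)
```

Under these assumptions, each extra factor of two in spatial downsampling multiplies the ratio by four, which is why extreme ratios rely on a very coarse top-level token grid.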

Abstract

Petascale electron microscopy (EM) datasets push storage, transfer, and downstream analysis toward their current limits. We present a compression framework for EM based on a vector-quantized variational autoencoder (VQ-VAE) that spans 16x to 1024x and enables pay-as-you-decode usage: top-only decoding serves extreme compression, while an optional Transformer prior predicts bottom tokens (without changing the compression ratio) to restore texture via feature-wise linear modulation (FiLM) and concatenation. We further introduce an ROI-driven workflow that performs selective high-resolution reconstruction from 1024x-compressed latents only where needed.
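The FiLM-and-concatenation fusion mentioned in the abstract can be illustrated with a minimal sketch. FiLM itself is the standard per-channel affine transform `gamma * x + beta`, with `gamma` and `beta` predicted from a conditioning signal. Everything else here is an assumption made for illustration: the function names, the use of a single conditioning vector derived from the bottom branch, the choice of which branch modulates which, and the toy shapes are hypothetical and not taken from the paper.

```python
def linear(weights, vec):
    """Plain matrix-vector product: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def film_and_concat(top_feat, bottom_feat, cond, w_gamma, w_beta):
    """Modulate top features channel-wise, then append bottom features.

    top_feat / bottom_feat: lists of channels, each a flat list of values.
    cond: conditioning vector (hypothetical summary of the bottom branch).
    w_gamma / w_beta: weight rows mapping cond to per-channel scale and shift.
    """
    gamma = linear(w_gamma, cond)  # one scale per top channel
    beta = linear(w_beta, cond)    # one shift per top channel
    modulated = [
        [g * v + b for v in channel]
        for channel, g, b in zip(top_feat, gamma, beta)
    ]
    # Concatenate along the channel axis.
    return modulated + bottom_feat


# Toy example: 2 top channels, 1 bottom channel, 2 spatial positions.
top = [[1.0, 2.0], [3.0, 4.0]]
bottom = [[0.5, -0.5]]
cond = [1.0, 0.0]
w_gamma = [[2.0, 0.0], [0.0, 1.0]]
w_beta = [[0.5, 0.0], [1.0, 0.0]]
fused = film_and_concat(top, bottom, cond, w_gamma, w_beta)
print(fused)
```

In a real decoder the scale and shift would typically be produced by learned layers and applied to convolutional feature maps; the list-based version above only shows the data flow.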
