ImmuVis: Hyperconvolutional Foundation Model for Imaging Mass Cytometry

Marcin Możejko, Dawid Uchal, Krzysztof Gogolewski, Piotr Kupidura, Szymon Łukasik, Jakub Giezgała, Tomasz Nocoń, Kacper Pietrzyk, Robert Pieniuta, Mateusz Sulimowicz, Michal Orzyłowski, Tomasz Siłkowski, Karol Zagródka, Eike Staub, Ewa Szczurek

TL;DR

ImmuVis tackles the challenge of highly variable IMC marker panels by introducing marker-adaptive hyperconvolutions that condition convolutional kernels on learned marker embeddings, enabling a single foundation model to process arbitrary marker subsets without retraining. The model learns a pan-marker latent space and uses marker-conditioned hyperconvolutions in both encoder and decoder to produce per-marker reconstructions along with calibrated uncertainty via a Gaussian heteroscedastic objective. Pretrained on IMC17M, ImmuVis achieves state-of-the-art virtual staining, strong zero-shot performance across unseen panels, and competitive representations for cell typing and clinical prediction, with substantial efficiency advantages over transformer-based counterparts. The approach provides a practical, panel-flexible IMC model that delivers reliability-aware predictions and scalable deployment across cohorts, while outlining paths to extend to additional multiplex modalities and whole-slide clinical workflows.
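The Gaussian heteroscedastic objective mentioned above can be illustrated with a minimal sketch. It assumes the standard per-pixel Gaussian negative log-likelihood with a predicted log-variance; the exact loss, weighting, and array shapes used by ImmuVis are not given here, so the function name `gaussian_nll` and all values below are hypothetical.

```python
import numpy as np

def gaussian_nll(x, mu, log_var):
    """Per-pixel Gaussian negative log-likelihood with a predicted log-variance.

    Assumed standard form: 0.5 * [log(sigma^2) + (x - mu)^2 / sigma^2 + log(2*pi)].
    """
    return 0.5 * (log_var + (x - mu) ** 2 * np.exp(-log_var) + np.log(2.0 * np.pi))

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8, 8))   # hypothetical target patch: (markers, H, W)
mu = x + 0.1                     # prediction with a constant error of 0.1

# Three candidate uncertainty maps for the same prediction.
log_var_matched = np.full_like(x, np.log(0.01))  # variance equals the squared error
log_var_overconf = np.full_like(x, np.log(1e-6)) # variance far too small
log_var_underconf = np.zeros_like(x)             # variance far too large (sigma^2 = 1)

loss_matched = gaussian_nll(x, mu, log_var_matched).mean()
loss_overconf = gaussian_nll(x, mu, log_var_overconf).mean()
loss_underconf = gaussian_nll(x, mu, log_var_underconf).mean()
```

Under this form the loss is lowest when the predicted variance matches the squared error, which is the sense in which such an objective rewards calibrated rather than over- or under-confident uncertainty maps.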

Abstract

We present ImmuVis, an efficient convolutional foundation model for imaging mass cytometry (IMC), a high-throughput multiplex imaging technology that treats molecular marker measurements as image channels and enables large-scale spatial tissue profiling. Unlike natural images, multiplex imaging lacks a fixed channel space: real-world marker sets vary across studies, which violates a core assumption of standard vision backbones. To address this, ImmuVis introduces marker-adaptive hyperconvolutions that generate convolutional kernels from learned marker embeddings, enabling a single model to operate on arbitrary measured marker subsets without retraining. We pretrain ImmuVis on the largest dataset to date, IMC17M (28 cohorts, 24,405 images, 265 markers, over 17M patches), using self-supervised masked reconstruction. ImmuVis outperforms state-of-the-art baselines and ablations in virtual staining and downstream classification tasks at substantially lower compute cost than transformer-based alternatives, and is the only model that provides calibrated uncertainty via a heteroscedastic likelihood objective. These results position ImmuVis as a practical, efficient foundation model for real-world IMC modeling.
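To make the idea of generating kernels from marker embeddings concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the ImmuVis implementation: the 1x1 kernel size, the linear hypernetwork `hyper_w`, the summation over markers, and all sizes and names are hypothetical, and the embeddings are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MARKERS, EMB_DIM, C_STEM, C_OUT = 10, 16, 4, 8  # hypothetical sizes

# One embedding per known marker (random stand-ins for learned embeddings).
marker_emb = rng.normal(size=(N_MARKERS, EMB_DIM))
# Assumed hypernetwork: a single linear map from an embedding to kernel weights.
hyper_w = rng.normal(size=(EMB_DIM, C_OUT * C_STEM)) * 0.1

def hyperconv_1x1(stem_feats, marker_ids):
    """Fuse per-marker features (M, C_STEM, H, W) into one map (C_OUT, H, W)."""
    # Generate one 1x1 kernel per measured marker from its embedding.
    kernels = (marker_emb[marker_ids] @ hyper_w).reshape(-1, C_OUT, C_STEM)
    # Apply each marker's kernel to that marker's features, then sum over markers.
    return np.einsum("moc,mchw->ohw", kernels, stem_feats)

# Two panels of different sizes pass through the same parameters.
panel_a = np.array([0, 3, 7])
panel_b = np.array([1, 2, 4, 5, 9])
feats_a = rng.normal(size=(len(panel_a), C_STEM, 32, 32))
feats_b = rng.normal(size=(len(panel_b), C_STEM, 32, 32))
out_a = hyperconv_1x1(feats_a, panel_a)
out_b = hyperconv_1x1(feats_b, panel_b)

# Reordering the measured markers leaves the fused representation unchanged.
perm = np.array([2, 0, 1])
out_a_perm = hyperconv_1x1(feats_a[perm], panel_a[perm])
```

In this sketch the output shape depends only on `C_OUT`, not on how many markers were measured, which is the property that lets a fixed backbone follow a variable panel.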

Paper Structure (54 sections, 25 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Motivation for ImmuVis: real-world panel heterogeneity. IMC17M exhibits strong cohort--marker diversity (rings: cohorts; radii: markers; colored ticks: measured markers), motivating a single model that operates on arbitrary marker subsets. ImmuVis encodes any observed panel (e.g., MITF, p53, pNFkB) into a shared pan-marker latent space for downstream phenotyping (cell typing, clinical prediction) and instantiates a task-specific decoder to virtually stain requested targets (e.g., CD45RO), outputting both expression and predictive uncertainty.
  • Figure 2: ImmuVis architecture overview. Marker-agnostic encoder stems embed each input marker channel, and a hyperconvolution module, conditioned on learned marker embeddings, fuses them into a shared pan-marker representation processed by a standard vision backbone. A symmetric hyperconvolution and a marker-agnostic decoder instantiate the requested output marker set, predicting per-marker reconstructions together with pixel-wise uncertainty (heteroscedastic log-variance).
  • Figure 3: Quantitative virtual staining evaluation on the Head & Neck cohort. Per-marker reconstruction accuracy measured by image-level $\log(\mathrm{MSE})$ for $\texttt{ImmuVis}$, $\texttt{ImmuVis}_{\texttt{ViT}}$, and VirTues. Each boxplot summarizes the distribution of image-level scores, where each image score is obtained by averaging patch-level errors over all patches from that image. The three rows above the plot report the results of paired Wilcoxon signed-rank tests across images for the corresponding model pairs (as indicated in the top-left corner), with significance after FDR correction (ns - not significant; $(^{*}) <\!0.05$; $(^{**}) <\!0.01$; $(^{***}) <\!0.001$).
  • Figure 4: Qualitative virtual staining results under masking and zero-shot settings. Representative patches from the Head and Neck cohort (immucan_2025) for four markers (CD45RO, HLADR, H3, Ki67). Columns show the ground-truth channel, the reconstructions by VirTues and ImmuVis in the leave-one-out setting, the prediction uncertainty $\sigma^2$ map, and the squared error $(\mathbf{X}^{\mu}_{c}\!-\! \mathbf{X}_c)^2$. ImmuVis preserves spatial structure and produces coherent zero-shot reconstructions, with uncertainty highlighting lower-confidence regions (see the Ki67 example).
  • Figure 5: MAE vs uncertainty correlation for marker reconstruction. Scatter plots on a log-log scale for the full (left) and zero-shot (right) $\texttt{ImmuVis}$ models, trained with and without the Head and Neck cohort (immucan_2025), respectively. Active channels (blue, $n\!=\!4291$) and masked channels (green, $n\!=\!1669$) both show a strong positive correlation between prediction error and uncertainty in all cases. Dashed lines show linear regression fits.
  • ...and 1 more figures
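
The image-level scoring and significance labelling described in the Figure 3 caption can be sketched as follows. This is a hedged illustration: taking the logarithm of the per-image mean (rather than averaging log errors) and using the Benjamini-Hochberg procedure for FDR correction are both assumptions, since the caption specifies neither, and all function names and numbers are hypothetical.

```python
import numpy as np

def image_level_log_mse(patch_mse, image_ids):
    """Average patch-level MSE within each image, then take the log (assumed order)."""
    patch_mse = np.asarray(patch_mse, dtype=float)
    image_ids = np.asarray(image_ids)
    return {int(i): float(np.log(patch_mse[image_ids == i].mean()))
            for i in np.unique(image_ids)}

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by n / k.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted

def significance_label(q):
    """Map an adjusted p-value to the star notation used in the caption."""
    if q < 0.001:
        return "***"
    if q < 0.01:
        return "**"
    if q < 0.05:
        return "*"
    return "ns"

# Toy data: three patches from two images, and four per-marker test p-values.
scores = image_level_log_mse([1.0, 3.0, 4.0], [0, 0, 1])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
labels = [significance_label(q) for q in q_values]
```

The paired test itself is omitted here; in practice its per-marker p-values would be passed to `benjamini_hochberg` before labelling.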