λSplit: Self-Supervised Content-Aware Spectral Unmixing for Fluorescence Microscopy

Federico Carrara, Talley Lambert, Mehdi Seifi, Florian Jug

Abstract

In fluorescence microscopy, spectral unmixing aims to recover individual fluorophore concentrations from spectral images that capture mixed fluorophore emissions. Since classical methods operate pixel-wise and rely on least-squares fitting, their performance degrades with increasingly overlapping emission spectra and higher levels of noise, suggesting that a data-driven approach that can learn and utilize a structural prior might lead to improved results. Learning-based approaches for spectral imaging do exist, but they are either not optimized for microscopy data or are developed for very specific cases that are not applicable to fluorescence microscopy settings. To address this, we propose λSplit, a physics-informed deep generative model that learns a conditional distribution over concentration maps using a hierarchical Variational Autoencoder. A fully differentiable Spectral Mixer enforces consistency with the image formation process, while the learned structural priors enable state-of-the-art unmixing and implicit noise removal. We demonstrate λSplit on 3 real-world datasets that we synthetically cast into a total of 66 challenging spectral unmixing benchmarks. We compare our results against a total of 10 baseline methods, including classical methods and a range of learning-based methods. Our results consistently show competitive performance and improved robustness in high-noise regimes, when spectra overlap considerably, or when the spectral dimensionality is lowered, making λSplit a new state-of-the-art for spectral unmixing of fluorescence microscopy data. Importantly, λSplit is compatible with spectral data produced by standard confocal microscopes, enabling immediate adoption without specialized hardware modifications.
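
To make the classical baseline concrete, the following is a minimal sketch of pixel-wise least-squares unmixing under a linear mixing model, in which each pixel spectrum is the mixing matrix $M \in \mathbb{R}^{L \times F}$ applied to the $F$ concentrations. It is not the implementation of any baseline evaluated in the paper: the function names, the 4-band spectra, and the concentrations are all hypothetical, and plain Python is used only to keep the example self-contained.

```python
def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a x = b for square a (Gaussian elimination, partial pivoting)."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def unmix_pixel(spectrum, mixing):
    """Least-squares concentrations for one pixel via the normal equations.

    spectrum: L measured band intensities.
    mixing:   L x F matrix whose columns are discretized emission spectra.
    """
    mixing_t = transpose(mixing)                      # F x L
    gram = matmul(mixing_t, mixing)                   # F x F, equals M^T M
    rhs = [sum(m * s for m, s in zip(row, spectrum)) for row in mixing_t]
    return solve(gram, rhs)


# Hypothetical 4-band emission spectra of two overlapping fluorophores.
M = [[0.6, 0.1],
     [0.3, 0.3],
     [0.1, 0.4],
     [0.0, 0.2]]
true_c = [2.0, 5.0]                                   # assumed concentrations
pixel = [sum(m * c for m, c in zip(row, true_c)) for row in M]  # noise-free
estimate = unmix_pixel(pixel, M)
```

In this noise-free case the estimate matches the assumed concentrations up to floating-point error. Each pixel is solved independently of its neighbours, so no spatial structure is used, and as the columns of `M` become more similar the system $M^T M$ becomes ill-conditioned and amplifies noise, which is the degradation the abstract attributes to classical methods.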

Paper Structure (49 sections, 16 equations, 16 figures, 14 tables)

Figures (16)

  • Figure 1: Spectral Imaging and Unmixing in a nutshell. A biological sample is labeled with $F$ fluorescent probes (FPs) whose emission spectra typically overlap. In multiplexed imaging, each FP is imaged in isolation using $F$ appropriate band-pass filters (see Multiplexed Imaging Channels). Spectral imaging instead acquires $L$ spectral bands in which photons of all FPs are unfiltered and therefore mixed (see Spectral Imaging Bands). Note that multiplexed imaging can suffer from bleed-through artifacts due to overlapping emission spectra. To retrieve all $F$ unmixed concentration maps (right), we propose $\lambda$Split, a self-supervised method that leverages learned structural priors to achieve state-of-the-art results even when spectra overlap considerably, when imaging noise is high, and when the spectral dimensionality $L$ is lowered.
  • Figure 2: Proposed architecture of $\lambda$Split. The model builds on an LVAE backbone (Sonderby2016-zg), where a bottom-up encoder produces features $h_i$ at multiple hierarchy levels. At the highest hierarchy level we employ a default multivariate Gaussian prior, followed by learnable top-down priors $p_i$ (conditioned from above) that, combined with the respective $h_i$, form the final posterior distributions $(\mu_{q,i}, \sigma_{q,i})$. Latent samples are generated by drawing from the topmost Gaussian and successively propagating and resampling down the hierarchy through a deterministic decoder to finally generate FP concentration maps. The last decoder layer comprises an unmixing convolutional block (blue) that maps the final activation into the desired unmixed concentration map tensor. To render our method fully self-supervised, a differentiable Spectral Mixer multiplies the predicted concentration maps by the transposed mixing matrix $M^T \in \mathbb{R}^{F \times L}$, obtainable by discretizing the emission spectra of the FPs contained in the imaged sample (bottom). This models the physical image formation process and, provided that the predicted concentration maps are correct, reconstructs the spectral image received as input, rendering this setup an end-to-end trainable variational autoencoder. The model is trained by minimizing the sum of the hierarchical KL divergence (computed from latent samples) and a spectral MSE reconstruction loss.
  • Figure 3: Qualitative results on high-noise BioSR data (exposure 10 ms). Compared to 7 baselines, $\lambda$Split produces the cleanest results, also benefiting from its implicit noise removal capabilities. Objects and details (e.g., puncta in the CCPs channel or F-Actin structures) are well recovered, enabling the best possible downstream analyses.
  • Figure 4: Spectral unmixing with controlled spectral overlap (32 bands). For this experiment, we use the Microtubules and CCPs structures from the BioSR dataset, with 32 spectral bands imaged. (a,e) Synthetically shifted EGFP and mTurquoise emission spectra, respectively. While we fix the spectrum associated with Microtubules, the emission spectrum of CCPs is rigidly shifted with increasing $\Delta\lambda$. (b,f) Quantitative metrics over $\Delta\lambda$, showing that $\lambda$Split (blue) consistently outperforms all baselines and that the performance gap widens when spectral overlap is larger (hence, when $\Delta\lambda$ is smaller). (c,d,g,h) Representative crops from unmixing results for $\Delta\lambda=2$ nm by $\lambda$Split and the best-performing baseline (w.r.t. PSNR), plus the corresponding ground truth (GT) in the center. Panels (c,d) show Microtubules, while panels (g,h) show CCPs.
  • Figure S.1: Overview of the spectral fluorescence microscopy simulation pipeline based on microsim (Lambert2026-em). Multi-channel inputs are interpreted as fluorophore concentration maps $U_{\mathrm{GT}}$, where each channel corresponds to a fluorescent probe (FP) with a known emission spectrum. First, for each FP, a noise-free spectral emission volume $S_{\mathrm{emission}}$ is generated by combining concentration maps with their discretized emission spectra $M$, while modeling tunable illumination power and fluorophore-specific emission and excitation efficiency. Next, microscope-specific optical effects, such as point-spread functions (PSFs), are applied, and per-fluorophore emission volumes are combined to obtain an optical spectral volume $S_{\mathrm{opt}}$ consistent with the chosen imaging modality. Finally, the detection process introduces realistic noise sources, including photon and detector noise, producing the final digital spectral image $S_{\mathrm{dig}}$.
  • ...and 11 more figures
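
The Spectral Mixer and the spectral MSE reconstruction loss described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes concentration maps stored as an $F \times H \times W$ nested list and a transposed mixing matrix $M^T$ of shape $F \times L$; the function names and all numeric values are hypothetical and are not taken from the authors' code. In the actual model this step would run inside an automatic-differentiation framework so that gradients reach the decoder, and the hierarchical KL term would be added to the loss. Both are omitted here.

```python
def spectral_mixer(conc_maps, mixing_t):
    """Mix F concentration maps into L spectral bands.

    conc_maps: F x H x W nested list of predicted concentrations.
    mixing_t:  F x L transposed mixing matrix (row f = spectrum of FP f).
    Returns an L x H x W nested list (the reconstructed spectral image).
    """
    n_fp = len(conc_maps)
    n_bands = len(mixing_t[0])
    height, width = len(conc_maps[0]), len(conc_maps[0][0])
    return [[[sum(conc_maps[f][y][x] * mixing_t[f][band] for f in range(n_fp))
              for x in range(width)]
             for y in range(height)]
            for band in range(n_bands)]


def spectral_mse(pred, target):
    """Mean squared error over all bands and pixels of two spectral images."""
    sq_diffs = [(p - t) ** 2
                for pred_band, target_band in zip(pred, target)
                for pred_row, target_row in zip(pred_band, target_band)
                for p, t in zip(pred_row, target_row)]
    return sum(sq_diffs) / len(sq_diffs)


# Hypothetical setup: F = 2 fluorophores, L = 3 bands, 2 x 2 pixels.
MIXING_T = [[0.7, 0.2, 0.1],
            [0.1, 0.3, 0.6]]
true_conc = [[[1.0, 0.0], [2.0, 3.0]],
             [[0.0, 1.0], [2.0, 0.5]]]

# The "observed" spectral image is simulated with the same forward model.
observed = spectral_mixer(true_conc, MIXING_T)

# Correct concentrations reproduce the input; swapped channels do not.
good_loss = spectral_mse(spectral_mixer(true_conc, MIXING_T), observed)
swapped = [true_conc[1], true_conc[0]]
bad_loss = spectral_mse(spectral_mixer(swapped, MIXING_T), observed)
```

Because the loss compares the re-mixed prediction with the input spectral image itself, no ground-truth concentration maps are required, which is what makes the setup self-supervised.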