Unpaired Modality Translation for Pseudo Labeling of Histology Images
Arthur Boschet, Armand Collin, Nishka Katoch, Julien Cohen-Adad
TL;DR
The paper tackles sparse annotations in histology image segmentation with a microscopy pseudo labeling pipeline built on unpaired image translation between a labeled domain ($L$) and an unlabeled domain ($U$). It combines SynDiff-based denoising diffusion GAN translation with two pseudo labeling strategies, tutoring ($L \rightarrow U$) and adaptation ($U \rightarrow L$), to produce usable pseudo labels for the unlabeled images $X_U$ and to train or apply segmentation models accordingly. Across three domain-shift experiments (TEM↔TEM-MACAQUE, TEM↔SEM, TEM↔BF), tutoring performs best under moderate shifts (e.g., SEM), with a mean axon Dice of $0.736 \pm 0.005$ and a myelin Dice of $0.652 \pm 0.005$, while adaptation can excel when the shift is larger (BF); models pre-trained on TEM often fail on dissimilar targets. The results suggest that pseudo labeling can substantially accelerate annotation by providing high-quality starting masks, and they offer guidance on when to use the tutoring path or the adaptation path depending on the modality and domain shift.
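The two strategies can be sketched as plain data flow. The snippet below is a toy illustration only, not the authors' implementation: the function names (`tutoring_path`, `adaptation_path`, `shift`, `train_threshold_segmenter`) are hypothetical, the translator is a constant intensity shift standing in for a learned unpaired translation model, and the "segmenter" is a global threshold standing in for a trained network. It also assumes that adaptation means applying a model trained on $L$ to images translated from $U$ to $L$, which is how the $U \rightarrow L$ direction reads here.

```python
from typing import Callable, List

Image = List[List[float]]  # 2-D grayscale image as nested lists
Mask = List[List[int]]     # 2-D label mask with the same shape
Segmenter = Callable[[Image], Mask]


def tutoring_path(
    labeled_images: List[Image],
    labels: List[Mask],
    unlabeled_images: List[Image],
    translate_l_to_u: Callable[[Image], Image],
    train_segmenter: Callable[[List[Image], List[Mask]], Segmenter],
) -> List[Mask]:
    """L -> U: translate labeled images, train on them, segment real U images."""
    synthetic_u = [translate_l_to_u(img) for img in labeled_images]
    # The original labels are reused: translation is assumed to keep geometry.
    segmenter_u = train_segmenter(synthetic_u, labels)
    return [segmenter_u(img) for img in unlabeled_images]


def adaptation_path(
    unlabeled_images: List[Image],
    translate_u_to_l: Callable[[Image], Image],
    segmenter_l: Segmenter,
) -> List[Mask]:
    """U -> L: translate unlabeled images, segment them with a model trained on L."""
    return [segmenter_l(translate_u_to_l(img)) for img in unlabeled_images]


# --- Toy stand-ins (hypothetical, for illustration only) ---------------------

def shift(offset: float) -> Callable[[Image], Image]:
    """Fake 'translator': adds a constant intensity offset to every pixel."""
    def apply(img: Image) -> Image:
        return [[p + offset for p in row] for row in img]
    return apply


def train_threshold_segmenter(images: List[Image], masks: List[Mask]) -> Segmenter:
    """Fake 'training': thresholds at the mean training intensity.

    The masks are ignored in this toy; a real model would learn from them.
    """
    pixels = [p for img in images for row in img for p in row]
    threshold = sum(pixels) / len(pixels)

    def segment(img: Image) -> Mask:
        return [[int(p > threshold) for p in row] for row in img]
    return segment


x_l = [[[0.0, 1.0]]]    # one labeled image in domain L
y_l = [[[0, 1]]]        # its mask
x_u = [[[11.0, 10.0]]]  # one unlabeled image in domain U (L shifted by +10)

tutoring_labels = tutoring_path(x_l, y_l, x_u, shift(10.0), train_threshold_segmenter)
segmenter_l = train_threshold_segmenter(x_l, y_l)
adaptation_labels = adaptation_path(x_u, shift(-10.0), segmenter_l)
```

In this toy setting both paths recover the same mask because the fake translation is exact in both directions. With a learned translator the two directions introduce different errors, which is consistent with the reported dependence on the size of the domain shift.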
Abstract
The segmentation of histological images is critical for various biomedical applications, yet the lack of annotated data presents a significant challenge. We propose a microscopy pseudo labeling pipeline utilizing unsupervised image translation to address this issue. Our method generates pseudo labels by translating between labeled and unlabeled domains without requiring prior annotation in the target domain. We evaluate two pseudo labeling strategies across three image domains increasingly dissimilar from the labeled data, demonstrating their effectiveness. Notably, our method achieves a mean Dice score of $0.736 \pm 0.005$ on a SEM dataset using the tutoring path, which involves training a segmentation model on synthetic data created by translating the labeled dataset (TEM) to the target modality (SEM). This approach aims to accelerate the annotation process by providing high-quality pseudo labels as a starting point for manual refinement.
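For readers unfamiliar with the reported metric, the Dice score between a predicted mask $P$ and a reference mask $G$ is $2|P \cap G| / (|P| + |G|)$. The following is a minimal pure-Python sketch of that standard definition for binary masks; the function name `dice_score` and the convention of returning 1.0 for two empty masks are choices made for this example, not details taken from the paper.

```python
from typing import List


def dice_score(pred: List[List[int]], target: List[List[int]]) -> float:
    """Dice coefficient between two binary masks given as nested lists."""
    pred_flat = [bool(p) for row in pred for p in row]
    target_flat = [bool(t) for row in target for t in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labeled foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        # Both masks are empty; treated as perfect agreement by convention here.
        return 1.0
    return 2.0 * intersection / total


pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice_score(pred_mask, true_mask)  # one shared pixel, three foreground pixels in total
```

A value such as $0.736 \pm 0.005$ would then be an aggregate of per-image or per-run scores; how the paper aggregates them is not stated in this section.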
