FPL+: Filtered Pseudo Label-based Unsupervised Cross-Modality Adaptation for 3D Medical Image Segmentation
Jianghao Wu, Dong Guo, Guotai Wang, Qiang Yue, Huijun Yu, Kang Li, Shaoting Zhang
TL;DR
This work tackles cross-modality unsupervised domain adaptation (UDA) for 3D medical image segmentation by generating high-quality pseudo labels in the target domain and training on a joint set of labeled source-domain images and pseudo-labeled target-domain images. It introduces Cross-Domain Data Augmentation to create pseudo source/target pairs, a Dual-Domain pseudo label Generator with Dual-BN (domain-specific batch normalization) to handle domain shift, and a two-tier pseudo-label filtering strategy that downweights noisy labels: image-level weighting from size-aware uncertainty and pixel-level weighting from dual-domain consensus. The final segmentor is initialized from the pseudo-label generator and trained on both the labeled source data and the pseudo-labeled target data. Across the Vestibular Schwannoma (VS), BraTS, and MMWHS datasets, FPL+ outperforms ten state-of-the-art UDA methods and, in some cases, approaches or surpasses fully supervised performance on the target domain, which makes it attractive when target-domain annotations are too costly to obtain.
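The idea behind domain-specific batch normalization can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the class name `DualDomainBatchNorm`, the domain keys `"source"` and `"target"`, and the choice to keep separate affine parameters per domain are all assumptions made for the example. The only point it shows is that each domain is normalized with its own statistics, while any convolutional weights around the layer would be shared.

```python
import numpy as np


class DualDomainBatchNorm:
    """Toy batch norm with separate statistics and affine parameters per domain.

    Hypothetical sketch: names and defaults are illustrative assumptions,
    not taken from the FPL+ code base.
    """

    def __init__(self, num_channels, domains=("source", "target"),
                 eps=1e-5, momentum=0.1):
        self.eps = eps
        self.momentum = momentum
        # One independent parameter set per domain.
        self.params = {
            d: {
                "mean": np.zeros(num_channels),
                "var": np.ones(num_channels),
                "gamma": np.ones(num_channels),
                "beta": np.zeros(num_channels),
            }
            for d in domains
        }

    def __call__(self, x, domain, training=True):
        # x has shape (N, C, D, H, W); statistics are computed per channel.
        p = self.params[domain]
        axes = (0, 2, 3, 4)
        if training:
            mean = x.mean(axis=axes)
            var = x.var(axis=axes)
            # Update only the running statistics of the selected domain.
            p["mean"] = (1 - self.momentum) * p["mean"] + self.momentum * mean
            p["var"] = (1 - self.momentum) * p["var"] + self.momentum * var
        else:
            mean, var = p["mean"], p["var"]
        shape = (1, -1, 1, 1, 1)
        x_hat = (x - mean.reshape(shape)) / np.sqrt(var.reshape(shape) + self.eps)
        return p["gamma"].reshape(shape) * x_hat + p["beta"].reshape(shape)
```

In use, a batch of pseudo source-domain volumes would be passed with `domain="source"` and a batch of pseudo target-domain volumes with `domain="target"`, so that intensity differences between modalities are absorbed by the normalization statistics rather than by the shared feature extractor.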
Abstract
Adapting a medical image segmentation model to a new domain is important for improving its cross-domain transferability. Because annotation is expensive, Unsupervised Domain Adaptation (UDA) is appealing, as it needs only unlabeled images from the target domain. Existing UDA methods are mainly based on image or feature alignment with adversarial training for regularization, and they are limited by insufficient supervision in the target domain. In this paper, we propose an enhanced Filtered Pseudo Label (FPL+)-based UDA method for 3D medical image segmentation. It first uses cross-domain data augmentation to translate labeled images in the source domain into a dual-domain training set consisting of a pseudo source-domain set and a pseudo target-domain set. To train a pseudo label generator on these dual-domain augmented images, domain-specific batch normalization layers are used to deal with the domain shift while learning domain-invariant structural features, so that the generator produces high-quality pseudo labels for target-domain images. We then combine labeled source-domain images and target-domain images with pseudo labels to train a final segmentor, where image-level weighting based on uncertainty estimation and pixel-level weighting based on dual-domain consensus are proposed to mitigate the adverse effect of noisy pseudo labels. Experiments on three public multi-modal datasets for Vestibular Schwannoma, brain tumor and whole heart segmentation show that our method surpasses ten state-of-the-art UDA methods, and that it even achieves better results than fully supervised learning in the target domain in some cases.
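How image-level and pixel-level weights might combine in a loss on pseudo-labeled target images can be sketched as follows. The abstract does not give the formulas, so everything specific here is an assumption: the function names `consensus_weight` and `weighted_pseudo_label_loss` are hypothetical, the pixel weight is simplified to a binary agreement mask between two predictions, the image weight is a single scalar in [0, 1], and binary cross-entropy stands in for whatever segmentation loss the method actually uses.

```python
import numpy as np


def consensus_weight(pred_a, pred_b):
    """Return 1.0 where two binary predictions agree and 0.0 elsewhere.

    Assumed stand-in for a dual-domain consensus map; the real method
    may use a softer weighting.
    """
    return (pred_a == pred_b).astype(np.float64)


def weighted_pseudo_label_loss(prob, pseudo_label, image_weight,
                               pixel_weight, eps=1e-7):
    """Binary cross-entropy against a pseudo label, weighted twice.

    prob:         (D, H, W) predicted foreground probabilities
    pseudo_label: (D, H, W) pseudo label with values in {0, 1}
    image_weight: scalar in [0, 1], lower for less reliable images
    pixel_weight: (D, H, W) values in [0, 1], lower for unreliable voxels
    """
    prob = np.clip(prob, eps, 1.0 - eps)
    ce = -(pseudo_label * np.log(prob)
           + (1 - pseudo_label) * np.log(1.0 - prob))
    denom = pixel_weight.sum()
    if denom == 0:
        # No trusted voxel in this image: it contributes nothing.
        return 0.0
    # Average over trusted voxels, then scale by the image-level weight.
    return float(image_weight * (pixel_weight * ce).sum() / denom)
```

With this form, a voxel on which the two predictions disagree is dropped from the average, and an image with a low reliability score scales its whole contribution down, so noisy pseudo labels influence the segmentor less than clean ones.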
