Label-Efficient Cross-Modality Generalization for Liver Segmentation in Multi-Phase MRI
Quang-Khai Bui-Tran, Minh-Toan Dinh, Thanh-Huy Nguyen, Ba-Thinh Lam, Mai-Anh Vu, Ulas Bagci
TL;DR
The paper tackles liver segmentation in multi-phase MRI under extreme annotation scarcity and cross-vendor heterogeneity. It introduces a label-efficient framework that combines a foundation-model backbone (STU-Net) with cross pseudo supervision to exploit unlabeled GED4 and non-contrast volumes, supported by nnU-Net preprocessing and ATLAS-based pretraining. Results on the LiQA dataset show strong performance on GED4 (DSC ≈ 0.969) and competitive non-contrast segmentation (e.g., T1WI DSC ≈ 0.947, T2WI DSC ≈ 0.767), indicating cross-modality generalization and transferability across centers. The approach requires no spatial registration and highlights the value of combining foundation-model adaptation with semi-supervised learning for label-efficient medical image analysis in real-world clinical settings.
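For readers unfamiliar with the metric, DSC (Dice similarity coefficient) measures the overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The following is a minimal sketch of that standard definition in NumPy; the function name `dice_score` and the toy masks are illustrative assumptions, not the evaluation code used in the paper.

```python
import numpy as np

def dice_score(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)

# Toy 3D masks (hypothetical data): 4 voxels each, 3 of them shared.
pred = np.zeros((4, 4, 4), dtype=bool)
target = np.zeros((4, 4, 4), dtype=bool)
pred[0, 0, 0:4] = True
target[0, 0, 1:4] = True
target[0, 1, 0] = True
score = dice_score(pred, target)  # 2 * 3 / (4 + 4) = 0.75
```

A DSC of 1.0 means perfect overlap and 0.0 means no overlap, so the reported values of roughly 0.97 and 0.77 correspond to very different segmentation quality.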
Abstract
Accurate liver segmentation in multi-phase MRI is vital for liver fibrosis assessment, yet labeled data are often scarce and unevenly distributed across imaging modalities and vendor systems. We propose a label-efficient segmentation approach that promotes cross-modality generalization under real-world conditions, where GED4 hepatobiliary-phase annotations are limited, non-contrast sequences (T1WI, T2WI, DWI) are unlabeled, and spatial misalignment and missing phases are common. Our method integrates three components: a foundation-scale 3D segmentation backbone adapted via fine-tuning, co-training with cross pseudo supervision to leverage unlabeled volumes, and a standardized preprocessing pipeline. Without requiring spatial registration, the model learns to generalize across MRI phases and vendors, achieving robust segmentation performance in both labeled and unlabeled domains. Our results demonstrate the effectiveness of the proposed label-efficient baseline for liver segmentation in multi-phase, multi-vendor MRI and highlight the potential of combining foundation model adaptation with co-training for real-world clinical imaging tasks.
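The co-training idea can be illustrated with a small sketch. In cross pseudo supervision, two networks predict on the same unlabeled volume, and each is trained with a cross-entropy loss against the hard pseudo-labels (argmax) produced by the other. The NumPy code below shows only this loss term on random logits. The function names, tensor shapes, and toy data are assumptions made for illustration; the paper's actual backbone, loss weighting, and training loop are not reproduced here.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def cross_entropy(logits, labels):
    """Mean voxel-wise cross-entropy.

    logits: float array of shape (C, D, H, W)
    labels: integer array of shape (D, H, W) with values in [0, C)
    """
    probs = softmax(logits, axis=0)
    # Select the predicted probability of the target class at each voxel.
    picked = np.take_along_axis(probs, labels[None], axis=0)[0]
    return float(-np.log(picked + 1e-12).mean())

def cps_loss(logits_a, logits_b):
    """Cross pseudo supervision loss for one unlabeled volume.

    Each network is supervised by the other's hard pseudo-labels. In a
    real training framework the pseudo-labels are treated as constants
    (no gradient flows through the argmax).
    """
    pseudo_a = logits_a.argmax(axis=0)
    pseudo_b = logits_b.argmax(axis=0)
    return cross_entropy(logits_a, pseudo_b) + cross_entropy(logits_b, pseudo_a)

# Toy logits standing in for two networks' outputs on an unlabeled volume
# (2 classes: background and liver; hypothetical 4x8x8 volume).
rng = np.random.default_rng(0)
logits_a = rng.normal(size=(2, 4, 8, 8))
logits_b = rng.normal(size=(2, 4, 8, 8))
unsup_loss = cps_loss(logits_a, logits_b)
```

In a typical semi-supervised setup this term would be added, with some weighting factor, to the ordinary supervised loss computed on the labeled GED4 volumes. Because the loss depends only on the two networks' predictions for the same input, it needs neither labels nor registration between phases.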
