Test-Time Modality Generalization for Medical Image Segmentation
Ju-Hyeon Nam, Sang-Chul Lee
TL;DR
The paper tackles the generalization of medical image segmentation to unseen modalities. It introduces Test-Time Modality Generalization (TTMG), a framework that combines Modality-Aware Style Projection (MASP), which maps features from an unseen modality onto the style distributions of seen modalities, with Modality-Sensitive Instance Whitening (MSIW), which selectively whitens modality-sensitive covariances. MASP uses a modality classifier and a prototype-style bases bank to project features, while MSIW computes the variance of feature covariances to isolate modality-sensitive channels and applies whitening only to them. Both are trained with a multi-term loss that preserves content and modality distinctions. Empirical results across 11 datasets spanning colonoscopy, ultrasound, dermoscopy, and radiology show that TTMG achieves superior unseen-modality segmentation without extra training, outperforming traditional DG methods while remaining efficient. The work points to a practical, scalable way to deploy a single model across diverse clinical settings, potentially reducing data curation and retraining needs in real-world medical imaging tasks.
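The selective whitening idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `top_ratio` threshold, the use of one covariance matrix per modality, and the L1 penalty are all assumptions made for illustration, loosely following the general pattern of instance selective whitening.

```python
import numpy as np


def instance_covariance(feat, eps=1e-5):
    """Channel covariance of one instance after per-channel standardization.

    feat: array of shape (C, N), C channels and N spatial positions.
    """
    mean = feat.mean(axis=1, keepdims=True)
    std = feat.std(axis=1, keepdims=True)
    z = (feat - mean) / (std + eps)
    return z @ z.T / feat.shape[1]


def modality_sensitive_mask(covs, top_ratio=0.25):
    """Mark the off-diagonal covariance entries that vary most across modalities.

    covs: array of shape (M, C, C), one covariance matrix per modality.
    top_ratio is a hypothetical hyperparameter, not a value from the paper.
    """
    var = covs.var(axis=0)                      # variance of each entry over modalities
    c = var.shape[0]
    upper = np.triu(np.ones((c, c), dtype=bool), k=1)  # strict upper triangle
    k = max(1, int(round(top_ratio * upper.sum())))
    threshold = np.sort(var[upper])[-k]         # k-th largest variance
    return upper & (var >= threshold)


def whitening_loss(cov, mask):
    """L1 penalty that pushes only the masked covariance entries toward zero."""
    return np.abs(cov[mask]).mean()


# Toy example: three "modalities", each with a 4-channel, 64-position feature map.
rng = np.random.default_rng(0)
feats = [rng.normal(size=(4, 64)) for _ in range(3)]
covs = np.stack([instance_covariance(f) for f in feats])
mask = modality_sensitive_mask(covs)
loss = whitening_loss(covs[0], mask)
```

Restricting the penalty to the masked entries is the point of the sketch: covariances that stay stable across modalities are left untouched, so only the entries assumed to carry modality-specific information are suppressed.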
Abstract
Generalizable medical image segmentation is essential for ensuring consistent performance across diverse unseen clinical settings. However, existing methods often overlook the capability to generalize effectively across arbitrary unseen modalities. In this paper, we introduce a novel Test-Time Modality Generalization (TTMG) framework, which comprises two core components, Modality-Aware Style Projection (MASP) and Modality-Sensitive Instance Whitening (MSIW), designed to enhance generalization to datasets from arbitrary unseen modalities. MASP estimates the likelihood of a test instance belonging to each seen modality and maps it onto a distribution using modality-specific style bases, guiding its projection effectively. Furthermore, as high feature covariance hinders generalization to unseen modalities, MSIW is applied during training to selectively suppress modality-sensitive information while retaining modality-invariant features. By integrating MASP and MSIW, the TTMG framework demonstrates robust generalization capabilities for medical image segmentation in unseen modalities, a challenge that current methods have largely neglected. We evaluated TTMG alongside other domain generalization techniques across eleven datasets spanning four modalities (colonoscopy, ultrasound, dermoscopy, and radiology), where it consistently achieved superior segmentation performance across various modality combinations.
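To make the projection step in the abstract concrete, the following NumPy sketch shows one way a test feature could be re-styled using modality likelihoods. It is an illustration under stated assumptions, not the paper's method: an AdaIN-style re-normalization is swapped in for the projection, each seen modality is reduced to a single hypothetical mean/std style base (`base_means`, `base_stds`), and the classifier output is represented by a plain vector of logits.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - logits.max()
    e = np.exp(shifted)
    return e / e.sum()


def project_style(feat, logits, base_means, base_stds, eps=1e-5):
    """Re-style a feature map with a likelihood-weighted mix of seen-modality styles.

    feat:       (C, N) feature map of one test instance.
    logits:     (M,) modality classifier scores for M seen modalities (assumed given).
    base_means: (M, C) hypothetical per-modality channel means.
    base_stds:  (M, C) hypothetical per-modality channel standard deviations.
    """
    probs = softmax(logits)                       # likelihood of each seen modality
    mean = feat.mean(axis=1, keepdims=True)
    std = feat.std(axis=1, keepdims=True)
    content = (feat - mean) / (std + eps)         # strip the instance's own style
    target_mean = (probs @ base_means)[:, None]   # weighted style statistics, (C, 1)
    target_std = (probs @ base_stds)[:, None]
    return content * target_std + target_mean, probs


# Toy example: a 4-channel feature from an unseen modality and 3 seen modalities.
rng = np.random.default_rng(0)
feat = rng.normal(loc=5.0, scale=3.0, size=(4, 64))
logits = np.array([2.0, 0.5, -1.0])
base_means = rng.normal(size=(3, 4))
base_stds = rng.uniform(0.5, 1.5, size=(3, 4))
projected, probs = project_style(feat, logits, base_means, base_stds)
```

After projection, the per-channel statistics of `projected` match the weighted combination of the seen-modality statistics, while the normalized spatial pattern of the input (its content) is unchanged.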
