Domain Generalization for Medical Image Analysis: A Review
Jee Seok Yoon, Kwanseok Oh, Yooseung Shin, Maciej A. Mazurowski, Heung-Il Suk
TL;DR
The paper tackles the challenge of domain shift in medical image analysis (MedIA) by providing a system-wide review of domain generalization (DG) techniques tailored to MedIA. It introduces a four-level taxonomy (data-level, feature-level, model-level, and analysis-level) and maps these methods onto the full MedIA workflow, from data acquisition to analysis. The authors analyze the strengths and limitations of each approach, discuss extreme source/target constraints, and propose future directions including medical foundation models and standardized benchmarks. The work aims to guide researchers and engineers in building robust, transferable MedIA systems with practical clinical impact. Overall, it highlights the importance of integrating DG across all stages of MedIA to improve reliability, safety, and generalizability in diverse clinical environments.
Abstract
Medical image analysis (MedIA) has become an essential tool in medicine and healthcare, aiding in disease diagnosis, prognosis, and treatment planning, and recent successes in deep learning (DL) have contributed significantly to its advances. However, deploying DL models for MedIA in real-world settings remains challenging because they often fail to generalize across the distributional gap between training and testing samples, a problem known as domain shift. Researchers have therefore developed various DL methods that adapt to and perform robustly on unknown and out-of-distribution (OOD) data. This article comprehensively reviews domain generalization (DG) studies specifically tailored for MedIA. We provide a holistic view of how DG techniques interact within the broader MedIA system, going beyond methodologies to consider the operational implications for the entire MedIA workflow. Specifically, we categorize DG methods into data-level, feature-level, model-level, and analysis-level methods. We show how these methods can be used at various stages of a DL-equipped MedIA workflow, from data acquisition to model prediction and analysis. Furthermore, we critically analyze the strengths and weaknesses of the various methods, unveiling future research opportunities.
