Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging
Dovile Juodelyte, Yucheng Lu, Amelia Jiménez-Sánchez, Sabrina Bottazzi, Enzo Ferrante, Veronika Cheplygina
TL;DR
This paper examines how the domain of the pretraining source dataset affects generalization in medical imaging, arguing that cross-domain transfer can foster shortcut learning. It introduces MICCAT, a taxonomy of contextualized confounders, and an experimental design that compares ImageNet and RadImageNet pretraining under controlled confounders on chest X-ray and CT tasks. The key finding is that the two sources reach comparable i.i.d. classification performance, but RadImageNet-pretrained models are more robust when confounders shift out of distribution, so the choice of source domain shapes robustness in ways that accuracy alone does not reveal. The work advocates confounder-aware evaluation of transfer learning in clinical settings and releases public code to support such robustness assessments.
Abstract
Transfer learning has become an essential part of medical imaging classification algorithms, often leveraging ImageNet weights. The domain shift from natural to medical images has prompted alternatives such as RadImageNet, often showing comparable classification performance. However, it remains unclear whether the performance gains from transfer learning stem from improved generalization or shortcut learning. To address this, we conceptualize confounders by introducing the Medical Imaging Contextualized Confounder Taxonomy (MICCAT) and investigate a range of confounders across it, whether synthetic or sampled from the data, using two public chest X-ray and CT datasets. We show that ImageNet and RadImageNet achieve comparable classification performance, yet ImageNet is much more prone to overfitting to confounders. We recommend that researchers using ImageNet-pretrained models reexamine their model robustness by conducting similar experiments. Our code and experiments are available at https://github.com/DovileDo/source-matters.
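
To make the idea of a controlled synthetic confounder concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not specify which confounders are used, so the corner patch, the function name `add_patch_confounder`, and the parameter `p_aligned` are all hypothetical choices made for illustration. The sketch builds a training set in which the patch perfectly agrees with the label and an out-of-distribution set in which that agreement is reversed, which is the kind of setup in which a model relying on the shortcut would be expected to fail.

```python
import numpy as np


def add_patch_confounder(images, labels, p_aligned, rng, size=4, value=1.0):
    """Stamp a square patch in the top-left corner of some images.

    Hypothetical illustration: the patch's presence agrees with the binary
    label (patch <=> label == 1) with probability p_aligned per sample.
    Returns a modified copy of the images and a boolean mask of patched samples.
    """
    images = images.copy()
    aligned = rng.random(len(labels)) < p_aligned
    # Aligned samples get the patch iff label == 1; the rest iff label == 0.
    has_patch = np.where(aligned, labels == 1, labels == 0)
    images[has_patch, :size, :size] = value
    return images, has_patch


rng = np.random.default_rng(0)
# Toy stand-in for grayscale scans, with intensities kept below the patch value.
images = rng.random((200, 32, 32)) * 0.5
labels = rng.integers(0, 2, size=200)

# Training split: the patch is a perfect shortcut for the label.
train_x, train_patch = add_patch_confounder(images, labels, p_aligned=1.0, rng=rng)
# Out-of-distribution split: the shortcut now points to the wrong class.
ood_x, ood_patch = add_patch_confounder(images, labels, p_aligned=0.0, rng=rng)
```

Under these assumptions, one would fine-tune each pretrained model on `train_x` and compare its performance on an unconfounded or aligned test split against `ood_x`; a large drop suggests the model learned the patch rather than the pathology.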
