Out-of-distribution Detection in Medical Image Analysis: A survey
Zesheng Hong, Yubiao Yue, Yubin Chen, Lele Cong, Huanjie Lin, Yuanmei Luo, Mini Han Wang, Weidong Wang, Jialong Xu, Xiaoqi Yang, Hechang Chen, Zhenzhang Li, Sihong Xie
TL;DR
This survey addresses the critical need for trustworthy AI in medical imaging by formalizing OOD detection within clinical contexts. It introduces a tailored taxonomy of distributional shifts—contextual, semantic, and covariate—and maps a broad set of detection methods into a cohesive solution framework spanning post-hoc, training-based, and unsupervised approaches for both classification and segmentation. The authors review methodological developments, highlight paradigm-specific evaluation protocols, and discuss practical deployment considerations, including model reuse versus retraining. They also identify key gaps, such as semantic-shift detection in multi-label and cross-modality settings, and emphasize the necessity for standardized benchmarks and robust evaluation. Overall, the work offers a comprehensive roadmap for building reliable medical AI systems capable of abstaining from uncertain predictions and involving clinicians when necessary.
Abstract
Computer-aided diagnostics has benefited from the development of deep learning-based computer vision techniques in recent years. Traditional supervised deep learning methods assume that test samples are drawn from the same distribution as the training data. In real-world clinical scenarios, however, a model may encounter out-of-distribution samples, which can cause silent failures in deep learning-based medical image analysis tasks. Recent research has explored various out-of-distribution (OOD) detection settings and techniques with the aim of enabling trustworthy medical AI systems. In this survey, we systematically review recent advances in OOD detection in medical image analysis. We first explore several factors that may cause a distributional shift when a deep learning-based model is used in clinical scenarios, and we define three types of distributional shift on the basis of these factors. We then propose a framework to categorize and characterize existing solutions, and we review previous studies according to this methodological taxonomy. Our discussion also covers evaluation protocols and metrics, as well as open challenges and research directions that remain underexplored.
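To make the idea of post-hoc OOD detection with abstention concrete, the following is a minimal sketch of one widely used baseline, maximum softmax probability (MSP) thresholding. It is an illustration only and is not presented as the method proposed or recommended by the survey. The function names, the abstention label -1, and the threshold value 0.7 are assumptions chosen for the example; in practice the threshold would be tuned on held-out data.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def msp_score(logits):
    # Maximum softmax probability: higher values suggest in-distribution inputs.
    return softmax(np.asarray(logits, dtype=float)).max(axis=1)

def predict_or_abstain(logits, threshold=0.7):
    # Hypothetical helper: return the predicted class, or -1 (abstain)
    # when the MSP score falls below the assumed threshold.
    logits = np.asarray(logits, dtype=float)
    scores = msp_score(logits)
    preds = logits.argmax(axis=1)
    return np.where(scores >= threshold, preds, -1), scores

# Example logits from an already-trained classifier (made-up values).
example_logits = [[5.0, 0.0, 0.0],   # confident prediction
                  [0.1, 0.0, 0.1]]   # near-uniform, low confidence
decisions, scores = predict_or_abstain(example_logits)
```

Because the score is computed from the logits of a trained classifier, this kind of detector reuses the model without retraining; samples flagged with -1 could be routed to a clinician for review.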
