Are We Ready for Out-of-Distribution Detection in Digital Pathology?
Ji-Hun Oh, Kianoush Falahkheirkhah, Rohit Bhargava
TL;DR
This work tackles out-of-distribution (OOD) detection in digital pathology (DP) by defining two OOD types, semantic OOD and misclassified covariate OOD, and proposing an objective benchmark based on open-set recognition (OSR). It systematically evaluates a broad set of post-hoc detectors across diverse transfer learning (TL) strategies and backbone architectures (CNNs vs. transformers) on two public DP datasets, BreakHis and NCT-CRC. The key finding is that no single detector is best for both OOD types: ViM and related feature-space methods excel at semantic OOD detection (S-OODD), while simpler uncertainty-based measures fare relatively well at misclassified covariate OOD detection (MC-OODD); DP-specific TL and the choice of architecture offer nuanced gains. The study delivers reproducible protocols and practical guidelines for OOD robustness in DP, emphasizing that effective detection often depends on task and data characteristics rather than on a single best method, and it calls for future work to refine TL choices and detector designs for DP.
Abstract
The detection of semantic and covariate out-of-distribution (OOD) examples is a critical yet overlooked challenge in digital pathology (DP). The ML community has recently presented substantial insights and methods for OOD detection, but how do they fare in DP applications? To this end, we establish a benchmark study whose highlights are: 1) the adoption of proper evaluation protocols, 2) the comparison of diverse detectors in both single- and multi-model settings, and 3) the exploration of advanced ML settings such as transfer learning (ImageNet vs. DP pre-training) and choice of architecture (CNNs vs. transformers). Through comprehensive experiments, we contribute new insights and guidelines, paving the way for future research and discussion.
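
To make the notion of a post-hoc detector concrete, the following is a minimal sketch, not the benchmark's actual code. It assumes a classifier that outputs logits and scores each input with two common uncertainty-based measures, maximum softmax probability and negative entropy, then summarizes how well in-distribution (ID) and OOD inputs separate using AUROC. The function names and the toy logits are hypothetical and chosen only for illustration; which detectors and metrics the benchmark actually uses is not stated in this summary.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def msp_score(logits):
    """Maximum softmax probability; higher means 'more in-distribution'."""
    return softmax(logits).max(axis=1)

def neg_entropy_score(logits):
    """Negative predictive entropy; higher means 'more in-distribution'."""
    p = softmax(logits)
    return (p * np.log(p + 1e-12)).sum(axis=1)

def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample
    (ties count as half), computed by exhaustive pairwise comparison."""
    a = np.asarray(id_scores)[:, None]
    b = np.asarray(ood_scores)[None, :]
    return (a > b).mean() + 0.5 * (a == b).mean()

# Toy logits (hypothetical): confident ID predictions vs. near-flat OOD predictions.
id_logits = np.array([[5.0, 0.0, 0.0], [0.0, 6.0, 0.0]])
ood_logits = np.array([[1.0, 1.0, 1.0], [0.5, 0.4, 0.6]])

msp_auroc = auroc(msp_score(id_logits), msp_score(ood_logits))
ent_auroc = auroc(neg_entropy_score(id_logits), neg_entropy_score(ood_logits))
print(msp_auroc, ent_auroc)
```

Because both scores are computed from the logits of an already trained model, they require no retraining, which is what makes a detector post-hoc. The pairwise AUROC is quadratic in the number of samples and is meant only for small illustrations.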
