OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection

Max Gutbrod; David Rauber; Danilo Weber Nunes; Christoph Palm

TL;DR

OpenMIBOOD introduces three medical-imaging OOD benchmarks (MIDOG, PHAKIR, OASIS3) comprising 14 datasets to systematically evaluate post-hoc OOD detectors under covariate-shifted in-distribution (csID), near-OOD (nOOD), and far-OOD (fOOD) conditions. Using an OpenOOD-inspired framework, the study finds that methods that perform well on natural-image benchmarks do not generalize well to medical data, with feature-space approaches such as MDSEns and ViM outperforming probability-based methods in most cases. The authors provide standardized dataset splits, metrics (AUROC, FPR@95, AUPRIN/AUPROUT, and their harmonic mean), and a public codebase, and they report dataset-specific challenges that limit OOD detection in healthcare. The work emphasizes the need for domain-specific benchmarks for trustworthy AI in medicine and outlines extensions to segmentation tasks and to evaluation beyond classification.

Abstract

The growing reliance on Artificial Intelligence (AI) in critical domains such as healthcare demands robust mechanisms to ensure the trustworthiness of these systems, especially when faced with unexpected or anomalous inputs. This paper introduces the Open Medical Imaging Benchmarks for Out-Of-Distribution Detection (OpenMIBOOD), a comprehensive framework for evaluating out-of-distribution (OOD) detection methods specifically in medical imaging contexts. OpenMIBOOD includes three benchmarks from diverse medical domains, encompassing 14 datasets divided into covariate-shifted in-distribution, near-OOD, and far-OOD categories. We evaluate 24 post-hoc methods across these benchmarks, providing a standardized reference to advance the development and fair comparison of OOD detection methods. Results reveal that findings from broad-scale OOD benchmarks in natural image domains do not translate to medical applications, underscoring the critical need for such benchmarks in the medical field. By mitigating the risk of exposing AI models to inputs outside their training distribution, OpenMIBOOD aims to support the advancement of reliable and trustworthy AI systems in healthcare. The repository is available at https://github.com/remic-othr/OpenMIBOOD.
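To make two of the metrics named in the TL;DR concrete, the following is a minimal sketch of how AUROC and FPR@95 could be computed from per-sample detector scores. It is not taken from the OpenMIBOOD codebase. The function names `auroc` and `fpr_at_tpr` are hypothetical, and the sketch assumes that a higher score means "more likely OOD" and that OOD samples are the positive class; a detector that outputs confidence scores would need its scores negated first.

```python
import math


def auroc(id_scores, ood_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly drawn OOD sample receives a
    higher score than a randomly drawn ID sample (ties count as 0.5).
    Assumption: higher score = more likely OOD.
    """
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """False-positive rate at the threshold that flags at least `tpr` of OOD samples.

    Assumption: OOD is the positive class, so a false positive is an
    ID sample whose score reaches the threshold.
    """
    ood_sorted = sorted(ood_scores, reverse=True)
    # Number of OOD samples that must be flagged to reach the target TPR.
    k = math.ceil(tpr * len(ood_sorted))
    threshold = ood_sorted[k - 1]
    flagged_id = sum(1 for s in id_scores if s >= threshold)
    return flagged_id / len(id_scores)


# Toy scores, invented purely for illustration.
id_scores = [0.1, 0.2, 0.3, 0.4]
ood_scores = [0.35, 0.6, 0.7, 0.8]
print(auroc(id_scores, ood_scores))       # 0.9375
print(fpr_at_tpr(id_scores, ood_scores))  # 0.25
```

In the toy example, 15 of the 16 ID/OOD pairs are ranked correctly, giving an AUROC of 0.9375. Flagging at least 95% of the four OOD samples requires the threshold 0.35, which also flags one of the four ID samples, so FPR@95 is 0.25. The quadratic pairwise loop is chosen for readability; a rank-based implementation would be preferable for benchmark-sized score sets.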
