Improving Out-of-Distribution Detection by Combining Existing Post-hoc Methods
Paul Novello, Yannick Prudent, Joseba Dalmau, Corentin Friedrich, Yann Pequignot
TL;DR
The paper tackles the challenge that no single post-hoc OOD detector dominates across datasets by proposing to fuse many existing OOD scores. It introduces four multivariate combination strategies (Majority Vote, Empirical CDF, Copula-based CDF, and Center-Outward Quantiles) and extends evaluation metrics to multidimensional detectors. Through extensive experiments on OpenOOD across CIFAR-10/100 and ImageNet-200, it shows that score fusion consistently improves AUROC over the best individual detectors, and it provides practical guidelines for selecting combinations with or without access to OOD data, including Outlier Exposure. The approach is flexible, scalable to different tasks, and comes with open-source code to facilitate adoption in safety-critical applications. The work thus offers a principled, data-efficient route to more robust OOD detection by exploiting the complementary information carried by existing detectors.
Abstract
Since the seminal paper of Hendrycks et al. (arXiv:1610.02136), post-hoc deep Out-of-Distribution (OOD) detection has expanded rapidly. As a result, practitioners working on safety-critical applications and seeking to improve the robustness of a neural network now have a plethora of methods to choose from. However, no method outperforms every other on every dataset (arXiv:2210.07242), so the current best practice is to test all the methods on the datasets at hand. This paper shifts focus from developing new methods to effectively combining existing ones to enhance OOD detection. We propose and compare four different strategies for integrating multiple detection scores into a unified OOD detector, based on techniques such as majority vote, empirical and copula-based Cumulative Distribution Function modeling, and multivariate quantiles based on optimal transport. We extend common OOD evaluation metrics, such as AUROC and FPR at a fixed TPR, to these multi-dimensional OOD detectors, allowing us to evaluate them and compare them with individual methods on extensive benchmarks. Furthermore, we propose a series of guidelines for choosing which OOD detectors to combine in more realistic settings, i.e., in the absence of known OOD data, relying on principles drawn from Outlier Exposure (arXiv:1812.04606). The code is available at https://github.com/paulnovello/multi-ood.
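
To make the idea of combining scores concrete, here is a minimal Python sketch of two of the strategies named above, majority vote and an empirical joint CDF. It is an illustration under stated assumptions, not the implementation from the authors' repository: the function names, the convention that a higher score means "more OOD", the level alpha = 0.05, and the synthetic Gaussian scores are all hypothetical choices made for this example.

```python
import numpy as np


def fit_thresholds(id_scores, alpha=0.05):
    """Per-detector thresholds: the (1 - alpha) quantile of in-distribution scores.

    Assumption: higher score means more OOD, so roughly a fraction alpha of
    in-distribution samples exceeds each detector's threshold.
    """
    return np.quantile(id_scores, 1.0 - alpha, axis=0)


def majority_vote(scores, thresholds):
    """Flag a sample as OOD when strictly more than half of the detectors vote OOD."""
    votes = scores > thresholds  # shape (n_samples, n_detectors)
    return votes.sum(axis=1) > scores.shape[1] / 2


def empirical_cdf_score(scores, id_scores):
    """Joint empirical CDF of the in-distribution score vectors.

    For each test vector s, return the fraction of in-distribution vectors that
    are <= s in every coordinate. Values near 1 suggest OOD under the
    higher-is-more-OOD assumption.
    """
    # Broadcast to (n_samples, n_id, n_detectors), then require all coordinates.
    dominated = (id_scores[None, :, :] <= scores[:, None, :]).all(axis=2)
    return dominated.mean(axis=1)


# Synthetic stand-ins for the outputs of three individual OOD detectors.
rng = np.random.default_rng(0)
id_cal = rng.normal(0.0, 1.0, size=(500, 3))    # in-distribution calibration scores
id_test = rng.normal(0.0, 1.0, size=(200, 3))   # held-out in-distribution scores
ood_test = rng.normal(3.0, 1.0, size=(200, 3))  # shifted scores playing the role of OOD

thresholds = fit_thresholds(id_cal)
id_flags = majority_vote(id_test, thresholds)
ood_flags = majority_vote(ood_test, thresholds)
id_cdf = empirical_cdf_score(id_test, id_cal)
ood_cdf = empirical_cdf_score(ood_test, id_cal)

print("fraction flagged OOD (ID test): ", id_flags.mean())
print("fraction flagged OOD (OOD test):", ood_flags.mean())
print("mean joint-CDF score (ID, OOD): ", id_cdf.mean(), ood_cdf.mean())
```

Both combiners use only in-distribution calibration scores, which matches the setting without known OOD data. The majority vote returns a binary decision, whereas the joint-CDF score is a single scalar that can be thresholded afterwards.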
