Combine and Conquer: A Meta-Analysis on Data Shift and Out-of-Distribution Detection

Eduardo Dadalto, Florence Alberge, Pierre Duhamel, Pablo Piantanida

TL;DR

This paper proposes a quantile normalization that maps out-of-distribution (OOD) detection scores into p-values, effectively framing the problem as a multivariate hypothesis test, and creates a probabilistically interpretable criterion by mapping the final statistic into a distribution with known parameters.

Abstract

This paper introduces a universal approach to seamlessly combine out-of-distribution (OOD) detection scores. These scores encompass a wide range of techniques that leverage the self-confidence of deep learning models and the anomalous behavior of features in the latent space. Not surprisingly, combining such a varied population using simple statistics proves inadequate. To overcome this challenge, we propose a quantile normalization to map these scores into p-values, effectively framing the problem as a multivariate hypothesis test. Then, we combine these tests using established meta-analysis tools, resulting in a more effective detector with consolidated decision boundaries. Furthermore, we create a probabilistically interpretable criterion by mapping the final statistic into a distribution with known parameters. Through empirical investigation, we explore different types of shifts, each exerting varying degrees of impact on data. Our results demonstrate that our approach significantly improves overall robustness and performance across diverse OOD detection scenarios. Notably, our framework is easily extensible for future developments in detection scores and stands as the first to combine decision boundaries in this context. The code and artifacts associated with this work are publicly available at https://github.com/edadaltocg/detectors.
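
The pipeline described in the abstract (quantile normalization to p-values, then a meta-analysis combination mapped to a known distribution) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_ecdf`, `fisher_combine`, and `combined_p_value`, the synthetic reference scores, and the convention that lower scores are more anomalous are all assumptions made for illustration. Fisher's method is used as the combination rule, and its chi-squared null distribution assumes independent tests, which scores computed on the same input generally do not satisfy.

```python
import numpy as np
from scipy import stats


def fit_ecdf(reference_scores):
    """Build a map from a raw score to an empirical p-value.

    Assumed convention: low scores are anomalous, so the p-value is the
    (smoothed) fraction of in-distribution reference scores <= the new score.
    """
    ref = np.sort(np.asarray(reference_scores, dtype=float))
    n = ref.size

    def p_value(x):
        rank = np.searchsorted(ref, np.asarray(x, dtype=float), side="right")
        # +1 smoothing keeps p-values strictly positive (log is finite).
        return (rank + 1.0) / (n + 1.0)

    return p_value


def fisher_combine(p_values):
    """Fisher's method: -2 * sum(log p) ~ chi2 with 2k degrees of freedom
    under the null hypothesis, assuming the k tests are independent."""
    p = np.asarray(p_values, dtype=float)
    k = p.shape[-1]
    statistic = -2.0 * np.log(p).sum(axis=-1)
    return statistic, stats.chi2.sf(statistic, df=2 * k)


# Synthetic in-distribution reference scores for three hypothetical
# detectors with very different scales (illustrative data only).
rng = np.random.default_rng(0)
ref = np.stack(
    [
        rng.normal(0.0, 1.0, 5000),
        rng.exponential(10.0, 5000),
        rng.uniform(-100.0, 100.0, 5000),
    ],
    axis=1,
)
ecdfs = [fit_ecdf(ref[:, j]) for j in range(ref.shape[1])]


def combined_p_value(scores):
    """Normalize each score to a p-value, then combine with Fisher's method."""
    scores = np.atleast_2d(np.asarray(scores, dtype=float))
    p = np.stack([f(scores[:, j]) for j, f in enumerate(ecdfs)], axis=1)
    return fisher_combine(p)[1]


typical = np.median(ref, axis=0)    # looks in-distribution on every score
atypical = ref.min(axis=0) - 1.0    # below every reference score
print(combined_p_value(typical), combined_p_value(atypical))
```

Because every score is first mapped to the same [0, 1] scale, detectors with incompatible ranges can be combined without hand-tuned weights, and the final p-value can be thresholded at a chosen significance level.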

Paper Structure (24 sections, 14 equations, 12 figures, 5 tables, 1 algorithm)

Figures (12)

  • Figure 1: Illustration of the three steps of the Combine and Conquer algorithm. This example shows three disparate score functions evaluated on in-distribution data. Our main experiments combine 14 scores.
  • Figure 2: Distributional behavior of the test statistic and detection performance as a function of the novelty shift intensity and window size. Experiments were run with Fisher's method on a ResNet-50.
  • Figure 3: Data stream monitoring with correlation $\rho=0.98$
  • Figure 4: Ranking in terms of AUROC for a few selected methods for the ResNet-50 model. Note that the two displayed methods for combining tests obtain a top-5 ranking on every dataset, while state-of-the-art individual detectors vary significantly in performance.
  • Figure 5: Independent data shift (OpenImage-O) detection performance on a ResNet-50 model (ImageNet).
  • ...and 7 more figures

Theorems & Definitions (3)

  • Definition 3.1: Data stream
  • Remark 1
  • Remark 2