Table of Contents
Fetching ...

CORE: Robust Out-of-Distribution Detection via Confidence and Orthogonal Residual Scoring

Jin Mo Yang, Hyung-Sin Kim, Saewoong Bahk

Abstract

Out-of-distribution (OOD) detection is essential for deploying deep learning models reliably, yet no single method performs consistently across architectures and datasets -- a scorer that leads on one benchmark often falters on another. We attribute this inconsistency to a shared structural limitation: logit-based methods see only the classifier's confidence signal, while feature-based methods attempt to measure membership in the training distribution but do so in the full feature space where confidence and membership are entangled, inheriting architecture-sensitive failure modes. We observe that penultimate features naturally decompose into two orthogonal subspaces: a classifier-aligned component encoding confidence, and a residual the classifier discards. We discover that this residual carries a class-specific directional signature for in-distribution data -- a membership signal invisible to logit-based methods and entangled with noise in feature-based methods. We propose CORE (COnfidence + REsidual), which disentangles the two signals by scoring each subspace independently and combines them via normalized summation. Because the two signals are orthogonal by construction, their failure modes are approximately independent, producing robust detection where either view alone is unreliable. CORE achieves competitive or state-of-the-art performance across five architectures and five benchmark configurations, ranking first in three of five settings and achieving the highest grand average AUROC with negligible computational overhead.

CORE: Robust Out-of-Distribution Detection via Confidence and Orthogonal Residual Scoring

Abstract

Out-of-distribution (OOD) detection is essential for deploying deep learning models reliably, yet no single method performs consistently across architectures and datasets -- a scorer that leads on one benchmark often falters on another. We attribute this inconsistency to a shared structural limitation: logit-based methods see only the classifier's confidence signal, while feature-based methods attempt to measure membership in the training distribution but do so in the full feature space where confidence and membership are entangled, inheriting architecture-sensitive failure modes. We observe that penultimate features naturally decompose into two orthogonal subspaces: a classifier-aligned component encoding confidence, and a residual the classifier discards. We discover that this residual carries a class-specific directional signature for in-distribution data -- a membership signal invisible to logit-based methods and entangled with noise in feature-based methods. We propose CORE (COnfidence + REsidual), which disentangles the two signals by scoring each subspace independently and combines them via normalized summation. Because the two signals are orthogonal by construction, their failure modes are approximately independent, producing robust detection where either view alone is unreliable. CORE achieves competitive or state-of-the-art performance across five architectures and five benchmark configurations, ranking first in three of five settings and achieving the highest grand average AUROC with negligible computational overhead.
Paper Structure (49 sections, 7 equations, 5 figures, 20 tables)

This paper contains 49 sections, 7 equations, 5 figures, 20 tables.

Figures (5)

  • Figure 1: Fragility of post-hoc OOD scorers across architectures and datasets. Catastrophic failures occur systematically for activation-shaping and feature-based methods on specific architectures (e.g., ViT/IN, Swin/IN). CORE (red) is consistently the most robust scorer across all settings.
  • Figure 2: Each column represents one scorer category: (a) logit-based energy score (far-OOD separates, near-OOD overlaps), (b) feature-based k-NN distance (R18/C100 separates, ViT/IN concentrates), (c) per-unit penultimate activations (CNN shows sparse OOD spikes above the ReAct threshold; ViT activations are symmetric with no clear ID/OOD separation). Top rows: settings where the approach succeeds; bottom rows: where it fails.
  • Figure 3: CORE decomposes features into classifier-aligned $z_\parallel$ and orthogonal residual $z_\perp$. The residual carries a membership signal that is more discriminative than confidence alone and invisible to logit-based methods.
  • Figure 4: Component scatter plots for three score combination methods on two representative settings (rows: R18/SVHN, R50/ImageNet-O). Left: CORE's confidence ($S_\text{conf}'$) and membership ($S_\text{mem}'$) are weakly correlated ($r{=}0.30$, $0.40$); ID samples (blue) cluster at high values on both axes while OOD samples (red) scatter across complementary failure regions. Center: NNGuide's energy and $k$-NN signals are entangled ($r{=}0.75$ on R18) and overlap substantially. Right: ComboOOD's Mahalanobis component collapses on CNN features (ID and OOD overlap entirely on the $x$-axis), and its $k$-NN component inherits hubness on ImageNet.
  • Figure 5: Combination weight $\alpha$ ablation. $S = \alpha \, S_\text{conf}' + (1{-}\alpha) \, S_\text{mem}'$. All settings peak near $\alpha{=}0.5$ with a wide plateau.