AdaSCALE: Adaptive Scaling for OOD Detection

Sudarshan Regmi

TL;DR

AdaSCALE addresses out-of-distribution (OOD) detection with an adaptive, per-sample scaling scheme that uses minor input perturbations to quantify a sample's OODness. The method computes a sample-specific percentile threshold, $p = p_{\min} + (1 - F_{Q'}(Q'))(p_{\max} - p_{\min})$, and derives from it a scaling factor $r$ that modulates either activations (AdaSCALE-A) or logits (AdaSCALE-L). Its two core contributions are the activation-perturbation-based OODness metric $Q' = \lambda Q + C_o$ and the adaptive percentile mechanism, which together yield state-of-the-art OOD detection on the ImageNet-1k and CIFAR benchmarks with minimal reliance on in-distribution (ID) statistics. The approach generalizes across architectures, remains robust under corruptions and adversarial training, and is practical for large-scale deployment with limited ID data. Overall, AdaSCALE advances post-hoc OOD detection by coupling activation-space signals with adaptive, per-sample scaling, with the aim of safer and more reliable predictions in deployed systems.
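To make the percentile formula concrete, here is a minimal NumPy sketch. Several things in it are assumptions rather than details stated above: $F_{Q'}$ is taken to be an empirical CDF over $Q'$ values collected from ID samples, the scaling factor $r$ is written as a SCALE-style ratio (sum of all activations over the sum of those above the $p$-th percentile), and the function names and the values of `p_min` and `p_max` are purely illustrative. It is not the authors' implementation.

```python
import numpy as np


def empirical_cdf(reference, value):
    """Fraction of reference values <= value (assumed form of F_{Q'})."""
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(reference <= value))


def adaptive_percentile(q_prime, id_q_primes, p_min, p_max):
    """p = p_min + (1 - F_{Q'}(Q')) * (p_max - p_min)."""
    f = empirical_cdf(id_q_primes, q_prime)
    return p_min + (1.0 - f) * (p_max - p_min)


def scaling_factor(activations, p):
    """Assumed SCALE-style ratio: total activation mass divided by the
    mass of activations at or above the p-th percentile.
    Expects non-negative activations with at least one positive entry."""
    a = np.asarray(activations, dtype=float)
    threshold = np.percentile(a, p)
    kept = a[a >= threshold]
    return float(a.sum() / kept.sum())


if __name__ == "__main__":
    # Hypothetical Q' values gathered from a handful of ID samples.
    id_q_primes = [0.1, 0.2, 0.3, 0.4]
    activations = [1.0, 2.0, 3.0, 4.0]

    # A low Q' (ID-like) pushes p toward p_max; a high Q' pushes it toward p_min.
    for q_prime in (0.05, 0.2, 0.5):
        p = adaptive_percentile(q_prime, id_q_primes, p_min=80.0, p_max=90.0)
        print(q_prime, p, scaling_factor(activations, p))
```

Under these assumptions a sample with low $Q'$ receives a percentile near $p_{\max}$, so fewer activations remain in the denominator and $r$ is larger, while a sample with high $Q'$ receives a percentile near $p_{\min}$ and a smaller $r$.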

Abstract

The ability of a deep learning model to recognize when a sample falls outside its learned distribution is critical for safe and reliable deployment. Recent state-of-the-art out-of-distribution (OOD) detection methods leverage activation shaping to improve the separation between in-distribution (ID) and OOD inputs. These approaches resort to sample-specific scaling but apply a static percentile threshold across all samples regardless of their nature, resulting in suboptimal ID-OOD separability. In this work, we propose AdaSCALE, an adaptive scaling procedure that dynamically adjusts the percentile threshold based on a sample's estimated OOD likelihood. This estimation leverages our key observation: under minor perturbation, OOD samples exhibit significantly more pronounced activation shifts at high-magnitude activations than ID samples do. AdaSCALE enables stronger scaling for likely ID samples and weaker scaling for likely OOD samples, yielding highly separable energy scores. Our approach achieves state-of-the-art OOD detection performance, outperforming the latest rival, OptFS, by 14.94% on near-OOD and 21.67% on far-OOD datasets in average FPR@95 on the ImageNet-1k benchmark across eight diverse architectures. The code is available at: https://github.com/sudarshanregmi/AdaSCALE/
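For readers unfamiliar with the quantities named in the abstract, the sketch below shows the conventional definitions of an energy score (log-sum-exp of the logits, higher meaning more ID-like) and of FPR@95 (the fraction of OOD samples accepted at the threshold that keeps 95% of ID samples). These are the standard formulations as we understand them, not code from the AdaSCALE repository, and the function names are hypothetical.

```python
import numpy as np


def energy_score(logits):
    """Numerically stable log-sum-exp over the class axis.
    Higher scores are treated as more in-distribution."""
    z = np.asarray(logits, dtype=float)
    m = z.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(z - m).sum(axis=-1, keepdims=True))).squeeze(-1)


def fpr_at_95_tpr(id_scores, ood_scores):
    """False positive rate on OOD data at the threshold that
    retains roughly 95% of ID samples (scores at or above the 5th percentile)."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    threshold = np.percentile(id_scores, 5)
    return float(np.mean(ood_scores >= threshold))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic logits only: the ID-like batch has one dominant class per row.
    id_logits = rng.normal(size=(1000, 10))
    id_logits[:, 0] += 6.0
    ood_logits = rng.normal(size=(1000, 10))
    print(fpr_at_95_tpr(energy_score(id_logits), energy_score(ood_logits)))
```

A lower FPR@95 means fewer OOD samples slip past the threshold, which is why better-separated energy scores translate directly into an improvement on this metric.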
