Towards Optimal Feature-Shaping Methods for Out-of-Distribution Detection
Qinyu Zhao, Ming Xu, Kartik Gupta, Akshay Asthana, Liang Zheng, Stephen Gould
TL;DR
This work addresses the fragility of state-of-the-art feature-shaping methods for out-of-distribution (OOD) detection, whose hand-designed rules often fail to transfer across model architectures. It introduces a general optimization framework for shaping penultimate-layer features, instantiates it with a piecewise-constant shaping function, and shows that existing methods approximate the optimal solution of the resulting problem. It also derives a closed-form solution for the shaping function that requires only in-distribution (ID) data. Empirically, the ID-only shaping method generalizes across backbones (ConvNets, ViT, MLP) and datasets, outperforming many baselines and retaining its gains in settings where previous methods fail. The approach offers a practical, architecture-agnostic way to improve OOD detection in real-world systems: the shaping function is fitted using only ID data and then applied to features at inference time.
Abstract
Feature shaping refers to a family of methods that exhibit state-of-the-art performance for out-of-distribution (OOD) detection. These approaches manipulate the feature representation, typically from the penultimate layer of a pre-trained deep learning model, so as to better differentiate between in-distribution (ID) and OOD samples. However, existing feature-shaping methods usually employ rules manually designed for specific model architectures and OOD datasets, which consequently limit their generalization ability. To address this gap, we first formulate an abstract optimization framework for studying feature-shaping methods. We then propose a concrete reduction of the framework with a simple piecewise constant shaping function and show that existing feature-shaping methods approximate the optimal solution to the concrete optimization problem. Further, assuming that OOD data is inaccessible, we propose a formulation that yields a closed-form solution for the piecewise constant shaping function, utilizing solely the ID data. Through extensive experiments, we show that the feature-shaping function optimized by our method improves the generalization ability of OOD detection across a large variety of datasets and model architectures.
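
As a rough illustration of what applying a piecewise-constant shaping function at inference time could look like, the sketch below partitions feature values into bins, scales each feature by the constant assigned to its bin, and then scores the sample from the resulting logits. Everything specific here is an assumption made for illustration and is not taken from the paper: the helper names `shape_features` and `energy_score`, the bin edges, the `theta` values, and the choice of a log-sum-exp (energy-style) score. In particular, `theta` is simply passed in; the sketch does not implement the closed-form ID-only solution.

```python
import numpy as np


def shape_features(z, bin_edges, theta):
    """Scale each feature by the constant of the bin its value falls into.

    z         : array of penultimate-layer features, shape (n_samples, n_features)
    bin_edges : increasing interior bin edges, shape (K - 1,)  (hypothetical values)
    theta     : one scaling constant per bin, shape (K,)        (hypothetical values)
    """
    z = np.asarray(z, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # np.digitize maps each value to a bin index in 0..K-1:
    # 0 for values below the first edge, K-1 for values at or above the last edge.
    bin_index = np.digitize(z, bin_edges)
    return theta[bin_index] * z


def energy_score(features, weight, bias):
    """Log-sum-exp of the logits from the final linear layer.

    Assumed scoring rule for this sketch; higher values are treated as more ID-like.
    weight has shape (n_classes, n_features), bias has shape (n_classes,).
    """
    logits = features @ weight.T + bias
    # Subtract the row-wise maximum for numerical stability.
    row_max = logits.max(axis=-1, keepdims=True)
    lse = row_max + np.log(np.exp(logits - row_max).sum(axis=-1, keepdims=True))
    return lse.squeeze(-1)


def ood_score(z, bin_edges, theta, weight, bias):
    """Reshape the features, then score them with the final linear layer."""
    return energy_score(shape_features(z, bin_edges, theta), weight, bias)
```

With `theta` set to zero in the lowest bin and to a value below one in the highest bin, a function of this form suppresses small activations and damps very large ones. A threshold on the returned score, chosen for example on held-out ID data, would then separate ID from OOD inputs.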
