Towards Optimal Feature-Shaping Methods for Out-of-Distribution Detection

Qinyu Zhao, Ming Xu, Kartik Gupta, Akshay Asthana, Liang Zheng, Stephen Gould

TL;DR

This work addresses the fragility of state-of-the-art feature-shaping methods for out-of-distribution (OOD) detection across diverse models. It introduces a general optimization framework for shaping penultimate-layer features and derives a concrete piecewise-constant shaping function that explains how existing methods operate. It also provides a novel closed-form solution that requires only in-distribution (ID) data. Empirically, the proposed ID-only shaping method generalizes robustly across backbones (ConvNets, ViT, MLP) and datasets, outperforming many baselines and maintaining gains where previous methods fail. The approach offers a practical, architecture-agnostic way to improve OOD detection in real-world systems, using only ID data for tuning and reshaping features at inference time.

Abstract

Feature shaping refers to a family of methods that exhibit state-of-the-art performance for out-of-distribution (OOD) detection. These approaches manipulate the feature representation, typically from the penultimate layer of a pre-trained deep learning model, so as to better differentiate between in-distribution (ID) and OOD samples. However, existing feature-shaping methods usually employ rules manually designed for specific model architectures and OOD datasets, which consequently limit their generalization ability. To address this gap, we first formulate an abstract optimization framework for studying feature-shaping methods. We then propose a concrete reduction of the framework with a simple piecewise constant shaping function and show that existing feature-shaping methods approximate the optimal solution to the concrete optimization problem. Further, assuming that OOD data is inaccessible, we propose a formulation that yields a closed-form solution for the piecewise constant shaping function, utilizing solely the ID data. Through extensive experiments, we show that the feature-shaping function optimized by our method improves the generalization ability of OOD detection across a large variety of datasets and model architectures.
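
To make the idea of a piecewise constant shaping function concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names `shape_features` and `max_logit_score`, the interval boundaries `edges`, and the per-interval constants `theta` are all illustrative assumptions, and the closed-form values the paper derives for the constants are not reproduced here. The sketch only shows the mechanics: each feature element is scaled by the constant assigned to the interval its value falls in, and an OOD score is then computed from the shaped features (here the maximum logit of a linear head, chosen for illustration).

```python
import numpy as np


def shape_features(z, edges, theta):
    """Scale each feature element by the constant of its value interval.

    z:     (n, d) array of penultimate-layer features
    edges: (K + 1,) increasing interval boundaries (hypothetical choice)
    theta: (K,) one scaling constant per interval (hypothetical values)
    """
    z = np.asarray(z, dtype=float)
    edges = np.asarray(edges, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # Using only the interior boundaries yields indices in [0, K - 1];
    # values below edges[0] or above edges[-1] fall into the first or
    # last interval respectively.
    idx = np.digitize(z, edges[1:-1])
    return theta[idx] * z


def max_logit_score(z, W, b):
    """Maximum logit of a linear classifier head (illustrative OOD score)."""
    return (z @ W.T + b).max(axis=1)


# Toy usage: three intervals, the last of which zeroes out large activations.
features = np.array([[0.1, 0.6, 2.0]])
edges = np.array([0.0, 0.5, 1.0, 1.5])
theta = np.array([1.0, 2.0, 0.0])
shaped = shape_features(features, edges, theta)
score = max_logit_score(shaped, np.eye(3), np.zeros(3))
```

In this toy setting the constants are hand-picked; in practice they would be chosen by an optimization over data, which is the part the paper addresses.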

Paper Structure

This paper contains 25 sections, 18 equations, 5 figures, 11 tables.

Figures (5)

  • Figure 1: Comparing our method with existing feature-shaping methods. The dashed lines denote the performance of our method for comparison. (a) ImageNet (ID) vs. iNaturalist (OOD) with ViT-B-16; (b) ImageNet (ID) vs. iNaturalist (OOD) with MLP-Mixer-B; (c) CIFAR100 (ID) vs. CIFAR10 (OOD) with MLP-Mixer-Nano; (d) Average performance of different methods across eight OOD datasets with two ConvNets and with four transformer-based models.
  • Figure 2: Visualization of shaping functions. The blue lines (ours w/ OOD) are derived from Eq. \ref{prob_opt_2}, while the green line (ours w/o OOD) is derived from Eq. \ref{eq_btheta_opt}. Red lines represent different existing methods, and shaded regions indicate estimated standard deviations. $\theta$ has been rescaled for clearer visualization.
  • Figure 3: Diagram illustrating the intuition behind the derivation of Eq. \ref{eq_our_problem}.
  • Figure 4: Compatibility and sensitivity analysis. (a-b) Our method can improve other OOD scores and methods. "Base" denotes using the original OOD score or method, while "+Our" indicates combining the score or method with our feature-shaping function. (c) Our method's performance with different hyperparameter settings, i.e., numbers of intervals $K$.
  • Figure 5: Empirical analysis to explain a specific form of the optimal shaping function.