WeiPer: OOD Detection using Weight Perturbations of Class Projections

Maximilian Granz, Manuel Heurich, Tim Landgraf

TL;DR

WeiPer is a simple post-hoc OOD detection method that perturbs the final-layer class projections to create an augmented logit space. It introduces two scoring approaches: MSP_W, which averages MSP over multiple perturbations, and WeiPer+KLD, a KL-divergence-based detector that compares each sample's activation densities, in both the penultimate layer and the perturbed logit space, with the mean densities over the training set, optionally combined with MSP_W. Across the CIFAR and ImageNet benchmarks of OpenOOD, WeiPer+KLD attains state-of-the-art performance on many near-OOD tasks, with substantial gains over strong baselines (notably on near-OOD ImageNet with ResNet50); MSP_W and WeiPer+ReAct also improve consistently. Limitations include additional hyperparameters and increased memory usage for large perturbation spaces, and ViT backbones can see diminished gains due to their lower penultimate-space dimensionality. Overall, WeiPer is a versatile post-hoc enhancement that uses structured class-projection perturbations to better separate ID and OOD distributions.
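The MSP_W idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `perturb_weights` and `msp_w` are hypothetical. The noise scheme is also an assumption: Gaussian noise rescaled row-wise so that each perturbation has norm `delta` times the norm of the weight row it is added to.

```python
import numpy as np

def perturb_weights(W, r, delta, rng):
    """Return r perturbed copies of the final-layer weights W (C, K) as (r, C, K).

    Assumed scheme: Gaussian noise, rescaled per row so that the added
    vector has norm delta * ||w|| for the corresponding weight row w.
    """
    noise = rng.standard_normal((r,) + W.shape)
    noise_norm = np.linalg.norm(noise, axis=-1, keepdims=True)  # (r, C, 1)
    w_norm = np.linalg.norm(W, axis=-1, keepdims=True)          # (C, 1)
    return W + delta * w_norm * noise / noise_norm

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def msp_w(z, W_tilde):
    """Average the maximum softmax probability over all perturbed logit spaces.

    z: penultimate activations, shape (N, K)
    W_tilde: perturbed weight matrices, shape (r, C, K)
    Returns one score per sample, shape (N,).
    """
    logits = np.einsum("rck,nk->nrc", W_tilde, z)  # (N, r, C)
    msp = softmax(logits, axis=-1).max(axis=-1)    # (N, r)
    return msp.mean(axis=-1)
```

With `delta=0` every perturbed matrix equals `W`, so the score reduces to plain MSP. Biases are omitted here for brevity.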

Abstract

Recent advances in out-of-distribution (OOD) detection on image data show that pre-trained neural network classifiers can separate in-distribution (ID) from OOD data well, leveraging the class-discriminative ability of the model itself. Methods have been proposed that either use logit information directly or that process the model's penultimate layer activations. With "WeiPer", we introduce perturbations of the class projections in the final fully connected layer which creates a richer representation of the input. We show that this simple trick can improve the OOD detection performance of a variety of methods and additionally propose a distance-based method that leverages the properties of the augmented WeiPer space. We achieve state-of-the-art OOD detection results across multiple benchmarks of the OpenOOD framework, especially pronounced in difficult settings in which OOD samples are positioned close to the training set distribution. We support our findings with theoretical motivations and empirical observations, and run extensive ablations to provide insights into why WeiPer works.

Paper Structure

This paper contains 29 sections, 1 theorem, 21 equations, 8 figures, 10 tables.

Key Result

Theorem A.1

Let X and Y be two $\mathbb{R}^K$-valued random vectors. Suppose the absolute moments $m_k := \mathbb{E}(\|X\|^k)$ are finite and $\sum_{k=1}^\infty (m_k)^{-1/k} = \infty$. If the set $W_{XY} = \{\boldsymbol w \in \mathbb{R}^K : \boldsymbol w^T X \,{{\overset{d}{=}}}\, \boldsymbol w^T Y \}$ has posi

Figures (8)

  • Figure 1: Why random perturbations? Left: We visualize densities of CIFAR10 (ID, blue) and CIFAR100 (OOD, red) as contour plots along the two logit dimensions spanned by $\boldsymbol w_0$ and $\boldsymbol w_1$, zoomed in on the positive cluster of class zero. The blue axis denotes the vector associated with that class, and one of its perturbations is depicted by the turquoise line. Right: When projecting the data onto both vectors, we obtain the densities shown in the top and bottom panels, respectively. The vertical blue lines mark the 5-percentile (highest 5%) of the true ID data (CIFAR10, blue). At this decision boundary, the classifier would produce false positives in the marked dashed red tail area. A single perturbation of the class-associated vector already reduces the false positive rate (FPR) from $1.34$% to $0.79$%. Visually, we confirm that OOD data mostly resides close to 0, extending into the positive cluster in a particular conical shape, which is exploited by the cone of WeiPer vectors.
  • Figure 2: WeiPer perturbs the weight vectors of $\boldsymbol{W}_\text{fc}$ by an angle controlled by $\delta$. For each weight, we construct $r$ perturbations resulting in $r$ weight matrices $\tilde{\boldsymbol W}_1,...,\tilde{\boldsymbol W}_r$. KLD: For WeiPer+KLD, we treat $z_1,...,z_k\sim p_{\boldsymbol z}$ and $w_{1,1}^T\boldsymbol{z},...,w_{r,C}^T\boldsymbol{z}\sim p_{\tilde{W}\boldsymbol z}$ as samples of the same distribution induced by $z$ and $\tilde{W}z$, respectively. We approximate the densities with histograms and smooth the result with uniform kernel $T_{k_s}$. Afterwards, we compare the densities $T_{k_s}(q_{\boldsymbol{z}})$ with the mean distribution over the training samples $\mathbb{E}_{\boldsymbol z\in Z_\text{train}}( q_{\boldsymbol z})$ for $q_{\boldsymbol{z}}=p_{\boldsymbol{z}}$ and $q_{\boldsymbol{z}}=p_{\tilde{\boldsymbol W}\boldsymbol{z}}$, respectively. MSP: For a score function $S$ on the logit space $\mathbb{R}^C$, we define the perturbed score $S_W$ as the mean over all the perturbed logit spaces $\tilde{\boldsymbol{W}}\boldsymbol{z}$. We choose $S=\operatorname{MSP}$ and call the resulting detector $\operatorname{MSP}_W$.
  • Figure 3: Histogram of all 512 activations in the penultimate layer (left pair) and the activations in WeiPer space (right pair) of a ResNet18 trained on CIFAR10. We perturb the weight matrix $100$ times to produce a $10\cdot100 = 1000$-dimensional perturbed logit space. For each pair, the left panel shows the mean distribution over all samples (ID = CIFAR10, OOD = CIFAR100). The right panels show the distributions $p_{\boldsymbol z}$ and $p_{\tilde{\boldsymbol W}\boldsymbol{z}}$, respectively, for two randomly chosen samples with smoothing applied ($s_1=s_2=2$).
  • Figure 4: We investigate the effect of WeiPer hyperparameters $r$ and $\delta$ on the performance of the three postprocessors. The left pair shows results on CIFAR10, the right pair corresponds to ImageNet (using ResNet18 for both). Models were tested using their respective near OOD datasets. The panels corresponding to $\delta$ depict AUROC performance minus the initial AUROC performance at $\delta=0$. The graphs show the mean over 25 runs and the shaded area around them represents the value range (min to max) over those runs. All other parameters of the methods were fixed to the optimal setting. We also show the memory consumption by translating $r$ to the memory footprint (in GiB) of the perturbed latents (train, test and OOD sets).
  • Figure 5: We applied PCA to $Z_\text{train}$ of CIFAR10 and projected $Z_\text{train}$ (blue), $Z_\text{test}$ (purple) and $Z_\text{ood}$ (red, CIFAR100) onto the first 20 principal components. We observe density spikes in the first 10 dimensions, likely corresponding to the class clusters. Dimensions 10-19 exhibit less structure, as their densities appear to be Gaussian. Along these directions the ID and OOD data are more similar than along the first ten principal components.
  • ...and 3 more figures
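
The density comparison described in the Figure 2 caption can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' code. The names `sample_density` and `kld_score`, the fixed histogram range, the `eps` regularizer, and the choice of a one-directional KL divergence are all assumptions. Each row of `values` stands for one sample's activations, either the penultimate vector $\boldsymbol z$ or the flattened perturbed logits $\tilde{\boldsymbol W}\boldsymbol z$.

```python
import numpy as np

def sample_density(values, bins, value_range, kernel_size, eps=1e-8):
    """Turn each row of `values` (N, D) into a smoothed discrete density (N, bins).

    The D entries of a row are treated as samples of one distribution,
    binned into a histogram and smoothed with a uniform kernel.
    """
    edges = np.linspace(value_range[0], value_range[1], bins + 1)
    hist = np.stack([np.histogram(v, bins=edges)[0] for v in values]).astype(float)
    kernel = np.ones(kernel_size) / kernel_size
    hist = np.stack([np.convolve(h, kernel, mode="same") for h in hist])
    hist += eps  # assumed regularizer so the logarithm below stays finite
    return hist / hist.sum(axis=1, keepdims=True)

def kld_score(values, train_mean_density, bins, value_range, kernel_size):
    """KL divergence of each sample's density from the mean training density.

    Assumption: larger values indicate samples that look less like the
    training data. Only one KL direction is computed in this sketch.
    """
    q = sample_density(values, bins, value_range, kernel_size)
    p = train_mean_density
    return (q * np.log(q / p)).sum(axis=1)
```

A hypothetical usage: compute `train_mean_density = sample_density(z_train, bins, value_range, kernel_size).mean(axis=0)` once, then score test samples with `kld_score`. Applying the same two functions to the perturbed logits gives the second term that the caption describes.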

Theorems & Definitions (2)

  • Theorem A.1: Theorem
  • proof