GradPCA: Leveraging NTK Alignment for Reliable Out-of-Distribution Detection

Mariia Seleznova, Hung-Hsu Chou, Claudio Mayrink Verdun, Gitta Kutyniok

TL;DR

This work introduces GradPCA, a principled spectral OOD detector that exploits the NTK-aligned, low-rank structure of neural network gradients. By applying PCA to gradient class-means, GradPCA effectively captures a class-specific subspace in gradient space, enabling reliable OOD detection with strong theoretical grounding. The authors provide a spectral OOD framework, detailing sufficient and necessary conditions, and demonstrate GradPCA’s robust, near state-of-the-art performance across CIFAR and ImageNet benchmarks, while highlighting the critical role of feature quality (pretrained vs non-pretrained representations). The work also offers practical guidance for detector design, emphasizes consistency over ad hoc performance, and releases open-source code to foster reproducibility and further research in principled OOD detection. Overall, GradPCA bridges NTK theory with spectral OOD detection, enabling more reliable and interpretable detection in real-world deep learning systems.

Abstract

We introduce GradPCA, an Out-of-Distribution (OOD) detection method that exploits the low-rank structure of neural network gradients induced by Neural Tangent Kernel (NTK) alignment. GradPCA applies Principal Component Analysis (PCA) to gradient class-means, achieving more consistent performance than existing methods across standard image classification benchmarks. We provide a theoretical perspective on spectral OOD detection in neural networks to support GradPCA, highlighting feature-space properties that enable effective detection and naturally emerge from NTK alignment. Our analysis further reveals that feature quality -- particularly the use of pretrained versus non-pretrained representations -- plays a crucial role in determining which detectors will succeed. Extensive experiments validate the strong performance of GradPCA, and our theoretical framework offers guidance for designing more principled spectral OOD detectors.
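
To make the description above more concrete, here is a minimal NumPy sketch of applying PCA to gradient class-means and scoring a sample by how much of its gradient falls outside the resulting subspace. It is an illustration under stated assumptions, not the authors' implementation: the per-sample gradients are assumed to be precomputed as rows of an array, the toy data is synthetic, the class means are not centered before the SVD, and the names `fit_class_mean_subspace` and `ood_score` are hypothetical.

```python
import numpy as np

def fit_class_mean_subspace(grads, labels, n_components=None):
    """Orthonormal basis (rows) for the span of the per-class mean gradients."""
    classes = np.unique(labels)
    means = np.stack([grads[labels == c].mean(axis=0) for c in classes])  # (C, P)
    # Assumption of this sketch: uncentered PCA, i.e. a plain SVD of the class means.
    _, s, vt = np.linalg.svd(means, full_matrices=False)
    if n_components is None:
        n_components = int(np.sum(s > 1e-10 * s[0]))  # numerical rank
    return vt[:n_components]

def ood_score(basis, g):
    """Fraction of the squared norm of g lying outside the subspace (higher = more OOD-like)."""
    coeffs = basis @ g  # coordinates of g in the orthonormal basis
    return float(1.0 - (coeffs @ coeffs) / (g @ g))

# Synthetic stand-in for gradients: each class clusters around its own direction.
rng = np.random.default_rng(0)
P, C, n_per_class = 20, 3, 50
directions = rng.normal(size=(C, P))
labels = np.repeat(np.arange(C), n_per_class)
grads = 5.0 * directions[labels] + 0.05 * rng.normal(size=(C * n_per_class, P))

basis = fit_class_mean_subspace(grads, labels)
id_like = 5.0 * directions[1] + 0.05 * rng.normal(size=P)  # resembles class 1
ood_like = rng.normal(size=P)                              # unrelated direction
print(ood_score(basis, id_like), ood_score(basis, ood_like))
```

In this toy setting the ID-like vector lies almost entirely inside the class-mean subspace, so its score is close to zero, while a random direction in 20 dimensions keeps most of its norm outside a 3-dimensional subspace.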

Paper Structure

This paper contains 70 sections, 10 theorems, 41 equations, 1 figure, 9 tables, 1 algorithm.

Key Result

Theorem 4.1

Let ${X}\sim\mu_{\textup{id}}$ and ${h}:\mathcal{X}\to\mathbb{R}^{P}$ be any function in $L^2(\mu_{\textup{id}})$. Consider the covariance matrix $$\mathbf{S}({h}) := \mathbb{E}\big[{h}({X})\,{h}({X})^{\top}\big] \in \mathbb{R}^{P\times P}.$$ Let $\mathcal{P}{h}({x})$ be the orthogonal projection of ${h}({x})$ onto the range of $\mathbf{S}({h})$. For any ${x}\in\mathcal{X}$, if $\|\mathcal{P}{h}({x})\|_2<\|{h}({x})\|_2$ and ${h}$ is continuous at ${x}$, then ${x}$ is OOD.
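
The norm condition in the theorem can be checked numerically on toy data. The sketch below assumes that $\mathbf{S}({h})$ is the uncentered second-moment matrix of ${h}({X})$, estimated from samples, and builds the projector onto its range from the eigenvectors with non-negligible eigenvalues. The data, the tolerances, and the name `violates_id_subspace` are hypothetical, and the continuity requirement on ${h}$ is not checked.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ID features: h(X) is confined to a 2-D subspace of R^5.
A = rng.normal(size=(5, 2))
H = rng.normal(size=(1000, 2)) @ A.T  # rows are samples h(X_i)

# Empirical second-moment matrix (assumption: uncentered).
S = H.T @ H / len(H)

# Orthogonal projector onto range(S) from eigenvectors with non-negligible eigenvalues.
evals, evecs = np.linalg.eigh(S)
U = evecs[:, evals > 1e-10 * evals.max()]
proj = U @ U.T

def violates_id_subspace(h_x, tol=1e-8):
    """True if ||P h(x)|| < ||h(x)|| up to a relative tolerance."""
    return bool(np.linalg.norm(proj @ h_x) < (1.0 - tol) * np.linalg.norm(h_x))

h_id = A @ np.array([1.0, -2.0])  # lies in the ID subspace
h_ood = rng.normal(size=5)        # generic vector with an off-subspace component
print(violates_id_subspace(h_id), violates_id_subspace(h_ood))
```

The tolerance is there only to absorb floating-point error; the theorem itself uses a strict inequality.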

Figures (1)

  • Figure 1: Performance comparison of OOD detection methods across multiple settings. For each method, the left bar shows the stacked average AUC scores ($\uparrow$) on the six benchmarks described in Section \ref{sec:experiments}, in order from bottom to top: 1) CIFAR-10 BiT-M (pretrained, Table \ref{table:cifar10}), 2) CIFAR-10 TIMM (Table \ref{table:cifar10}), 3) CIFAR-100 BiT-M (pretrained, Table \ref{table:cifar100}), 4) CIFAR-100 TIMM (Table \ref{table:cifar100}), 5) ImageNet BiT-M (pretrained, Table \ref{table:imagenet}), 6) ImageNet BiT-S (Table \ref{table:imagenet}). The middle bar shows the stacked average FPR95 scores ($\downarrow$) for each method. The right bar shows the stacked estimates of runtime per sample (Table \ref{table:runtime}). The methods are ordered left to right by average AUC score.
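
For readers unfamiliar with the two metrics in the caption, the following sketch computes AUC and FPR95 from detector scores. It assumes a common convention that the source does not spell out: a higher score means "more OOD-like", ID samples count as the positive class, and FPR95 is the fraction of OOD samples still accepted as ID at the threshold that accepts 95% of ID samples. The function names and the example scores are hypothetical.

```python
import numpy as np

def auroc(id_scores, ood_scores):
    """Probability that a random OOD sample scores above a random ID sample (ties count half)."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    diff = ood_scores[:, None] - id_scores[None, :]  # all OOD/ID pairs
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))

def fpr_at_95_tpr(id_scores, ood_scores):
    """Fraction of OOD samples at or below the 95th percentile of the ID scores."""
    threshold = np.quantile(np.asarray(id_scores, dtype=float), 0.95)
    return float(np.mean(np.asarray(ood_scores, dtype=float) <= threshold))

id_scores = [0.1, 0.2, 0.3, 0.4]
ood_scores = [0.35, 0.8, 0.9, 1.0]
print(auroc(id_scores, ood_scores), fpr_at_95_tpr(id_scores, ood_scores))
```

The pairwise AUC computation uses memory proportional to the product of the two sample counts, so it suits small illustrations rather than full benchmark runs.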

Theorems & Definitions (18)

  • Theorem 4.1: Sufficient condition for spectral OOD detection
  • Theorem 4.2: Robustness of PCA
  • Example 4.3: Best case
  • Example 4.4: Worst case
  • Theorem 4.5: Necessary condition for spectral OOD detection
  • Theorem B.1
  • proof
  • Theorem : Sufficient condition for spectral OOD detection
  • proof
  • Remark B.2
  • ...and 8 more