Beyond the Class Subspace: Teacher-Guided Training for Reliable Out-of-Distribution Detection in Single-Domain Models

Hong Yang; Devroop Kar; Qi Yu; Travis Desell; Alex Ororbia

Abstract

Out-of-distribution (OOD) detection methods perform well on multi-domain benchmarks, yet many practical systems are trained on single-domain data. We show that this regime induces a geometric failure mode, Domain-Sensitivity Collapse (DSC): supervised training compresses features into a low-rank class subspace and suppresses directions that carry domain-shift signal. We provide theory showing that, under DSC, distance- and logit-based OOD scores lose sensitivity to domain shift. We then introduce Teacher-Guided Training (TGT), which distills class-suppressed residual structure from a frozen multi-domain teacher (DINOv2) into the student during training. The teacher and auxiliary head are discarded after training, adding no inference overhead. Across eight single-domain benchmarks, TGT yields large far-OOD FPR@95 reductions for distance-based scorers: MDS improves by 11.61 pp, ViM by 10.78 pp, and kNN by 12.87 pp (ResNet-50 average), while maintaining or slightly improving in-domain OOD and classification accuracy.
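The FPR@95 figures quoted above can be read as "the fraction of OOD samples still accepted as in-distribution when the threshold is set so that 95% of ID samples are accepted." A minimal sketch of that metric follows. It assumes that a higher score means "more in-distribution" (distance-based scorers such as MDS or kNN would need their distances negated first), and the function name `fpr_at_95_tpr` is hypothetical, not taken from the paper.

```python
import math


def fpr_at_95_tpr(id_scores, ood_scores, tpr=0.95):
    """False-positive rate on OOD data at a fixed true-positive rate on ID data.

    Assumption: higher score = more in-distribution.
    """
    if not id_scores or not ood_scores:
        raise ValueError("both score lists must be non-empty")

    # Rank ID scores from most to least ID-like.
    ranked = sorted(id_scores, reverse=True)

    # Smallest number of ID samples that must be accepted to reach the target TPR.
    k = math.ceil(tpr * len(ranked))
    threshold = ranked[k - 1]

    # OOD samples at or above the threshold are wrongly accepted as ID.
    false_positives = sum(1 for s in ood_scores if s >= threshold)
    return false_positives / len(ood_scores)


if __name__ == "__main__":
    id_scores = [float(i) for i in range(1, 21)]   # 20 toy ID scores
    ood_scores = [-5.0, -4.0, 10.0, 11.0]          # 4 toy OOD scores
    print(fpr_at_95_tpr(id_scores, ood_scores))
```

Lower is better: a reduction of 11.61 pp means that, at the same 95% ID acceptance rate, roughly 11.61 percentage points fewer OOD samples slip through.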

Paper Structure (73 sections, 4 theorems, 10 equations, 3 figures, 50 tables)

Introduction
Related Work
Out-of-Distribution Detection
Feature Geometry, Neural Collapse, and Representation Rank
Knowledge Distillation and Teacher--Student Frameworks
Fine-Tuning and OOD Detection
Domain-Sensitivity Collapse
Single-domain setting.
Mechanism and severity.
Connection to neural collapse.
Anisotropic Geometry and Distance Failure
Covariance decomposition.
Anisotropy measures.
Subspace decomposition.
Distance failure via variance--discriminability mismatch.
...and 58 more sections

Key Result

theorem 1

Let $\lambda_1 \ge \cdots \ge \lambda_d$ be the eigenvalues of $\mathrm{Cov}(z_{\mathrm{ID}})$ with eigenvectors $v_1,\ldots,v_d$. Suppose the ID-vs-OOD separation concentrates in a set of directions $\{v_j : j \in \mathcal{J}\}$ with $\lambda_j / \lambda_1 \le \rho$ for all $j \in \mathcal{J}$ and where $L_k \le 1$ is the Lipschitz constant of the $k$-NN distance statistic with respect to featur

Figures (3)

Figure 1: Per-dataset gains from Teacher-Guided Training on ResNet-50 relative to the CE baseline. Blue bars show effective-rank increase (TGT$-$CE), and hatched orange bars show FPR@95 reduction (CE$-$TGT; larger is better). The shown datasets are EuroSAT, Colon, Fashion, Tissue, and Rock (Rock shown in a separate $y$-region).
Figure 2: EuroSAT out-of-domain OOD (FarOOD) FPR@95 by method across teacher-guidance strengths $\lambda$, averaged over 5 random splits. The effect of $\lambda$ is method-dependent: for example, ReAct improves at higher $\lambda$, while SCALE worsens. Many methods also perform poorly at $\lambda=0.5$.
Figure S1: EuroSAT out-of-domain (FarOOD) FPR@95 by method across teacher-guidance strengths $\lambda$, averaged over 5 splits. The effect of $\lambda$ is method-dependent: for example, ReAct improves at higher $\lambda$, while SCALE worsens. Many methods also perform poorly at $\lambda=0.5$.
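
Figure 1 reports an effective-rank increase, but this listing does not state which definition of effective rank the paper uses. As an illustration only, the sketch below assumes the common entropy-based definition: the exponential of the Shannon entropy of the normalized eigenvalue spectrum of the feature covariance. The function name `effective_rank` and the toy spectra are hypothetical.

```python
import math


def effective_rank(eigenvalues):
    """Entropy-based effective rank of a covariance spectrum.

    Assumption: effective rank = exp(H(p)), where p_i = lambda_i / sum(lambda).
    The paper may use a different definition.
    """
    if any(v < 0 for v in eigenvalues):
        raise ValueError("covariance eigenvalues must be non-negative")
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("spectrum must have positive total variance")

    # Normalize to a probability distribution; zero eigenvalues contribute no entropy.
    probs = [v / total for v in eigenvalues if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


if __name__ == "__main__":
    collapsed = [10.0, 0.01, 0.01, 0.01]  # variance concentrated in one direction
    spread = [1.0, 1.0, 1.0, 1.0]         # variance spread evenly over four directions
    print(effective_rank(collapsed), effective_rank(spread))
```

Under this definition, a spectrum dominated by a few class directions yields a value near the number of dominant directions, while an evenly spread spectrum yields a value near the feature dimension.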

Theorems & Definitions (7)

theorem 1: Distance failure under variance--discriminability mismatch
lemma 1
proposition 1: MSP/Energy insensitivity
remark 1: Interpretation and tightness
theorem 2: Distance failure
proof
proof
