Geometrically Constrained Outlier Synthesis

Daniil Karzanov, Marcin Detyniecki

TL;DR

Geometrically Constrained Outlier Synthesis (GCOS) is a training-time regularization framework aimed at improving out-of-distribution (OOD) robustness during inference. It transitions naturally to conformal OOD inference, which translates uncertainty scores into statistically valid p-values and enables thresholds with formal error guarantees, providing a pathway toward more predictable and reliable OOD detection.
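To make the p-value idea concrete, here is a minimal sketch of the standard split-conformal p-value, not necessarily the exact procedure used in the paper. It assumes higher scores mean "more OOD" and that the calibration scores come from held-out in-distribution data; the function name, the toy scores, and the level `alpha` are illustrative.

```python
def conformal_p_value(test_score, calib_scores):
    """Split-conformal p-value: the rank of the test score among the
    calibration scores, with the usual +1 finite-sample correction."""
    n = len(calib_scores)
    at_least_as_extreme = sum(1 for s in calib_scores if s >= test_score)
    return (1 + at_least_as_extreme) / (n + 1)


# Toy in-distribution calibration scores (hypothetical values).
calib = [0.2, 0.5, 0.9, 1.3, 1.7, 2.1, 2.4, 3.0, 3.6]

alpha = 0.1  # target false-alarm rate on in-distribution inputs (assumed)
p_typical = conformal_p_value(2.1, calib)   # score inside the calibration range
p_extreme = conformal_p_value(10.0, calib)  # score above every calibration score

# Flag a sample as OOD when its p-value is at most alpha.
flag_typical = p_typical <= alpha
flag_extreme = p_extreme <= alpha
```

If the test sample is in-distribution and exchangeable with the calibration samples, the probability that its p-value is at most `alpha` is itself at most `alpha`, which is the sense in which the threshold carries a formal error guarantee.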

Abstract

Deep neural networks for image classification often exhibit overconfidence on out-of-distribution (OOD) samples. To address this, we introduce Geometrically Constrained Outlier Synthesis (GCOS), a training-time regularization framework aimed at improving OOD robustness during inference. GCOS addresses a limitation of prior synthesis methods by generating virtual outliers in the hidden feature space that respect the learned manifold structure of in-distribution (ID) data. The synthesis proceeds in two stages: (i) a dominant-variance subspace extracted from the training features identifies geometrically informed, off-manifold directions; (ii) a conformally inspired shell, defined by the empirical quantiles of a nonconformity score from a calibration set, adaptively controls the synthesis magnitude to produce boundary samples. The shell ensures that generated outliers are neither trivially detectable nor indistinguishable from in-distribution data, facilitating smoother learning of robust features. This is combined with a contrastive regularization objective that promotes separability of ID and OOD samples in a chosen score space, such as a Mahalanobis or energy-based score. Experiments demonstrate that GCOS outperforms state-of-the-art methods using standard energy-based inference on near-OOD benchmarks, defined as tasks where outliers share the same semantic domain as in-distribution data. As an exploratory extension, the framework naturally transitions to conformal OOD inference, which translates uncertainty scores into statistically valid p-values and enables thresholds with formal error guarantees, providing a pathway toward more predictable and reliable OOD detection.
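The two synthesis stages can be sketched for a single class as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: off-manifold directions are taken to be the low-variance eigenvectors of the class covariance, the nonconformity score is the Mahalanobis distance, and the shell is the band between the 0.95 and 0.99 calibration quantiles. The function names, the subspace size `k`, the toy data, and the quantile levels are all hypothetical.

```python
import numpy as np


def fit_gaussian_subspace(feats, k):
    """Class mean, precision matrix, and a basis of the low-variance
    (off-manifold) directions left after removing the k dominant ones."""
    mu = feats.mean(axis=0)
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues come back in ascending order
    off_manifold = eigvecs[:, :-k]     # drop the k largest-variance directions
    return mu, np.linalg.inv(cov), off_manifold


def mahalanobis(x, mu, prec):
    """Mahalanobis distance of one vector or of each row of a matrix."""
    d = x - mu
    return np.sqrt(np.einsum("...i,ij,...j->...", d, prec, d))


def synthesize(z, mu, prec, off_manifold, radius, rng):
    """Move z along a random off-manifold direction until its Mahalanobis
    distance equals `radius` (solves a quadratic for the step size)."""
    v = off_manifold @ rng.standard_normal(off_manifold.shape[1])
    v /= np.linalg.norm(v)
    d = z - mu
    a = v @ prec @ v
    b = v @ prec @ d
    c = d @ prec @ d - radius**2
    if c >= 0:                          # already on or outside the shell
        return z.copy()
    t = (-b + np.sqrt(b * b - a * c)) / a   # positive root of a t^2 + 2 b t + c = 0
    return z + t * v


rng = np.random.default_rng(0)
scales = np.array([3.0, 2.0, 0.3, 0.2])            # two dominant, two minor axes
train = rng.standard_normal((500, 4)) * scales     # stand-in for training features
calib = rng.standard_normal((200, 4)) * scales     # stand-in for calibration features

# Stage (i): dominant-variance subspace and its off-manifold complement.
mu, prec, off = fit_gaussian_subspace(train, k=2)

# Stage (ii): shell radii from empirical quantiles of the calibration scores.
q_lo, q_hi = np.quantile(mahalanobis(calib, mu, prec), [0.95, 0.99])
radius = rng.uniform(q_lo, q_hi)

# Start from the training feature closest to the class mean.
seed = train[np.argmin(mahalanobis(train, mu, prec))]
z_ood = synthesize(seed, mu, prec, off, radius, rng)
```

Sampling the radius inside the quantile band is what keeps the synthetic point near the boundary: far enough out to score as atypical, but not so far that it becomes trivially detectable.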

Paper Structure

This paper contains 29 sections, 7 equations, 4 figures, 10 tables, and 3 algorithms.

Figures (4)

  • Figure 1: GCOS Training Procedure Schematic. Illustration of the data flow in our online synthesis and regularization method. Epoch-level calibration on $\mathcal{D}_{calib}$ produces class-conditional subspace models $\mathcal{M}_{calib}$ and Mahalanobis quantiles $q$. During batch-level training, features are used to update a queue that generates proposer subspace models $\mathcal{M}_{train}$ and identifies off-manifold directions $v$. Outliers $\mathbf{z}_{ood}$ are synthesized to match the target quantiles $q$, as evaluated by $\mathcal{M}_{calib}$. The final regularization loss, $\mathcal{L}_{reg}$, is a contrastive objective computed on the energy scores of the in-distribution batch features and the synthesized outliers $\mathbf{z}_{ood}$; it is added to the cross-entropy loss, $\mathcal{L}_{CE}$, from the main classification task.
  • Figure 2: UMAP Projection of Learned Features. Top: the overall feature space, showing that classes form varying shapes and are largely well separated. Bottom: a zoomed-in view of two clusters, highlighting GCOS outliers in off-manifold regions and VOS outliers near cluster edges. The panels illustrate how GCOS generates points in challenging regions beyond the main clusters, while VOS outliers remain close to class boundaries.
  • Figure 3: Example images from four datasets. First row in each box: in-distribution; second row: outliers.
  • Figure 4: Hyperparameter ablation study.
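
The energy-based regularization described in the Figure 1 caption can be illustrated with a short sketch. The energy score below is the standard negative log-sum-exp of the logits. The exact form of $\mathcal{L}_{reg}$ is not given in this excerpt, so the pairwise hinge with a margin is an assumption chosen for illustration, as are the function names, the toy logits, and the weight `lam`.

```python
import numpy as np


def energy(logits, temperature=1.0):
    """Energy score E = -T * logsumexp(logits / T); lower means more ID-like."""
    z = logits / temperature
    m = z.max(axis=-1, keepdims=True)               # subtract max for stability
    lse = m.squeeze(-1) + np.log(np.exp(z - m).sum(axis=-1))
    return -temperature * lse


def margin_reg(e_id, e_ood, margin=1.0):
    """Assumed contrastive form: penalize every (ID, outlier) pair in which
    the outlier energy is not at least `margin` above the ID energy."""
    gaps = e_id[:, None] + margin - e_ood[None, :]
    return np.maximum(gaps, 0.0).mean()


# Toy logits: confident predictions for ID features, flat ones for outliers.
logits_id = np.array([[8.0, 0.0, 0.0], [0.0, 7.0, 0.0]])
logits_ood = np.zeros((2, 3))

e_id, e_ood = energy(logits_id), energy(logits_ood)
loss_good = margin_reg(e_id, e_ood)   # energies already well separated
loss_bad = margin_reg(e_ood, e_id)    # roles swapped: separation violated

lam = 0.1                             # hypothetical weight on the regularizer
ce_loss = 0.5                         # placeholder for the cross-entropy term
total = ce_loss + lam * loss_good
```

In a real training loop the energies would be computed with an autodiff framework so that gradients flow back into the feature extractor; NumPy is used here only to keep the arithmetic easy to inspect.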