Flow-Induced Diagonal Gaussian Processes

Moule Lin, Andrea Patane, Weipeng Jing, Shuhao Guan, Goetz Botterweck

TL;DR

FiD‑GP tackles unreliable predictive uncertainty in deep networks by combining a flow‑based variational posterior over a compact inducing‑weight matrix with a Gaussian process (GP) prior. The approach uses a Kronecker‑structured covariance and spectral regularisation to capture rich feature correlations while keeping inference efficient, and it introduces a single‑pass projection for Out‑of‑Distribution (OoD) detection with a spectral‑residual bound. Empirically, FiD‑GP achieves state‑of‑the‑art or competitive accuracy and uncertainty calibration across regression, image classification, and semantic segmentation, while compressing model parameters by roughly $51\%$ and storage by about $75\%$. The method yields near‑perfect OoD discrimination on several benchmarks, and it reduces Bayesian training cost and makes uncertainty estimation practical in resource‑constrained settings.
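
To illustrate why a Kronecker-structured covariance is cheap to store, the sketch below compares the number of entries in a dense covariance over an $R\times C$ inducing matrix with the number in a row factor plus a column factor, and checks on a tiny example the standard identity $\mathrm{vec}(AZB^{\top})=(A\otimes B)\,\mathrm{vec}(Z)$ for row-major flattening. The function names, the dense $R\times R$ and $C\times C$ factors, and the example matrices are assumptions made for illustration; the paper's exact parameterisation may differ. The $64\times64$ grid size is taken from the Figure 3 caption.

```python
def full_cov_entries(rows: int, cols: int) -> int:
    """Entries in a dense covariance over all rows*cols inducing weights."""
    n = rows * cols
    return n * n


def kron_cov_entries(rows: int, cols: int) -> int:
    """Entries in a (rows x rows) row factor plus a (cols x cols) column factor."""
    return rows * rows + cols * cols


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def flatten(m):
    """Row-major flattening of a matrix into a vector."""
    return [x for row in m for x in row]


# Hypothetical factors: A acts on rows, B acts on columns.
A = [[1.0, 0.0], [0.5, 2.0]]
B = [[1.0, 0.0, 0.0], [0.2, 1.0, 0.0], [0.0, 0.3, 1.5]]
Z = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0]]

# Two small factor products (left) versus one large Kronecker product (right).
lhs = flatten(matmul(matmul(A, Z), transpose(B)))
rhs = [sum(k * z for k, z in zip(row, flatten(Z))) for row in kron(A, B)]
max_gap = max(abs(l - r) for l, r in zip(lhs, rhs))

print(full_cov_entries(64, 64), kron_cov_entries(64, 64), max_gap)
```

Because the two sides agree, a sample with Kronecker covariance can be drawn by transforming a matrix of independent unit-variance entries with the two small factors, without ever forming the full covariance.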

Abstract

We present Flow-Induced Diagonal Gaussian Processes (FiD-GP), a compression framework that incorporates a compact inducing weight matrix to project a neural network's weight uncertainty into a lower-dimensional subspace. Critically, FiD-GP relies on normalising-flow priors and spectral regularisations to augment its expressiveness and align the inducing subspace with feature-gradient geometry through a numerically stable projection mechanism objective. Furthermore, we demonstrate how the prediction framework in FiD-GP can help to design a single-pass projection for Out-of-Distribution (OoD) detection. Our analysis shows that FiD-GP improves uncertainty estimation ability on various tasks compared with SVGP-based baselines, satisfies tight spectral residual bounds with theoretically guaranteed OoD detection, and significantly compresses the neural network's storage requirements at the cost of increased inference computation dependent on the number of inducing weights employed. Specifically, in a comprehensive empirical study spanning regression, image classification, semantic segmentation, and out-of-distribution detection benchmarks, it cuts Bayesian training cost by several orders of magnitude, compresses parameters by roughly 51%, reduces model size by about 75%, and matches state-of-the-art accuracy and uncertainty estimation.

Paper Structure

This paper contains 28 sections, 2 theorems, 81 equations, 6 figures, 7 tables, 2 algorithms.

Key Result

Lemma 1

Let Define the projector and let $g$ be a 1‑Lipschitz flow. Set If during training then strictly separating ID and OoD samples.
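
As a generic illustration of a projection-residual score (not the lemma's exact construction, whose formulas are not reproduced here), the sketch below assumes an orthogonal projector onto the span of a hypothetical basis and scores a feature vector by the norm of the component the projector leaves out. The basis vectors, the feature vectors, and all function names are invented for the example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalise(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors already inside the current span
            basis.append([wi / norm for wi in w])
    return basis


def projection_residual(x, basis):
    """Norm of x minus its orthogonal projection onto span(basis).

    `basis` must be orthonormal, e.g. the output of `orthonormalise`.
    """
    r = list(x)
    for q in basis:
        c = dot(x, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return math.sqrt(dot(r, r))


# Hypothetical subspace: the x-y plane inside 3-D feature space.
basis = orthonormalise([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])

inside = projection_residual([2.0, -1.0, 0.0], basis)   # lies in the subspace
outside = projection_residual([2.0, -1.0, 3.0], basis)  # has an off-subspace part

print(inside, outside)
```

Under this reading, a sample would be flagged as OoD when its residual exceeds a threshold; how that threshold is chosen is outside the scope of the sketch.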

Figures (6)

  • Figure 1: Distribution of predictive scores generated by the ResNet-18 model equipped with Sparse Variational Gaussian Processes (SVGP). In-distribution (ID) dataset is CIFAR-100, Out-of-Distribution (OoD) dataset is CIFAR-10.
  • Figure 2: Overview of our approach (FiD-GP): Flow-based conditional Gaussian with spectral control and projection residual.
  • Figure 3: Synthetic 1-D regression: true $f(x)=\cos(4x+0.8)$ (black); noisy training (blue) with error bars; test (orange) with error lines; FiD-GP mean (dashed blue) and $\pm3\sigma$ interval (shaded). Left: inducing grid $R\times C=64\times64$ (rows$\times$columns), $\lambda_{max}=0.08$, $\mathrm{prior\_sd}=0.5$. Right: $R\times C=32\times32$, $\lambda_{max}=0.05$, $\mathrm{prior\_sd}=0.3$.
  • Figure 4: Distributions of predictive scores from ResNet-18 + FiD-GP (Reparam, 4 layers) on CIFAR-100 (ID) and CIFAR-10 (OoD).
  • Figure 5: CamVid ground truth (left) and uncertainty map (right) generated by FCN-ResNet50 (Matheron, 4 layers).
  • ...and 1 more figure
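
The synthetic setup described in the Figure 3 caption can be approximated with a few lines of code. The sketch below uses the target function $f(x)=\cos(4x+0.8)$ from the caption; the input range, noise level, sample count, and seed are assumptions chosen for illustration, since the caption does not state them. The helper for the $\pm3\sigma$ band simply applies the interval definition to a given predictive mean and standard deviation.

```python
import math
import random


def true_fn(x: float) -> float:
    """Target function from the Figure 3 caption."""
    return math.cos(4.0 * x + 0.8)


def make_noisy_data(n, x_min, x_max, noise_sd, seed=0):
    """Draw n inputs uniformly and add Gaussian noise to the true targets.

    All arguments are hypothetical settings, not values from the paper.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(x_min, x_max) for _ in range(n)]
    ys = [true_fn(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def three_sigma_band(mean: float, sd: float):
    """Lower and upper edge of a mean +/- 3*sd predictive interval."""
    return mean - 3.0 * sd, mean + 3.0 * sd


xs, ys = make_noisy_data(n=20, x_min=-1.0, x_max=1.0, noise_sd=0.1)
lower, upper = three_sigma_band(mean=0.5, sd=0.1)
print(len(xs), lower, upper)
```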

Theorems & Definitions (3)

  • Definition 1: Normalising Flow with Spectral Normalisation for the Variational Posterior
  • Lemma 1: Spectral Residual Separation
  • Lemma 2: Conditional-Gaussian Preservation
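
Definition 1 pairs a normalising flow with spectral normalisation, and Lemma 1 assumes a 1‑Lipschitz flow. A common, generic way to bound the Lipschitz constant of a linear map $x\mapsto Wx$ is to rescale $W$ by its largest singular value, estimated with power iteration. The sketch below shows that technique in plain Python; it is not taken from the paper, and the function names, the `max_sigma` parameter, and the example matrix are hypothetical.

```python
import math


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def norm(v):
    return math.sqrt(sum(a * a for a in v))


def spectral_norm(w, iters=200):
    """Estimate the largest singular value of w by power iteration on w^T w.

    The all-ones start vector is an arbitrary choice; the estimate can be
    too low if that vector is orthogonal to the top right-singular vector.
    """
    wt = transpose(w)
    v = [1.0] * len(w[0])
    for _ in range(iters):
        v = matvec(wt, matvec(w, v))
        n = norm(v)
        if n == 0.0:
            return 0.0
        v = [a / n for a in v]
    return norm(matvec(w, v))


def spectrally_normalise(w, max_sigma=1.0):
    """Rescale w so its estimated spectral norm is at most max_sigma."""
    sigma = spectral_norm(w)
    scale = 1.0 if sigma <= max_sigma else max_sigma / sigma
    return [[a * scale for a in row] for row in w]


W = [[3.0, 0.0], [0.0, 1.0]]  # singular values 3 and 1
W_hat = spectrally_normalise(W)
print(spectral_norm(W), spectral_norm(W_hat))
```

Since power iteration only estimates the top singular value, the resulting Lipschitz bound is approximate unless the iteration has converged.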