An Analytic Solution to Covariance Propagation in Neural Networks

Oren Wright, Yorie Nakahira, José M. F. Moura

TL;DR

A sample-free moment propagation technique that propagates mean vectors and covariance matrices across a network to accurately characterize the input-output distributions of neural networks.

Abstract

Uncertainty quantification of neural networks is critical to measuring the reliability and robustness of deep learning systems. However, this often involves costly or inaccurate sampling methods and approximations. This paper presents a sample-free moment propagation technique that propagates mean vectors and covariance matrices across a network to accurately characterize the input-output distributions of neural networks. A key enabler of our technique is an analytic solution for the covariance of random variables passed through nonlinear activation functions, such as Heaviside, ReLU, and GELU. The wide applicability and merits of the proposed technique are shown in experiments analyzing the input-output distributions of trained neural networks and training Bayesian neural networks.
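
As a minimal sketch of the propagation idea described above, the affine part of a layer can be handled exactly: for $\mathbf{y} = \mathbf{W}\mathbf{x} + \mathbf{b}$ with input mean $\boldsymbol{\mu}$ and covariance $\mathbf{\Sigma}$, the output mean is $\mathbf{W}\boldsymbol{\mu} + \mathbf{b}$ and the output covariance is $\mathbf{W}\mathbf{\Sigma}\mathbf{W}^\top$. The function name `affine_moments` and the example values are illustrative assumptions, not taken from the paper. The nonlinear activation step, which is where the paper's analytic solution applies, is not reproduced here.

```python
import numpy as np


def affine_moments(mu, Sigma, W, b):
    """Propagate a mean vector and covariance matrix through y = W x + b.

    This is exact for any input distribution with finite second moments;
    it does not cover the nonlinear activation that follows the affine map.
    """
    mean_out = W @ mu + b
    cov_out = W @ Sigma @ W.T
    return mean_out, cov_out


# Hypothetical two-unit layer, chosen only for illustration.
W = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([0.0, 1.0])
mu = np.array([1.0, 1.0])
Sigma = np.eye(2)

mean_out, cov_out = affine_moments(mu, Sigma, W, b)
print(mean_out)
print(cov_out)
```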

Paper Structure

This paper contains 22 sections, 2 theorems, 49 equations, 6 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Suppose $\mathbf{y} \in \mathbb{R}^{n}$ is a multivariate Gaussian random vector, $\mathbf{y} \sim \mathcal{N}(\boldsymbol{\mu}, \mathbf{\Sigma})$, and $\mathbf{z} = \mathbf{g}(\mathbf{y})$ is an element-wise independent and identical function of $\mathbf{y}$. If the mean and variance of any input e...

Figures (6)

  • Figure 1: Moment propagation for a single network layer with an affine transformation followed by a nonlinear activation function. Layer-by-layer mean and covariance statistics can be used as a measure of uncertainty traveling across a network.
  • Figure 2: The covariance between a bivariate Gaussian passed through a ReLU, calculated numerically, plotted over input means $\mu_1$ and $\mu_2$, with fixed input parameters $\sigma_1 = 1, \sigma_2 = 1, \rho=0.5$.
  • Figure 3: The (a) maximum and (b) mean absolute error of Theorem 1 applied to a ReLU, by Taylor order. Error is determined by comparing against the numerical calculation depicted in Figure 2, over $\mu_1,\mu_2 \in \left[ -5, 5 \right]$.
  • Figure 4: Absolute error for (a) a fourth-order expansion of Theorem 1 applied to a ReLU and (b) the ReLU approximation proposed by Wu et al. (2019), plotted over input means $\mu_1$ and $\mu_2$. Error is determined by comparing against the numerical calculation depicted in Figure 2.
  • Figure 5: Absolute error of a fourth-order expansion of Theorem 1 applied to (a) a GELU and (b) an approximated sigmoid (Daunizeau, 2017), plotted over input means $\mu_1$ and $\mu_2$ and using the fixed input parameters of Figure 2.
  • ...and 1 more figure
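
The numerical reference used in Figure 2 can be approximated with a simple deterministic quadrature. The sketch below is one possible way to do it, not the authors' procedure: it integrates over a grid of standard normal variables with the trapezoid rule and returns the covariance of the two ReLU outputs. The function name `relu_cov_numeric`, the grid size, and the integration limits are assumptions made for illustration. The defaults $\sigma_1 = \sigma_2 = 1$ and $\rho = 0.5$ follow the Figure 2 caption.

```python
import numpy as np


def relu_cov_numeric(mu1, mu2, sigma1=1.0, sigma2=1.0, rho=0.5, n=801, lim=8.0):
    """Estimate Cov(ReLU(y1), ReLU(y2)) for a bivariate Gaussian (y1, y2).

    Uses a tensor-product trapezoid rule over two independent standard
    normal variables (u1, u2), which are then correlated to form (y1, y2).
    """
    u = np.linspace(-lim, lim, n)
    h = u[1] - u[0]

    # 1-D trapezoid weights multiplied by the standard normal density.
    w = np.full(n, h)
    w[0] = w[-1] = h / 2.0
    w = w * np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

    U1, U2 = np.meshgrid(u, u, indexing="ij")
    Wgt = np.outer(w, w)

    # Map independent standard normals to the correlated pair (y1, y2).
    y1 = mu1 + sigma1 * U1
    y2 = mu2 + sigma2 * (rho * U1 + np.sqrt(1.0 - rho**2) * U2)

    z1 = np.maximum(y1, 0.0)
    z2 = np.maximum(y2, 0.0)

    e1 = np.sum(Wgt * z1)
    e2 = np.sum(Wgt * z2)
    e12 = np.sum(Wgt * z1 * z2)
    return e12 - e1 * e2


# Evaluate at a few points of the (mu1, mu2) plane plotted in Figure 2.
for m1, m2 in [(-5.0, -5.0), (0.0, 0.0), (5.0, 5.0)]:
    print(m1, m2, relu_cov_numeric(m1, m2))
```

Two sanity checks follow from the setup. For large positive means the ReLU acts almost as the identity, so the result should approach $\rho\sigma_1\sigma_2$. For large negative means both outputs are almost always zero, so the covariance should approach zero.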

Theorems & Definitions (5)

  • Definition 1
  • Theorem 1
  • Corollary 1.1
  • proof
  • proof