ReLU Networks as Random Functions: Their Distribution in Probability Space

Shreyas Chaudhari, José M. F. Moura

TL;DR

This work treats ReLU networks as conditional affine maps whose realized form depends on the random input distribution. By deriving the activation-pattern PMF via Gaussian orthant probabilities and showing that outputs form a mixture of truncated Gaussians, the authors provide explicit, numerically tractable expressions for both function- and output-level distributions under input uncertainty. They propose a sample-free activation-pattern support approximation to efficiently identify the most probable affine regions, enabling scalable robustness and reliability analyses. The approach is validated on synthetic moons data and real-world datasets (MNIST, Fashion-MNIST), with additional experiments on Jacobian spectra under noisy inputs, highlighting practical applicability to interpretability and uncertainty quantification in piecewise linear networks.
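The "conditional affine map" view can be illustrated with a minimal sketch. It assumes a one-hidden-layer network $f(\mathbf{x}) = W_2\,\mathrm{ReLU}(W_1\mathbf{x} + \mathbf{b}_1) + \mathbf{b}_2$ with made-up weights; the names `relu_forward` and `affine_for_pattern` are hypothetical and not taken from the paper. Once the binary activation pattern $\mathbf{z}$ is fixed, the network reduces to the affine map $A\mathbf{x} + \mathbf{c}$ with $A = W_2\,\mathrm{diag}(\mathbf{z})\,W_1$ and $\mathbf{c} = W_2\,\mathrm{diag}(\mathbf{z})\,\mathbf{b}_1 + \mathbf{b}_2$.

```python
# Illustrative sketch only: toy weights, one hidden layer, pure Python.

def relu_forward(W1, b1, W2, b2, x):
    """Run the network and return (output, activation pattern)."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    hidden = [max(0.0, p) for p in pre]
    out = [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, b2)]
    pattern = tuple(1 if p > 0 else 0 for p in pre)  # 1 = neuron active
    return out, pattern

def affine_for_pattern(W1, b1, W2, b2, pattern):
    """Affine map (A, c) realized by the network on the region with this pattern."""
    n_in = len(W1[0])
    n_hidden = len(pattern)
    # A = W2 diag(z) W1
    A = [[sum(row[k] * pattern[k] * W1[k][j] for k in range(n_hidden))
          for j in range(n_in)] for row in W2]
    # c = W2 diag(z) b1 + b2
    c = [sum(row[k] * pattern[k] * b1[k] for k in range(n_hidden)) + b
         for row, b in zip(W2, b2)]
    return A, c

# Hypothetical toy network: 2 inputs, 3 hidden units, 1 output.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]
b1 = [0.0, -1.0, 0.5]
W2 = [[1.0, -2.0, 0.5]]
b2 = [0.25]

x = [1.0, 0.5]
out, pattern = relu_forward(W1, b1, W2, b2, x)
A, c = affine_for_pattern(W1, b1, W2, b2, pattern)
affine_out = [sum(a * xi for a, xi in zip(row, x)) + ci for row, ci in zip(A, c)]

print(pattern)          # activation pattern induced by x
print(out, affine_out)  # forward pass and affine map agree on this region
```

When the input is random, the pattern, and therefore the pair $(A, \mathbf{c})$, becomes a discrete random variable, which is the object whose distribution the paper characterizes.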

Abstract

This paper presents a novel framework for understanding trained ReLU networks as random, affine functions, where the randomness is induced by the distribution over the inputs. By characterizing the probability distribution of the network's activation patterns, we derive the discrete probability distribution over the affine functions realizable by the network. We extend this analysis to describe the probability distribution of the network's outputs. Our approach provides explicit, numerically tractable expressions for these distributions in terms of Gaussian orthant probabilities. Additionally, we develop approximation techniques to identify the support of affine functions a trained ReLU network can realize for a given distribution of inputs. Our work provides a framework for understanding the behavior and performance of ReLU networks corresponding to stochastic inputs, paving the way for more interpretable and reliable models.
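As a rough illustration of a distribution over activation patterns, the sketch below uses a deliberately tiny setting that is an assumption of this example, not the paper's experiment: a scalar Gaussian input $x \sim \mathcal{N}(\mu, \sigma^2)$ and two hidden units with pre-activations $x$ and $1 - x$. In one dimension each pattern corresponds to an interval, so its probability follows from the normal CDF; in higher dimensions the analogous quantities are the Gaussian orthant probabilities the abstract refers to. A Monte Carlo estimate is included as a sanity check.

```python
import math
import random

def normal_cdf(t):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# Assumed toy setting: x ~ N(mu, sigma^2); hidden pre-activations are x and 1 - x.
mu, sigma = 0.3, 0.8

def pattern(x):
    """Binary activation pattern of the two hidden units at input x."""
    return (1 if x > 0 else 0, 1 if 1.0 - x > 0 else 0)

# Exact PMF: each pattern is an interval of the real line.
p_le_0 = normal_cdf((0.0 - mu) / sigma)   # P(x <= 0)
p_lt_1 = normal_cdf((1.0 - mu) / sigma)   # P(x < 1)
exact = {
    (0, 1): p_le_0,            # x <= 0: only the second unit is active
    (1, 1): p_lt_1 - p_le_0,   # 0 < x < 1: both units are active
    (1, 0): 1.0 - p_lt_1,      # x >= 1: only the first unit is active
    (0, 0): 0.0,               # would need x <= 0 and x >= 1: empty region
}

# Monte Carlo estimate with a fixed seed for reproducibility.
rng = random.Random(0)
n = 100_000
counts = {k: 0 for k in exact}
for _ in range(n):
    counts[pattern(rng.gauss(mu, sigma))] += 1
mc = {k: v / n for k, v in counts.items()}

for k in sorted(exact):
    print(k, round(exact[k], 4), round(mc[k], 4))
```

The empty pattern `(0, 0)` shows why the support of realizable affine functions can be much smaller than the set of all binary patterns.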

Paper Structure

This paper contains 15 sections, 4 theorems, 27 equations, 3 figures, 3 tables, 3 algorithms.

Key Result

Lemma 1

An input $\mathbf{x}$ induces activation pattern $\boldsymbol{\zeta}'$ if and only if it is in the convex polytope:

Figures (3)

  • Figure 1: Effect of Gaussian noise on affine function distribution and output distribution. Blue lines are the numerically computed expressions and red lines are Monte Carlo estimates. [Left Column]: Gaussian distribution input to the network. [Middle Column]: Resulting distribution over activation patterns (affine functions). Each binary activation pattern is denoted by its decimal representation. [Right Column]: CDF of the network output.
  • Figure 2: PMF of affine functions and CDF of outputs corresponding to Gaussian mixture models fit on: [Left 2] blue class and [Right 2] orange class.
  • Figure 3: Singular value distributions. Blue histogram is for numerically computed expression and orange histogram results from Monte Carlo simulation.

Theorems & Definitions (9)

  • Definition 1
  • Lemma 1
  • Definition 2
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof