Simulating Posterior Bayesian Neural Networks with Dependent Weights

Nicola Apollonio, Giovanni Franzina, Giovanni Luca Torrisi

Abstract

We consider fully connected feedforward deep neural networks with dependent and possibly heavy-tailed weights, as introduced in Lee et al. 2023 to address limitations of the standard Gaussian prior. Lee et al. 2023 proved that, as the number of nodes in the hidden layers grows large according to a sequential and ordered limit, the law of the output converges weakly to a Gaussian mixture. Among our results, we present sufficient conditions on the model parameters (the activation function and the associated Lévy measures) which ensure that the sequential limit is independent of the order. Next, we study the neural network through the lens of the posterior distribution with a Gaussian likelihood. If the random covariance matrix of the infinite-width limit is positive definite under the prior, we identify the posterior distribution of the output in the infinite-width limit according to a sequential regime. Remarkably, we provide mild sufficient conditions that ensure this positive definiteness, and hence the invertibility, of the random covariance matrix under the prior, thereby extending the results in L. Carvalho et al. 2025. We illustrate our findings using numerical simulations.

Paper Structure

This paper contains 20 sections, 13 theorems, 138 equations, 2 figures.

Key Result

Lemma 3.1

Let $a_1,\ldots,a_n\in\mathbb R^p$, with $n\geq p$, be $p$-dimensional (column) vectors and let $c_1,\ldots,c_n$ be arbitrarily chosen positive numbers. Then the $p\times p$ matrix $\sum_{i=1}^{n} c_i a_i a_i^\top$ is positive definite if and only if the $p\times n$ matrix $(a_1,\ldots,a_n)$ has ran
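The statement above is cut off in the source. The following is a minimal sketch that assumes the truncated condition is that the $p\times n$ matrix $(a_1,\ldots,a_n)$ has rank $p$, which is the standard linear algebra fact behind $\sum_i c_i a_i a_i^\top = A\,\mathrm{diag}(c)\,A^\top$. The helper names (`weighted_gram`, `rank`, `is_positive_definite`) and the example vectors and weights are illustrative choices, not taken from the paper. Exact rational arithmetic is used so that the check does not depend on floating-point tolerances.

```python
from fractions import Fraction


def weighted_gram(vectors, weights):
    """Return sum_i c_i * a_i a_i^T as a p x p list of Fractions."""
    p = len(vectors[0])
    m = [[Fraction(0)] * p for _ in range(p)]
    for a, c in zip(vectors, weights):
        for r in range(p):
            for s in range(p):
                m[r][s] += Fraction(c) * a[r] * a[s]
    return m


def rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination.

    The rank of the n x p matrix with rows a_i equals the rank of the
    p x n matrix (a_1, ..., a_n), so the vectors can be passed as rows.
    """
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


def is_positive_definite(sym):
    """A symmetric matrix is positive definite iff all elimination pivots are > 0."""
    m = [row[:] for row in sym]
    p = len(m)
    for k in range(p):
        if m[k][k] <= 0:
            return False
        for r in range(k + 1, p):
            factor = m[r][k] / m[k][k]
            m[r] = [x - factor * y for x, y in zip(m[r], m[k])]
    return True


# Hypothetical example with p = 2 and n = 3.
weights = [Fraction(2), Fraction(1, 2), Fraction(3)]  # arbitrary positive c_i
spanning = [[1, 0], [0, 1], [1, 1]]       # the a_i span R^2
collinear = [[1, 2], [2, 4], [-1, -2]]    # all a_i lie on one line

for name, vecs in [("spanning", spanning), ("collinear", collinear)]:
    gram = weighted_gram(vecs, weights)
    print(name, "rank:", rank(vecs), "positive definite:", is_positive_definite(gram))
```

In this sketch the positive weights only rescale the rank-one terms, so changing `weights` to any other positive values should not change either verdict; only the span of the vectors matters.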

Figures (2)

  • Figure 1: Simulation of Model 1. The non-green curves are estimates of the one-dimensional marginal distribution functions of the posterior law of the $3$-variate output $Z^{(3)}(\mathbf{x})$, with $n_1=n$ nodes in the first hidden layer and $n_2=2n$ nodes in the second hidden layer, for $n=4,8,16,32$. The green curves are the one-dimensional marginal distribution functions of the $3$-variate infinite-width limit $G^{(3)}(\mathbf{x})$. All parameters are specified in Section \ref{ss:model1}.
  • Figure 2: Simulation of Model 2. The non-green curves are estimates of the one-dimensional marginal distribution functions of the posterior law of the $3$-variate output $Z^{(2)}(\mathbf{x})$, with $n_1=n$ nodes in the hidden layer, for $n=2,4,8,16,32$. The green curves are the one-dimensional marginal distribution functions of the $3$-variate infinite-width limit $G^{(2)}(\mathbf{x})$. All parameters are specified in Section \ref{ss:model2}.

Theorems & Definitions (26)

  • Lemma 3.1
  • Theorem 4.1
  • Theorem 4.2
  • Remark 4.3
  • Theorem 5.1
  • Remark 5.2
  • Remark 5.3
  • Remark 5.4
  • Lemma 5.5
  • proof
  • ...and 16 more