Tuning Universality in Deep Neural Networks

Arsham Ghavasieh

TL;DR

A stochastic theory of deep information propagation (DIP) is derived by incorporating Central Limit Theorem (CLT)-level fluctuations, and activation function design is shown to control the collective dynamics in random deep neural networks (DNNs).

Abstract

Deep neural networks (DNNs) exhibit crackling-like avalanches whose origin lacks a mechanistic explanation. Here, I derive a stochastic theory of deep information propagation (DIP) by incorporating Central Limit Theorem (CLT)-level fluctuations. Four effective couplings $(r, h, D_1, D_2)$ characterize the dynamics, yielding a Landau description of the static exponents and a Directed Percolation (DP) structure of activity cascades. Tuning the couplings selects between avalanche dynamics generated by a Brownian motion (BM) in a logarithmic trap and those generated by an absorbed free BM, each corresponding to a distinct universality class. Numerical simulations confirm the theory and demonstrate that activation function design controls the collective dynamics in random DNNs.
