Dataset-learning duality and emergent criticality

Ekaterina Kukleva; Vitaly Vanchurin

TL;DR

This work introduces a dataset-learning duality, a bulk-boundary mapping between non-trainable boundary states and the tangent space of trainable variables in neural networks, developed through activation and learning passes. By analyzing a local learning equilibrium and employing a probabilistic framework, it provides a Jacobian-based description of how boundary data induce fluctuations in trainable parameters, enabling a microscopic view of emergent criticality. The authors show that specific compositions of activation and loss functions can generate power-law fluctuations in the trainable variables, even when the dataset is non-critical, with analytical forms and supporting numerical experiments on a two-neuron toy model and a two-class dataset. This mechanism for scale-invariant learning dynamics suggests tunable routes to control criticality via activation or loss design and offers potential insights into critical phenomena in physical and biological systems.

Abstract

In artificial neural networks, the activation dynamics of non-trainable variables is strongly coupled to the learning dynamics of trainable variables. During the activation pass, the boundary neurons (e.g., input neurons) are mapped to the bulk neurons (e.g., hidden neurons), and during the learning pass, both bulk and boundary neurons are mapped to changes in trainable variables (e.g., weights and biases). For example, in feed-forward neural networks, forward propagation is the activation pass and backward propagation is the learning pass. We show that a composition of the two maps establishes a duality map between a subspace of non-trainable boundary variables (e.g., dataset) and a tangent subspace of trainable variables (i.e., learning). In general, the dataset-learning duality is a complex non-linear map between high-dimensional spaces. We use the duality to study the emergence of criticality, i.e., a power-law distribution of fluctuations of the trainable variables, using a toy model at learning equilibrium. In particular, we show that criticality can emerge in the learning system even from a dataset in a non-critical state, and that the power-law distribution can be modified by changing either the activation function or the loss function.
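As a rough illustration of the composition described in the abstract, the sketch below chains an activation pass and a learning pass for a single sigmoid neuron with a mean squared loss. It is not the paper's toy model: the one-neuron architecture, the function names (`activation_pass`, `learning_pass`, `duality_map`), and the choice to represent the change in trainable variables by the loss gradient are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def activation_pass(x, w, b):
    """Map the boundary input x to the neuron output y (forward propagation)."""
    return sigmoid(w * x + b)


def learning_pass(x, y, target):
    """Map boundary data (x, target) and the output y to the loss gradient.

    Assumed loss: L = 0.5 * (y - target)**2, so by the chain rule
    dL/dz = (y - target) * y * (1 - y), with z = w * x + b.
    """
    delta = (y - target) * y * (1.0 - y)
    return delta * x, delta  # (dL/dw, dL/db)


def duality_map(x, target, w, b):
    """Compose the two passes: boundary data -> tangent vector of (w, b)."""
    y = activation_pass(x, w, b)
    return learning_pass(x, y, target)


# One data point mapped to one direction in the space of trainable variables.
dw, db = duality_map(x=1.0, target=1.0, w=0.0, b=0.0)
print(dw, db)
```

Sampling `x` and `target` from a dataset distribution and collecting the resulting `(dw, db)` pairs would give an empirical distribution of fluctuations in this hypothetical setting; whether it follows a power law depends on the activation and loss chosen.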

Paper Structure (12 sections, 73 equations, 5 figures)

Introduction
Neural networks
Dataset-learning duality
Distribution of fluctuations
Toy model
Emergent criticality
Numerical results
Exponential composition, $k=1$
Power-law composition, $k=0; \frac{2}{3}$
Logarithmic composition, $k=2$
Discussion
Approximation of an integral

Figures (5)

Figure 1: Distribution of fluctuations for composition of sigmoid activation and mean squared loss functions.
Figure 2: Distribution of fluctuations for composition of sigmoid activation and cross entropy loss functions.
Figure 3: Distribution of fluctuations for composition of sigmoid activation and cross entropy loss functions.
Figure 4: Distribution of fluctuations for composition of sigmoid activation and cross entropy loss functions.
Figure 5: Distribution of fluctuations for composition of sigmoid activation and cross entropy loss functions.
