Learning by Steering the Neural Dynamics: A Statistical Mechanics Perspective

Mattia Scardecchia

TL;DR

This work develops a statistical mechanics view of learning in fully local, distributed recurrent networks with asymmetric couplings. It reveals a phase transition in the fixed-point structure as the self-coupling $J_D$ increases: below a critical value, isolated fixed points coexist with narrow clusters (OGP-like), while above it, dense, extensive clusters emerge and become accessible to local dynamics and fBP. Building on these insights, the thesis proposes a biologically plausible learning algorithm that maps inputs to fixed points through local plasticity. The algorithm learns an entangled version of MNIST, uses depth to increase hetero-association capacity, and applies to several architectures. The results connect algorithmic performance to the phase structure of the fixed-point landscape and suggest a cortex-inspired alternative to self-couplings, opening avenues for scalable, energy-efficient, gradient-free learning in neural networks, with future work including spiking dynamics and neuroscience-inspired inductive biases.

Abstract

Despite the striking successes of deep neural networks trained with gradient-based optimization, these methods differ fundamentally from their biological counterparts. This gap raises key questions about how nature achieves robust, sample-efficient learning at minimal energy costs and solves the credit-assignment problem without backpropagation. We take a step toward bridging contemporary AI and computational neuroscience by studying how neural dynamics can support fully local, distributed learning that scales to simple machine-learning benchmarks. Using tools from statistical mechanics, we identify conditions for the emergence of robust dynamical attractors in random asymmetric recurrent networks. We derive a closed-form expression for the number of fixed points as a function of self-coupling strength, and we reveal a phase transition in their structure: below a critical self-coupling, isolated fixed points coexist with exponentially many narrow clusters showing the overlap-gap property; above it, subdominant yet dense and extensive clusters appear. These fixed points become accessible, including to a simple asynchronous dynamical rule, after an algorithm-dependent self-coupling threshold. Building on this analysis, we propose a biologically plausible algorithm for supervised learning with any binary recurrent network. Inputs are mapped to fixed points of the dynamics, by relaxing under transient external stimuli and stabilizing the resulting configurations via local plasticity. We show that our algorithm can learn an entangled version of MNIST, leverages depth to develop hierarchical representations and increase hetero-association capacity, and is applicable to several architectures. Finally, we highlight the strong connection between algorithm performance and the unveiled phase transition, and we suggest a cortex-inspired alternative to self-couplings for its emergence.
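To make the notion of a fixed point of the dynamics concrete, the following is a minimal sketch, not the paper's implementation. It assumes binary states $s_i \in \{-1, +1\}$, asymmetric Gaussian couplings $J_{ij} \sim \mathcal{N}(0, 1/N)$ with a uniform diagonal $J_{ii} = J_D$, and an asynchronous sign update; the coupling scaling, the update rule, and all function names are illustrative assumptions that do not appear in this excerpt.

```python
import random

def random_couplings(n, j_d, rng):
    """Asymmetric couplings J_ij ~ N(0, 1/n) (assumed scaling), with J_ii = j_d."""
    scale = n ** -0.5
    J = [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        J[i][i] = j_d  # self-coupling on the diagonal
    return J

def local_field(J, s, i):
    """Field on neuron i, including its self-coupling term."""
    return sum(J_ij * s_j for J_ij, s_j in zip(J[i], s))

def is_fixed_point(J, s):
    """A state is a fixed point if every spin is aligned with its local field."""
    return all(s[i] * local_field(J, s, i) > 0 for i in range(len(s)))

def relax(J, s, rng, max_sweeps=1000):
    """Asynchronous sign dynamics (assumed rule): update one neuron at a time,
    in random order. Returns (state, sweeps_used, converged)."""
    s = list(s)
    n = len(s)
    for sweep in range(max_sweeps):
        if is_fixed_point(J, s):
            return s, sweep, True
        for i in rng.sample(range(n), n):
            s[i] = 1 if local_field(J, s, i) > 0 else -1
    return s, max_sweeps, is_fixed_point(J, s)

if __name__ == "__main__":
    rng = random.Random(0)
    n = 50
    s0 = [rng.choice((-1, 1)) for _ in range(n)]
    for j_d in (0.0, 0.5, 1.0, 100.0):
        J = random_couplings(n, j_d, rng)
        _, sweeps, converged = relax(J, s0, rng, max_sweeps=200)
        print(f"J_D={j_d}: converged={converged} after {sweeps} sweeps")
```

In this sketch a very large `j_d` makes every configuration a fixed point, because the self-coupling term dominates each local field. Whether and how quickly the dynamics converge at intermediate values is what the analysis in the paper addresses; the sketch makes no claim about it.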

Paper Structure

This paper contains 68 sections, 110 equations, 13 figures, 1 table, 2 algorithms.

Figures (13)

  • Figure 1: Solution of the saddle point equations for $q$ and $r$ in the RS ansatz. Fixed-point iteration with damping coefficient $\alpha = 0.9$. Iteration is terminated when the relative change in the order parameters is less than $10^{-6}$. Left: $q^*$ as a function of $\beta$ and $J_D$. Right: $r^*$ as a function of $\beta$ and $J_D$.
  • Figure 2: Entropy density and energy density as a function of $J_D$ for several values of $\beta$.
  • Figure 3: Logarithm of the number of fixed points, normalized by $N$, as a function of $J_D$.
  • Figure 4: Local entropy curves $S_I(q_1, m^*)$ as a function of the distance $\frac{1 - q_1}{2}$ for different values of the self-interaction strength $J_D$. $m^*$ is chosen to satisfy the zero-complexity criterion. The steep dotted line corresponds to the limit $J_D \to \infty$, in which all states are fixed points. Dotted parts of the curves are regions of numerical instability. The curves stop being monotonic for $J_D$ between 0.07 and 0.1.
  • Figure 5: Number of iterations needed by the dynamics of Eq. \ref{eq:simple_dynamics} to converge to a fixed point as a function of the size $N$ of the network, for different values of the self-interaction strength $J_D$. For pairs of values $(N, J_D)$ for which we do not observe convergence, we do not plot the number of iterations.
  • ...and 8 more figures
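
The numerical procedure named in the Figure 1 caption, a damped fixed-point iteration stopped on a relative-change criterion, can be sketched as follows. This is an illustration under stated assumptions: the saddle point equations themselves are not given in this excerpt, so a toy map $x \mapsto \cos x$ stands in for them, and the damping convention (the new iterate keeps a fraction $\alpha$ of the old one) is a guess rather than the paper's stated choice. The function name and its parameters are hypothetical.

```python
import math

def damped_fixed_point(f, x0, alpha=0.9, tol=1e-6, max_iter=10_000):
    """Iterate x <- alpha * x + (1 - alpha) * f(x) until the largest relative
    change across components drops below tol. Returns (x, iterations)."""
    x = list(x0)
    for it in range(1, max_iter + 1):
        fx = f(x)
        x_new = [alpha * a + (1.0 - alpha) * b for a, b in zip(x, fx)]
        # Relative change per component, guarded against division by zero.
        change = max(abs(b - a) / max(abs(a), 1e-12) for a, b in zip(x, x_new))
        x = x_new
        if change < tol:
            return x, it
    raise RuntimeError("damped iteration did not converge")

if __name__ == "__main__":
    # Toy stand-in for the saddle point equations: solve x = cos(x).
    x_star, iterations = damped_fixed_point(lambda x: [math.cos(x[0])], [1.0])
    print(x_star[0], iterations)
```

Heavy damping slows convergence but stabilizes iterations that would otherwise oscillate. Note that a small relative change between iterates bounds the step size, not the distance to the true solution.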