Is Learning in Biological Neural Networks based on Stochastic Gradient Descent? An analysis using stochastic processes

Sören Christensen; Jan Kallsen

TL;DR

The paper investigates whether learning in biological neural networks (BNNs) can be understood as stochastic gradient descent (SGD). It analyzes the Schmidt-Hieber model, which is based on spike-timing-dependent plasticity (STDP), and shows that when a learning opportunity triggers many small updates (many spikes), the aggregated dynamics approximate a gradient flow, specifically $\frac{d\mathbf{Z}_t}{dt} = -\frac{2}{3}A^2 \alpha \nabla L(\mathbf{Z}_t)$. A key result provides a finite-sample convergence rate $\sqrt{\mathbb{E}(\sup_t \|Z^n_t - Z_t\|^2)} \le c \sqrt{d/n}$, indicating SGD-like optimization in BNNs under high spike density. The work thus reconciles biological plausibility with gradient-based learning, highlighting the role of spike count and stochasticity in enabling efficient learning without explicit backpropagation.

Abstract

In recent years, there has been an intense debate about how learning in biological neural networks (BNNs) differs from learning in artificial neural networks. It is often argued that the updating of connections in the brain relies only on local information, and therefore a stochastic gradient-descent type optimization method cannot be used. In this paper, we study a stochastic model for supervised learning in BNNs. We show that a (continuous) gradient step occurs approximately when each learning opportunity is processed by many local updates. This result suggests that stochastic gradient descent may indeed play a role in optimizing BNNs.

Paper Structure (3 sections, 2 theorems, 28 equations)

Introduction
The Schmidt-Hieber model for BNNs revisited
Multiple updates per learning opportunity

Key Result

Theorem 1

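Assume the paper's condition eq:rate and assumption ass:1. Then, for each fixed training sample $k$, the rescaled process $\mathbf{Z}^n$ of the BNN weights converges to the rescaled gradient process $\mathbf{Z}$ uniformly in $L^2$, i.e. $\mathbb{E}(\sup_t \|\mathbf{Z}^n_t - \mathbf{Z}_t\|^2) \to 0$ as $n \to \infty$. More specifically, $\sqrt{\mathbb{E}(\sup_t \|\mathbf{Z}^n_t - \mathbf{Z}_t\|^2)} \le c \sqrt{d/n}$ holds for some constant $c<\infty$ which depends only on $\alpha, A, \lambda$. Here $d$ denotes the number of edges $\nu=(i,j)$ in the network.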
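The $\sqrt{d/n}$ rate quoted in the TL;DR can be illustrated numerically. The following is a minimal sketch, not the paper's model: it assumes a one-dimensional quadratic loss $L(z) = z^2/2$ (so $d = 1$) and a hypothetical zeroth-order perturbation rule with $U$ uniform on $[-A, A]$, chosen only because its mean drift equals $-\frac{2}{3}A^2 \alpha \nabla L(z)$ for this loss. The function names, the default parameters, and the time horizon $[0, 1]$ are all assumptions made for illustration.

```python
import math
import random


def sup_error(n, alpha=1.0, A=1.0, z0=1.0, rng=None):
    """Sup-distance on [0, 1] between n small noisy updates and the gradient flow.

    Assumption: toy loss L(z) = z^2 / 2 and a hypothetical perturbation rule;
    neither is taken from the paper.
    """
    if rng is None:
        rng = random.Random(0)
    loss = lambda z: 0.5 * z * z
    rate = (2.0 / 3.0) * A ** 2 * alpha  # decay rate of the flow for this loss
    z, worst = z0, 0.0
    for k in range(1, n + 1):
        u = rng.uniform(-A, A)  # random local perturbation
        # Zeroth-order update: uses only loss differences, no explicit gradient.
        # For this loss, its expectation is -(alpha / n) * (2/3) * A^2 * z.
        z -= (alpha / n) * (loss(z + u) - loss(z)) * 2.0 * u
        flow = z0 * math.exp(-rate * k / n)  # exact gradient-flow solution at t = k/n
        worst = max(worst, abs(z - flow))
    return worst


def rms_sup_error(n, runs=200, seed=0):
    """Monte Carlo estimate of sqrt(E[sup_t |Z^n_t - Z_t|^2])."""
    rng = random.Random(seed)
    return math.sqrt(sum(sup_error(n, rng=rng) ** 2 for _ in range(runs)) / runs)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, rms_sup_error(n))
```

Under these assumptions, the printed error should shrink as $n$ grows, roughly in line with a $1/\sqrt{n}$ decay. This only shows the qualitative behavior for a toy loss and says nothing about the constant $c$ in the theorem.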

Theorems & Definitions (4)

Theorem 1
Proof of Theorem 1
Lemma 2
Proof of Lemma 2
