Unlocked Backpropagation using Wave Scattering

Christian Pehle, Jean-Jacques Slotine

TL;DR

This work reframes backpropagation as a hyperbolic relaxation of the Pontryagin Maximum Principle by introducing an optimization-time dimension $\tau$, turning the training problem into a worldsheet with finite propagation speed $c$ and counter-propagating waves. The authors develop a discretized worldsheet algorithm with impedance-matched layer junctions and wave-residual sources, enabling fully unlocked, local updates that require only nearest-neighbor communication. A Minimal Reflection principle shows that parameter updates arise from boundary-condition matching, identifying gradient descent, Newton, and momentum as special cases of impedance matching and energy dissipation, and offering a physical intuition for optimizer behavior. The framework connects to a broad spectrum of related ideas, from parallel-in-time methods and brain-inspired learning to passivity and analog optimization, while pointing to practical implications for parallel hardware and potential neuromorphic/substrate implementations. Overall, the approach provides a principled route to asynchronous, locally computable optimization grounded in wave dynamics, with clear theoretical interpretations and guidance for future numerical and hardware explorations.

Abstract

Both the backpropagation algorithm in machine learning and the maximum principle in optimal control theory are posed as a two-point boundary problem, resulting in a "forward-backward" lock. We derive a reformulation of the maximum principle in optimal control theory as a hyperbolic initial value problem by introducing an additional "optimization time" dimension. We introduce counter-propagating wave variables with finite propagation speed and recast the optimization problem in terms of scattering relationships between them. This relaxation of the original problem can be interpreted as a physical system that equilibrates and changes its physical properties in order to minimize reflections. We discretize this continuum theory to derive a family of fully unlocked algorithms suitable for training neural networks. Different parameter dynamics, including gradient descent, can be derived by demanding dissipation and minimization of reflections at parameter ports. These results also imply that any physical substrate that supports the scattering and dissipation of waves can be interpreted as solving an optimization problem.
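
The phrase "minimize reflections" can be made concrete with the textbook scattering relation at a junction between two media. The sketch below is a generic transmission-line illustration, not the paper's wave variables: the function names, the impedances `z_source` and `z_load`, and the power convention `amplitude**2 / impedance` are all assumptions made for this example.

```python
def reflection_coefficient(z_source, z_load):
    """Amplitude reflection coefficient for a wave travelling from a medium
    of impedance z_source into a medium of impedance z_load (textbook form)."""
    return (z_load - z_source) / (z_load + z_source)


def scatter(incident, z_source, z_load):
    """Split an incident amplitude into reflected and transmitted parts."""
    gamma = reflection_coefficient(z_source, z_load)
    reflected = gamma * incident
    transmitted = (1.0 + gamma) * incident
    return reflected, transmitted


def power(amplitude, impedance):
    """Power carried by a wave, up to a constant factor (assumed convention)."""
    return amplitude ** 2 / impedance


# Mismatched junction: part of the wave is reflected back.
r_mis, t_mis = scatter(1.0, z_source=50.0, z_load=150.0)

# Matched junction: nothing is reflected, everything is transmitted.
r_match, t_match = scatter(1.0, z_source=50.0, z_load=50.0)
```

In this picture, adjusting `z_load` until `reflection_coefficient` vanishes is what "impedance matching" means. The incident power always equals the sum of reflected and transmitted power, so a lossless junction only redistributes energy.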

Paper Structure

This paper contains 22 sections, 5 theorems, 57 equations, and 1 algorithm.

Key Result

Theorem 2.1

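Consider the control problem of minimizing a cost functional with running cost $L(x,u)$ over the horizon $t \in [0,T]$, where $x$ is the state and $u$ the control input, subject to the dynamics $\dot{x} = f(x,u)$. Let $H(x,\lambda,u) = L(x,u) + \lambda^\top f(x,u)$ be the Hamiltonian, with costate $\lambda$. The necessary conditions for optimality are the canonical equations $\dot{x} = \partial H/\partial \lambda$ and $\dot{\lambda} = -\partial H/\partial x$, together with the stationarity condition $\partial H/\partial u = 0$. These equations constitute a Two-Point Boundary Value Problem (TPBVP) due to the split boundary conditions: the state is fixed at $t=0$, while the costate is fixed at $t=T$.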
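
The forward-backward structure of these conditions can be illustrated with a small discrete-time sketch. It assumes a forward-Euler discretization with step `h`, a hypothetical scalar system `f(x, u) = tanh(u * x)`, a running cost `0.5 * u**2`, and a quadratic terminal cost. None of these choices come from the paper; they only make the two sweeps concrete.

```python
import math


def f(x, u):
    # Hypothetical scalar dynamics, chosen only for illustration.
    return math.tanh(u * x)


def df_dx(x, u):
    return u * (1.0 - math.tanh(u * x) ** 2)


def df_du(x, u):
    return x * (1.0 - math.tanh(u * x) ** 2)


def total_cost(controls, x0, target, h):
    # Euler rollout with running cost L = 0.5 * u^2 (assumed)
    # and terminal cost 0.5 * (x_N - target)^2 (assumed).
    x, cost = x0, 0.0
    for u in controls:
        cost += h * 0.5 * u ** 2
        x = x + h * f(x, u)
    return cost + 0.5 * (x - target) ** 2


def pmp_gradient(controls, x0, target, h):
    # Forward sweep: the state is integrated from its condition at t = 0.
    xs = [x0]
    for u in controls:
        xs.append(xs[-1] + h * f(xs[-1], u))

    # Backward sweep: the costate starts from its condition at t = T,
    # so it cannot begin until the forward sweep has finished.
    lam = xs[-1] - target
    grad = [0.0] * len(controls)
    for k in reversed(range(len(controls))):
        u, x = controls[k], xs[k]
        # dH/du with H = L + lam * f, using the costate of step k + 1.
        grad[k] = h * (u + lam * df_du(x, u))
        # Costate recursion: lam_k = lam_{k+1} + h * dH/dx (L has no x term).
        lam = lam + h * lam * df_dx(x, u)
    return grad


controls = [0.3, -0.2, 0.5, 0.1]
gradient = pmp_gradient(controls, x0=1.0, target=0.5, h=0.25)
```

The backward loop needs the final state before it can take its first step, which is the "lock" that the two-point boundary structure imposes. The resulting `gradient` can be checked against central finite differences of `total_cost`.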

Theorems & Definitions (11)

  • Theorem 2.1: Pontryagin Maximum Principle
  • Definition 2.2: Worldsheet $\Sigma$
  • Definition 2.3: Wave variables and wave residuals
  • Proposition 2.4: Cross-coupled $\tau$-dynamics yields forward/backward waves
  • Lemma 2.5: Boundary conditions in scattering form
  • Theorem 2.6: Wave energy balance and dissipation
  • Definition 4.1: Naive Discretized Worldsheet
  • Definition 4.2: Parameter Port Variables
  • Definition 4.3: Parameter Scattering
  • Theorem 4.4: Minimal Reflection Yields Gradient Descent
  • ...and 1 more