
Contrastive learning in tunable dynamical systems

Menachem Stern, Adam G. Frim, Raúl Candás, Andrea J. Liu, Vijay Balasubramanian

Abstract

We generalize the theory of supervised contrastive learning, previously applied to physical systems at equilibrium or steady state, to systems following any dynamics described by coupled ordinary differential equations. We show that if physical dynamics break time reversal symmetry, gradient descent on a cost function embodying the desired behavior cannot be achieved with a scalable process, even in principle. We therefore introduce Probably Approximately Right (PAR) learning processes, composed of a local contrastive learning rule and a scalable supervision protocol. We show that approximate, local supervision with forward propagation of the error signal can be used to successfully train several tunable models of physical dynamics inspired by examples in biological and machine learning.
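The contrastive scheme described in the abstract can be illustrated with a toy example. The sketch below is not the paper's learning rule or supervision protocol: it is a minimal, hypothetical illustration of contrastive learning in the same spirit, applied to a small reciprocal (symmetric) linear system relaxed to steady state, using an equilibrium-propagation-style update that compares clamped and free correlations. The network size, nudge strength, and learning rate are all invented for the demonstration.

```python
import numpy as np

# Minimal sketch (not the paper's rule): contrastive learning on a small
# reciprocal linear system dx/dt = W x + F, relaxed to steady state.
rng = np.random.default_rng(0)
N = 3                      # node 0 = input, node N-1 = output (toy size)
eta, beta = 0.02, 0.2      # learning rate and nudge strength (hypothetical)
target = 2.0               # desired steady output for a unit input drive

W = 0.1 * rng.standard_normal((N, N))
W = 0.5 * (W + W.T)        # reciprocal couplings: w_ab = w_ba
np.fill_diagonal(W, -1.0)  # fixed leaky self-coupling keeps dynamics stable

def steady_state(W, clamped):
    """Euler-integrate dx/dt = W x + F until (approximately) steady."""
    x = np.zeros(N)
    for _ in range(400):
        F = np.zeros(N)
        F[0] = 1.0                              # input drive at node 0
        if clamped:
            F[-1] = beta * (target - x[-1])     # nudge the output to target
        x = x + 0.05 * (W @ x + F)
    return x

def cost(W):
    return 0.5 * (steady_state(W, clamped=False)[-1] - target) ** 2

c0 = cost(W)
for _ in range(300):
    xf = steady_state(W, clamped=False)   # free response
    xc = steady_state(W, clamped=True)    # clamped (nudged) response
    # contrastive Hebbian update: clamped minus free correlations
    dW = (np.outer(xc, xc) - np.outer(xf, xf)) / beta
    np.fill_diagonal(dW, 0.0)             # only the couplings learn
    W += eta * dW
c1 = cost(W)
print(f"cost before: {c0:.3f}, after: {c1:.3f}")
```

At the learning fixed point the clamped and free correlations coincide, which forces the nudge force to vanish and the free output to hit the target; this is the sense in which the free trajectory "iteratively approaches the clamped trajectory."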



Figures (13)

  • Figure 1: Dynamical contrastive learning. We consider dynamical systems in which physical variables $x_a(t)$, indexed by $a$, are coupled through learning degrees of freedom, $w_{ab}$. The effect of $x_a(t)$ on $x_b(t)$, set by $w_{ab}$, may differ from the effect of $x_b(t)$ on $x_a(t)$, set by $w_{ba}$. The system learns by adapting $\vec{w}$ based on a comparison of a free trajectory $\vec{x}^F(t)$ and a clamped trajectory $\vec{x}^C(t)$ nudged by an external supervisor (Eq. \ref{eq:LR3}). During learning, the free trajectory iteratively approaches the clamped trajectory, reducing a cost function.
  • Figure 2: Training reciprocal linear dynamical systems for an output response proportional to the input signal by a factor $p$. (a) The desired output signal (dashed red line) has double the amplitude ($p=2$) of the sinusoidal input signal at the input node, $x_1$ (solid blue line). Before training, the output response $x_{10}$ (orange) in the initialized network is weak. During training, the output node is clamped with a nudge from the free output signal towards the desired signal. After one training period of the input signal, the weights are adjusted according to the local rule of Eq. \ref{eq:LR3} and the clamping on the output node is adjusted according to the forward supervisor of Eq. \ref{eq:NewSup2}. This process is repeated until the desired output signal is achieved within acceptable error. This training paradigm is applied to each of the examples in the following figures unless otherwise noted in the main text. (b) After training, the output oscillator (orange) matches the desired trajectory and the hidden oscillators have weak responses (thin lines near 0). (c) After training, the network generalizes the amplitude-doubling to random input data (blue): the output (orange) matches the desired output (red dashed) well.
  • Figure 3: Training a linear dynamical system to promote a desired temporal lag in the output. (a) The desired output signal (dashed red line) lags the input signal, $F$ (blue), by a time interval. Before training, the responses of the output node, $x_{10}$ (orange), and hidden nodes (thin colored lines) in the initialized network are weak. (b) After training, the output node reproduces the desired trajectory. (c) After training is complete, the network generalizes to a longer time window for an input sinusoid at a different initial phase and for random initial node positions. The network also generalizes correctly for input (d) triangle waves, (e) impulse signals, and (f) composite waves. (g) (Top) Response of the "output node," $x_{10}$ (orange), when a sinusoid is injected at the "input node," $x_1$ (blue), vs. (Bottom) response of the previous "input node," $x_1$ (green), when a sinusoid is instead injected at the previous "output node," $x_{10}$ (purple), showing reciprocal time lags in a trained reciprocal system. (h) Same as (g) but for a non-reciprocal network trained for two different time lags, depending on whether a signal is injected at one node or the other.
  • Figure 4: Training a network of Kuramoto oscillators so that the output oscillator mirrors the dynamics applied at the input oscillator. (a) Before training, the network has weak couplings between nearest neighbors. We want to transmit a signal from an input oscillator (blue) to an output oscillator (orange) on the opposite side of the network. (b) Before training, a time-varying signal at the input oscillator (blue) produces a weak response at the target (dashed orange). The responses of nearly all oscillators in the network are also weak (thin lines). (c) After training, some network couplings strengthen or weaken significantly, as indicated by edge colors. (d) The trained network achieves a dynamical response at the output (dashed orange) that is identical to the input signal (blue solid), as desired, by synchronizing nodes throughout the network.
  • Figure 5: Training snapshots for a network of $N=100$ Kuramoto oscillators spatially organized in two dimensions and coupled locally with an average of $Z=4.62$ neighbors. The oscillators have intrinsic frequencies $\omega_a$ drawn from a normal distribution with mean zero and unit variance, and are initially distributed uniformly in phase with vanishing couplings. The desired behavior is for the oscillators to synchronize to a common global frequency, here $\Omega = \dot{\psi} = 1$, so the desired period of oscillation is $2\pi$; the desired phase for each oscillator as a function of time is shown for five periods (the duration of the trajectory, $T$) by the red dashed line in each plot of the top row. Early in training (left column) no synchronization occurs because the oscillators are decoupled and none of them has the desired phase behavior. As training progresses (second, third, and fourth columns), the couplings adjust so that nearly all of the oscillators acquire the desired behavior. (Top row) Desired phase (red dashed), actual phase of the Kuramoto order parameter $\psi(t)$ (Eq. \ref{eq:Kuramotoorderparameter}) (blue), and phases of individual oscillators (faded colors), over five periods of the desired oscillation, after the number of training periods, each of duration $T$, indicated by the top labels. At the end of training the oscillators have synchronized to the desired frequency. (Second row) Magnitude ($r$) and frequency ($\dot{\psi}$) of the Kuramoto order parameter over the same time intervals depicted in the top row. The magnitude $r$ grows with training and the frequency approaches the desired value, $\dot{\psi} \rightarrow \Omega = 1 = \omega_\text{sync}$, indicating synchrony at the desired frequency. (Third row) Local coupling strengths $K_{ab}$ (blue = negative, red = positive) between oscillators (circles), shown at time $t=T/2$ within each training iteration. The couplings are non-reciprocal, with some growing over time as the oscillators synchronize (phases at $t=T/2$ indicated by a black dot in the circle representing each oscillator). (Fourth row) Phases of all oscillators (blue dots), target phase (red dot), and Kuramoto order parameter $re^{i\psi}$ (orange line, length $= r$, angle $= \psi$) at time $t=T/2$ within each training iteration, showing that nearly all oscillators are synchronized by the end of training.
  • ...and 8 more figures
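The Kuramoto order parameter appearing in Figures 4 and 5 is the standard one, $r e^{i\psi} = \frac{1}{N}\sum_a e^{i\theta_a}$: $r$ measures phase coherence and $\psi$ is the mean phase. As a self-contained illustration of the quantities plotted above — not the paper's trained, locally coupled, non-reciprocal network — the sketch below computes the order parameter and integrates the classic all-to-all Kuramoto model in its mean-field form, where strong uniform coupling drives $r$ from near zero toward one. The system size, coupling $K$, and time step are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, dt, steps = 50, 5.0, 0.01, 2000    # hypothetical size and coupling
omega = rng.standard_normal(N)           # intrinsic frequencies ~ N(0, 1)
theta = rng.uniform(0, 2 * np.pi, N)     # initial phases, uniform on circle

def order_parameter(theta):
    """Kuramoto order parameter: r e^{i psi} = (1/N) sum_a e^{i theta_a}."""
    z = np.exp(1j * theta).mean()
    return np.abs(z), np.angle(z)

r0, _ = order_parameter(theta)
for _ in range(steps):
    r, psi = order_parameter(theta)
    # mean-field form of the all-to-all Kuramoto model:
    #   dtheta_a/dt = omega_a + K r sin(psi - theta_a)
    theta = theta + dt * (omega + K * r * np.sin(psi - theta))
r1, _ = order_parameter(theta)
print(f"r before: {r0:.2f}, after: {r1:.2f}")  # r grows as oscillators lock
```

For $K$ well above the synchronization threshold, most oscillators phase-lock to the mean field and $r$ saturates near one, which is the behavior tracked in the second row of Figure 5.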