Blending Optimal Control and Biologically Plausible Learning for Noise-Robust Physical Neural Networks

Satoshi Sunada, Tomoaki Niiyama, Kazutaka Kanno, Rin Nogami, André Röhm, Takato Awano, Atsushi Uchida

TL;DR

A training approach that merges an optimal control method for continuous-time dynamical systems with a biologically plausible training method, direct feedback alignment (DFA), achieves robust processing even under measurement errors and noise, without requiring detailed system information.

Abstract

The rapidly increasing computational demands of artificial intelligence (AI) have spurred the exploration of computing principles beyond conventional digital computers. Physical neural networks (PNNs) offer efficient neuromorphic information processing by harnessing the innate computational power of physical processes; however, training their weight parameters is computationally expensive. We propose a training approach that substantially reduces this training cost. The approach merges an optimal control method for continuous-time dynamical systems with a biologically plausible training method, direct feedback alignment (DFA). In addition to reducing training time, the approach achieves robust processing even under measurement errors and noise, without requiring detailed system information. Its effectiveness was verified numerically and experimentally in an optoelectronic delay system. Our approach significantly extends the range of physical systems that can practically be used as PNNs.
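For readers unfamiliar with direct feedback alignment, the following is a minimal sketch of the basic idea on a conventional discrete network. It is not the authors' method, which applies DFA to a continuous-time physical system together with an optimal control (adjoint) formulation. The function name `train_dfa`, the layer sizes, the learning rate, and the XOR toy data are all assumptions made for illustration. The point it shows is that the output error is sent to the hidden layer through a fixed random matrix `B` rather than through the transpose of the forward weights, so the update needs no detailed knowledge of the forward path.

```python
import numpy as np


def train_dfa(X, Y, hidden=8, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer tanh network with direct feedback alignment.

    Illustrative toy only: returns the mean squared error recorded at the
    start of each epoch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_in = X.shape
    n_out = Y.shape[1]

    # Forward weights (trained).
    W1 = rng.normal(0.0, 0.5, (n_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, n_out))
    b2 = np.zeros(n_out)

    # Fixed random feedback matrix (never trained). Backpropagation would
    # use W2.T in its place.
    B = rng.normal(0.0, 0.5, (n_out, hidden))

    losses = []
    for _ in range(epochs):
        # Forward pass.
        h = np.tanh(X @ W1 + b1)
        y = h @ W2 + b2
        e = y - Y  # output error
        losses.append(float(np.mean(e ** 2)))

        # DFA step: project the output error straight to the hidden layer
        # through B, then apply the local tanh derivative (1 - h^2).
        dh = (e @ B) * (1.0 - h ** 2)

        # The output layer uses the ordinary delta rule; the hidden layer
        # uses the DFA signal.
        W2 -= lr * (h.T @ e) / n_samples
        b2 -= lr * e.mean(axis=0)
        W1 -= lr * (X.T @ dh) / n_samples
        b1 -= lr * dh.mean(axis=0)
    return losses


# Toy usage on XOR (an assumed example, unrelated to the paper's task).
X_xor = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y_xor = np.array([[0.0], [1.0], [1.0], [0.0]])
history = train_dfa(X_xor, Y_xor)
print(f"initial loss {history[0]:.4f}, final loss {history[-1]:.4f}")
```

Because `B` stays fixed, only the output error has to be measured and broadcast. This is the property that makes DFA-style training attractive for physical systems whose internal details are hard to characterize.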

Paper Structure

This paper contains 6 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: (a) Schematic of physical neural networks (PNNs) with adjustable internal parameters for weight control. (b) Physical system driven by external control signals. (c) Proposed approach based on external control.
  • Figure 2: (a) $r(t)$, $u_1(t)$, and $u_2(t)$ before and after training. $u_1(t)$ and $u_2(t)$ before training are represented by black solid and black dashed lines, respectively. (b) A typical learning curve obtained from a single run (black solid line with circles) and the mean learning curve (blue solid line). (c) Cosine similarity between the update vectors of the DFA-adjoint and adjoint methods. The shaded region represents the standard deviation. (d) Average computational training time per image.
  • Figure 3: (a) $C$ and (b) classification accuracy as a function of $\sigma$. For comparison, the results for ELM (without control) are presented. (c) Accuracy as a function of $\sigma$ in the system trained with a different noise strength $\sigma^*$. (d) Robustness to parameter mismatch, expressed as the error normalized by the true value, $(p - p_0)/p_0$, where $p$ is the parameter value used for training and $p_0$ is the true value of $\beta$, $\tau_L$, or $\tau_H$.
  • Figure 4: (a) Experimental setup for an optoelectronic delay system. LD: semiconductor laser diode, ISO: optical isolator, MZM: Mach-Zehnder modulator, AMP: electrical amplifier, ATT: optical attenuator, PD: photodetector, OSC: digital oscilloscope, AWG: arbitrary waveform generator. (b) Learning curve.