Online Training and Inference System on Edge FPGA Using Delayed Feedback Reservoir

Sosei Ikeda, Hiromitsu Awano, Takashi Sato

TL;DR

The paper tackles the challenge of online training for delayed feedback reservoirs (DFRs) on edge hardware by introducing a modular DFR that supports backpropagation-based reservoir optimization, a truncated backpropagation scheme to curb memory use, and an in-place Ridge regression method via 1-D Cholesky decomposition. These innovations collectively enable real-time online training and inference on FPGA, achieving substantial gains in speed (up to ~1/13 of software time) and energy (up to ~1/27) while maintaining accuracy comparable to grid-search baselines. A fast parameter optimization method reduces the previously prohibitive grid-search cost by about 700×, and the 1-D Cholesky-based Ridge regression reduces memory by roughly 75% without sacrificing performance. The work demonstrates the viability of end-to-end online edge processing for time-series tasks using DFRs, with clear implications for predictive maintenance and other real-time edge applications.

Abstract

A delayed feedback reservoir (DFR) is a hardware-friendly reservoir computing system. Implementing DFRs in embedded hardware requires efficient online training. However, two main challenges prevent this: hyperparameter selection, which is typically done by offline grid search, and training of the output linear layer, which is memory-intensive. This paper introduces a fast and accurate parameter optimization method for the reservoir layer that uses backpropagation and gradient descent, made possible by adopting a modular DFR model. A truncated backpropagation strategy is proposed to reduce the memory consumed by unrolling the recursive structure while maintaining accuracy. The computation time is significantly reduced compared to grid search. Additionally, an in-place Ridge regression for the output layer via 1-D Cholesky decomposition is presented, reducing memory usage to 1/4. These methods enable the realization of an online edge training and inference system of DFR on an FPGA, reducing computation time to about 1/13 and power consumption to about 1/27 of those of a software implementation on the same board.
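
To make the output-layer idea concrete, the following is a minimal sketch of Ridge regression solved by a Cholesky factorization that overwrites a 1-D packed array. It is an illustration under stated assumptions, not the authors' algorithm: the function names `tri` and `ridge_packed_cholesky`, the row-major packed layout, the scalar target, and the regularization symbol `beta` are all hypothetical choices. Packed storage keeps only the n(n+1)/2 lower-triangular entries of the symmetric matrix; the sketch does not attempt to reproduce the paper's reported 1/4 memory figure.

```python
import math

def tri(i, j):
    """Position of element (i, j), with i >= j, in row-major packed lower-triangular storage."""
    return i * (i + 1) // 2 + j

def ridge_packed_cholesky(X, y, beta):
    """Solve (X^T X + beta*I) w = X^T y with an in-place Cholesky on a 1-D array.

    X is a list of feature rows, y a list of scalar targets (an assumption made
    for brevity), and beta > 0 the Ridge regularization strength.
    """
    n = len(X[0])
    a = [0.0] * (n * (n + 1) // 2)   # packed lower triangle of X^T X + beta*I
    b = [0.0] * n                    # X^T y, later overwritten by the solution

    # Accumulate one sample at a time, as a streaming system would.
    for x_t, y_t in zip(X, y):
        for i in range(n):
            b[i] += x_t[i] * y_t
            for j in range(i + 1):
                a[tri(i, j)] += x_t[i] * x_t[j]
    for i in range(n):
        a[tri(i, i)] += beta

    # Row-by-row Cholesky: each entry of a is overwritten by the matching entry of L.
    for i in range(n):
        for j in range(i + 1):
            s = a[tri(i, j)]
            for k in range(j):
                s -= a[tri(i, k)] * a[tri(j, k)]
            a[tri(i, j)] = math.sqrt(s) if i == j else s / a[tri(j, j)]

    # Forward substitution: solve L z = b, storing z in b.
    for i in range(n):
        s = b[i]
        for k in range(i):
            s -= a[tri(i, k)] * b[k]
        b[i] = s / a[tri(i, i)]

    # Back substitution: solve L^T w = z, storing w in b.
    for i in range(n - 1, -1, -1):
        s = b[i]
        for k in range(i + 1, n):
            s -= a[tri(k, i)] * b[k]
        b[i] = s / a[tri(i, i)]
    return b

# Tiny usage example with made-up data.
weights = ridge_packed_cholesky([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0], 1.0)
```

Because the factor and the solution overwrite the arrays that held the accumulated statistics, no second matrix is allocated; that is the sense in which the sketch is "in place".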

Paper Structure

This paper contains 19 sections, 32 equations, 15 figures, 12 tables, 5 algorithms.

Figures (15)

  • Figure 1: Conceptual diagram of a DFR. The reservoir consists of a nonlinear element (NL) and a feedback loop with a total delay $\tau$. The feedback loop comprises $N_x$ virtual nodes spaced at a time interval $\theta$.
  • Figure 2: Masking process. Signal $i$ is obtained by digital-to-analog conversion of the input signal $u$ and is held constant over each period $\tau$. The mask signal $m$ takes a different value in each time interval $\theta$ and repeats with period $\tau$. The input signal to the reservoir is expressed as $j(t) = i(t) \cdot m(t)$.
  • Figure 3: Block diagram of reservoir processing in the modular DFR model. The block labeled "$f$" operates as a one-input, one-output function $f$ in Eq. (\ref{eq:MG}). Only two parameters, $p$ and $q$, have to be optimized.
  • Figure 4: Computation graph of forward and backward propagation in the output layer.
  • Figure 5: Computation graph of forward and backward propagation in the DPRR layer.
  • ...and 10 more figures
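
The masking step described in the Figure 2 caption, $j(t) = i(t) \cdot m(t)$, can be sketched in discrete time as follows. This is a minimal illustration, assuming that $\tau = N_x \theta$ so that each input sample is held for exactly $N_x$ mask steps; the function name `mask_input` and the example mask values are hypothetical and not taken from the paper.

```python
def mask_input(u, mask):
    """Return the masked input as one row per input sample.

    u    : list of input samples, each held constant for one period tau
    mask : list of N_x mask values, one per interval theta, repeated every tau
    Row k holds j for the k-th period: j[k][n] = u[k] * mask[n].
    """
    return [[u_k * m_n for m_n in mask] for u_k in u]

# Two input samples and a three-value mask (made-up values).
j = mask_input([2.0, -1.0], [1, -1, 1])
# j == [[2.0, -2.0, 2.0], [-1.0, 1.0, -1.0]]
```

Each row corresponds to one delay period, and each column to one virtual node, which is the layout the reservoir consumes node by node.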