Deep Photonic Reservoir Computer for Speech Recognition

Enrico Picco, Alessandro Lupo, Serge Massar

TL;DR

The paper addresses the performance gap between energy-efficient reservoir computing and deep neural networks by implementing photonic deep reservoir computing (DRC) for speech recognition. It employs two experimental configurations: a deep stack of delay-based reservoirs and a two-layer setup with CMA-ES-optimized interlayer connections, with hyperparameters tuned via Bayesian optimization. The approach is validated on spoken digits and Japanese vowels, showing that deep architectures outperform shallow RC and that optimized interlayer connections yield the best accuracy, while enabling real-time processing on photonic hardware. This work demonstrates the viability of low-power, high-speed neuromorphic photonic hardware for practical speech recognition and offers design guidelines for future DRC systems.

Abstract

Speech recognition is a critical task in the field of artificial intelligence and has witnessed remarkable advancements thanks to large and complex neural networks, whose training process typically requires massive amounts of labeled data and computationally intensive operations. An alternative paradigm, reservoir computing, is energy efficient and is well adapted to implementation in physical substrates, but exhibits limitations in performance when compared to more resource-intensive machine learning algorithms. In this work we address this challenge by investigating different architectures of interconnected reservoirs, all falling under the umbrella of deep reservoir computing. We propose a photonic-based deep reservoir computer and evaluate its effectiveness on different speech recognition tasks. We show specific design choices that aim to simplify the practical implementation of a reservoir computer while simultaneously achieving high-speed processing of high-dimensional audio signals. Overall, with the present work we hope to help the advancement of low-power and high-performance neuromorphic hardware.

Paper Structure (16 sections, 6 equations, 8 figures)

Figures (8)

  • Figure 1: Architecture of a standard, or "shallow", reservoir computer. The fixed input and reservoir connections are represented with solid arrows, whereas the trained output connections with dashed arrows.
  • Figure 2: Architecture of a deep reservoir computer. Solid arrows represent fixed interconnections; dashed arrows represent trained interconnections. The grey solid arrows between the reservoir layers represent the random untrained interlayer connections, described by matrix $\mathbf{W}_l$.
  • Figure 3: Architecture of a deep reservoir computer where the interlayer mask $\mathbf{W}_2$ is optimized by means of an Evolutionary Algorithm. In this configuration both the interlayer and output weights are trained (dashed arrows). The input and internal reservoir connections are fixed (solid arrows).
  • Figure 4: Architecture of a delay-based reservoir computer. The reservoir states are obtained by time-multiplexing the input signal using a periodic input mask (in purple) and a Non-Linear (NL) node. The rest of the scheme is similar to a standard reservoir computer, where only the output weights are trained.
  • Figure 5: The experimental optoelectronic system used in this work. The optic fiber connections are in orange, and the electronic ones in blue. LED: Superluminescent diode. MZM: Mach-Zehnder Intensity Modulator. PD: Photodetector. FPGA: Field Programmable Gate Array. Att: Optical Attenuator. Spool: 1.7 km optic fiber spool.
  • ...and 3 more figures
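
The architectures in the captions of Figures 2 and 4 can be illustrated with a small numerical sketch. It is not the authors' implementation: the sine nonlinearity, the ring-like coupling between virtual nodes, the gains `alpha` and `beta`, the layer sizes, the toy target, and the choice to feed the states of both layers to the readout are all assumptions made for illustration. The only elements taken from the captions are the fixed input mask, the fixed random interlayer matrix (called `W2` here), and the trained linear output weights.

```python
import numpy as np

def delay_reservoir(u, mask, alpha=0.5, beta=0.9):
    """Simulate one delay-based reservoir layer (illustrative model only).

    u:    input sequence, shape (T, d)
    mask: fixed, untrained input mask, shape (N, d)
    Returns the reservoir states, shape (T, N).
    """
    T = u.shape[0]
    N = mask.shape[0]
    states = np.zeros((T, N))
    prev = np.zeros(N)
    for t in range(T):
        # Masking spreads each input sample over N virtual nodes.
        drive = mask @ u[t]
        # Assumed dynamics: node i sees node i-1 from the previous step,
        # passed through a sine nonlinearity (a hypothetical choice).
        prev = np.sin(alpha * np.roll(prev, 1) + beta * drive)
        states[t] = prev
    return states

def train_readout(X, Y, reg=1e-6):
    """Ridge regression: the only trained weights in this sketch."""
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ Y)

rng = np.random.default_rng(0)
T, d, N = 200, 3, 20                      # hypothetical sizes
u = rng.uniform(-1, 1, size=(T, d))       # stand-in for audio features

mask1 = rng.uniform(-1, 1, size=(N, d))              # fixed input mask
W2 = rng.uniform(-1, 1, size=(N, N)) / np.sqrt(N)    # fixed random interlayer mask

x1 = delay_reservoir(u, mask1)   # first layer driven by the input
x2 = delay_reservoir(x1, W2)     # second layer driven by first-layer states

# Assumption: the readout sees both layers plus a bias column.
states = np.hstack([x1, x2, np.ones((T, 1))])

# Toy target: recall the first input channel from the previous time step.
target = np.roll(u[:, :1], 1, axis=0)
W_out = train_readout(states, target)
pred = states @ W_out
```

In the configuration of Figure 3, `W2` would no longer be drawn at random but tuned by an evolutionary algorithm, while the input mask and the internal reservoir dynamics stay fixed.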