Synaptic metaplasticity with multi-level memristive devices

Simone D'Agostino, Filippo Moro, Tifenn Hirtzlin, Julien Arcamone, Niccolò Castellani, Damien Querlioz, Melika Payvand, Elisa Vianello

TL;DR

Addresses catastrophic forgetting in sequential task learning by extending metaplasticity to quantized neural networks (QNNs) and implementing it with memristor-based in-memory computing. The paper introduces a metaplastic update rule that uses a meta-function $\mathcal{M}(W_H)$, defined over intervals $\mathcal{I}^D_i=\mathcal{Q}_{i+1}-\mathcal{Q}_i$ with $m^*$ controlling consolidation, and demonstrates that $m^* \in [2,4]$ yields robust two-task learning on MNIST and Fashion-MNIST. Hardware validation couples a memristor crossbar storing weights as conductance levels with a digital unit for high-precision metaplastic storage, using a mixed analog/digital architecture and a 16 kbit 1T1R hafnium oxide array. The experiments show near software-baseline accuracy on both tasks (e.g., MNIST ~97.5%, Fashion-MNIST ~86%), with large memory savings (approximately $15\times$) and favorable endurance characteristics, supporting on-chip continual learning for edge devices.
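The summary above names the meta-function $\mathcal{M}(W_H)$, the interval widths $\mathcal{I}^D_i$ and the consolidation parameter $m^*$, but does not give the closed form. The Python sketch below is only an illustration under stated assumptions: it borrows the $1-\tanh^2$ attenuation used in earlier BNN metaplasticity work, scales its steepness as $m^*/\mathcal{I}^D_i$ for the interval containing the hidden weight, and attenuates only updates that move the hidden weight toward zero. The names `interval_index`, `meta_factor` and `metaplastic_update` are hypothetical, as are the example levels.

```python
import math
from bisect import bisect_right


def interval_index(levels, w_h):
    """Index i with levels[i] <= w_h < levels[i + 1], clamped to a valid interval."""
    i = bisect_right(levels, w_h) - 1
    return min(max(i, 0), len(levels) - 2)


def meta_factor(levels, w_h, m_star):
    """Attenuation in (0, 1] for a hidden weight w_h.

    ASSUMPTION: the form 1 - tanh^2(m_i * |w_h|) with m_i = m_star / I_i is
    illustrative and not taken from the paper.
    """
    i = interval_index(levels, w_h)
    width = levels[i + 1] - levels[i]  # I_i = Q_{i+1} - Q_i
    return 1.0 - math.tanh(m_star / width * abs(w_h)) ** 2


def metaplastic_update(w_h, grad, lr, levels, m_star):
    """One gradient step on a hidden weight with assumed metaplastic attenuation."""
    delta = -lr * grad
    # ASSUMPTION: only updates that shrink |w_h| are attenuated.
    if delta * w_h < 0:
        delta *= meta_factor(levels, w_h, m_star)
    return w_h + delta


# Hypothetical, unequally spaced quantization levels Q_i.
LEVELS = [-1.0, -0.5, -0.1, 0.1, 0.5, 1.0]
consolidated = metaplastic_update(0.8, 1.0, 0.1, LEVELS, m_star=3.0)
reinforced = metaplastic_update(0.8, -1.0, 0.1, LEVELS, m_star=3.0)
```

In this sketch a weight that already sits far from zero barely moves back (`consolidated` stays close to 0.8), while an update that pushes it further out is applied in full (`reinforced` is about 0.9). A larger `m_star` makes the attenuation stronger.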

Abstract

Deep learning has made remarkable progress in various tasks, surpassing human performance in some cases. However, one drawback of neural networks is catastrophic forgetting, where a network trained on one task forgets the solution when learning a new one. To address this issue, recent works have proposed solutions based on Binarized Neural Networks (BNNs) incorporating metaplasticity. In this work, we extend this solution to quantized neural networks (QNNs) and present a memristor-based hardware solution for implementing metaplasticity during both inference and training. We propose a hardware architecture that integrates quantized weights in memristor devices programmed in an analog multi-level fashion with a digital processing unit for high-precision metaplastic storage. We validated our approach using a combined software framework and a memristor-based crossbar array for in-memory computing fabricated in 130 nm CMOS technology. Our experimental results show that a two-layer perceptron achieves 97% and 86% accuracy on consecutive training of MNIST and Fashion-MNIST, equal to the software baseline. This result demonstrates the immunity to catastrophic forgetting and the resilience to analog device imperfections of the proposed solution. Moreover, our architecture is compatible with the limited endurance of memristors and has a 15x reduction in memory

Paper Structure (4 sections, 2 equations, 5 figures, 1 table, 1 algorithm)

Figures (5)

  • Figure 1: (a) The "catastrophic forgetting" problem: a network is trained sequentially with two different training sets (here MNIST and Fashion-MNIST). When learning Fashion-MNIST, the MNIST test accuracy collapses almost to random guessing. (b) The black arrows depict the paths inside the parameter space when using a classic learning sequence of MNIST and Fashion-MNIST, while the green arrows show the paths traversed using metaplasticity training. (c) Metaplastic function $\mathcal{M}$ on a set of unequally spaced $\mathcal{Q}$ levels; it is used to implement metaplasticity on QNNs by modulating the weight updates.
  • Figure 2: (a) Impact of the $m^*$ factor on MNIST and Fashion-MNIST sequential learning: accuracy plot after $50$ epochs. (b) Sequential learning of MNIST and Fashion-MNIST with $m^*=2.2$. Comparison with a BNN with two hidden layers of $4096$ neurons each. (c) Sequential learning of MNIST and Fashion-MNIST with $m^*=3$. Baseline: BNN with two hidden layers of $4096$ neurons each.
  • Figure 3: Schematic of the on-chip mixed analog/digital learning architecture. The analog in-memory computing block has several crossbar arrays programmed in a multi-level fashion to store the hidden weights. This analog block performs the Forward (blue) and Backward (red) propagations. The resulting values are used to compute the hidden-weight updates stored in the digital memory. The memristor conductance values are updated accordingly.
  • Figure 4: (a) SEM image of a 1T1R hafnium-based memristive device, with the memristor highlighted (blue). (b) Cumulative distribution function of 8-level HCS, programmed with different $I_{CC}$ programming current values, and LCS.
  • Figure 5: (a) Sequential learning of MNIST and Fashion-MNIST in a hybrid software/hardware experiment with 15 repetitions. Comparison with a BNN with two hidden layers of $4096$ neurons each. (b) Percentage of devices as a function of the number of programming operations after training on MNIST and Fashion-MNIST.
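
The captions for Figures 3 and 5b describe hidden-weight updates kept in digital memory, memristor conductances updated accordingly, and a per-device count of programming operations. As a rough illustration of how such a count could be tracked, the Python sketch below assumes that a device is reprogrammed only when the nearest quantization level of its hidden weight changes. That policy, the helper names `nearest_level` and `apply_updates`, and the example values are all hypothetical and not taken from the paper.

```python
def nearest_level(levels, w_h):
    """Index of the quantization level closest to the hidden weight w_h."""
    return min(range(len(levels)), key=lambda i: abs(levels[i] - w_h))


def apply_updates(hidden, programmed, counts, deltas, levels):
    """Apply digital hidden-weight updates and count device reprogramming.

    hidden     -- high-precision hidden weights (digital memory), updated in place
    programmed -- level index currently stored in each device, updated in place
    counts     -- programming operations per device, updated in place
    ASSUMPTION: a device is written only when its target level index changes.
    """
    for k, delta in enumerate(deltas):
        hidden[k] += delta
        target = nearest_level(levels, hidden[k])
        if target != programmed[k]:
            programmed[k] = target
            counts[k] += 1


# Hypothetical three-level example with two devices.
levels = [-1.0, 0.0, 1.0]
hidden = [0.1, 0.4]
programmed = [nearest_level(levels, w) for w in hidden]
counts = [0, 0]
apply_updates(hidden, programmed, counts, [0.1, 0.2], levels)
```

Here the first device keeps its level and is never written, while the second crosses to the next level and is programmed once. A histogram of `counts` over all devices would give a distribution of the kind plotted in Figure 5b, under the assumed write policy.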