Synaptic metaplasticity with multi-level memristive devices
Simone D'Agostino, Filippo Moro, Tifenn Hirtzlin, Julien Arcamone, Niccolò Castellani, Damien Querlioz, Melika Payvand, Elisa Vianello
TL;DR
This paper addresses catastrophic forgetting in sequential task learning by extending metaplasticity to quantized neural networks (QNNs) and implementing it with memristor-based in-memory computing. It introduces a metaplastic update rule built on a meta-function $\mathcal{M}(W_H)$, defined over intervals $\mathcal{I}^D_i=\mathcal{Q}_{i+1}-\mathcal{Q}_i$ with $m^*$ controlling consolidation, and demonstrates that $m^* \in [2,4]$ yields robust two-task learning on MNIST and Fashion-MNIST. Hardware validation couples a memristor crossbar, which stores weights as conductance levels, with a digital unit for high-precision metaplastic storage, in a mixed analog/digital architecture built around a 16 kbit 1T1R hafnium oxide array. The experiments show near software-baseline accuracy on both tasks (e.g., MNIST ~97.5%, Fashion-MNIST ~86%), with large memory savings (approximately $15\times$) and favorable endurance characteristics, supporting on-chip continual learning for edge devices.
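To make the idea of a metaplastic update on quantized weights more concrete, the following is a minimal NumPy sketch. It is not the paper's definition of $\mathcal{M}(W_H)$, which this summary does not give. As an assumption, it borrows the $1-\tanh^2$ attenuation used in metaplastic binarized networks and applies it to the distance between a hidden weight and its nearest decision threshold, normalized by the level spacing. The five uniform levels, the function names, and the rule of attenuating only steps that move toward a threshold are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical quantization levels Q_i (five uniform levels); not taken from the paper.
LEVELS = np.linspace(-1.0, 1.0, 5)


def quantize(w_hidden, levels=LEVELS):
    """Return the quantization level nearest to each hidden weight."""
    w_hidden = np.asarray(w_hidden, dtype=float)
    idx = np.abs(w_hidden[..., None] - levels).argmin(axis=-1)
    return levels[idx]


def meta_function(w_hidden, m_star, levels=LEVELS):
    """Assumed attenuation factor in (0, 1], not the paper's M(W_H).

    It equals 1 at a decision threshold (midpoint between adjacent levels)
    and decays as the hidden weight moves away from that threshold.
    """
    w_hidden = np.asarray(w_hidden, dtype=float)
    thresholds = (levels[:-1] + levels[1:]) / 2.0
    widths = np.diff(levels)  # interval widths Q_{i+1} - Q_i
    dist = np.abs(w_hidden[..., None] - thresholds)
    nearest = dist.argmin(axis=-1)
    d = dist.min(axis=-1) / widths[nearest]  # distance normalized by interval width
    return 1.0 - np.tanh(m_star * d) ** 2


def metaplastic_step(w_hidden, grad, lr, m_star, levels=LEVELS):
    """One gradient step; steps toward the nearest threshold are attenuated."""
    w_hidden = np.asarray(w_hidden, dtype=float)
    step = -lr * np.asarray(grad, dtype=float)
    thresholds = (levels[:-1] + levels[1:]) / 2.0
    nearest_thr = thresholds[np.abs(w_hidden[..., None] - thresholds).argmin(axis=-1)]
    # A step "toward" the threshold is one that could change the quantized weight.
    toward = np.sign(step) == np.sign(nearest_thr - w_hidden)
    scale = np.where(toward, meta_function(w_hidden, m_star, levels), 1.0)
    return w_hidden + scale * step


w = np.array([0.4, -0.9, 0.1])
w_new = metaplastic_step(w, grad=np.array([1.0, -1.0, 1.0]), lr=0.1, m_star=3.0)
print(quantize(w), quantize(w_new))
```

In this sketch a larger `m_star` shrinks the attenuation factor faster, so weights that sit far from a threshold become harder to move back across it. That is the consolidation behavior the TL;DR attributes to $m^*$, reproduced here only under the stated assumptions.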
Abstract
Deep learning has made remarkable progress in various tasks, surpassing human performance in some cases. However, neural networks suffer from catastrophic forgetting: a network trained on one task forgets its solution when it learns a new one. To address this issue, recent works have proposed solutions based on Binarized Neural Networks (BNNs) incorporating metaplasticity. In this work, we extend this solution to quantized neural networks (QNNs) and present a memristor-based hardware solution for implementing metaplasticity during both inference and training. We propose a hardware architecture that integrates quantized weights, stored in memristor devices programmed in an analog multi-level fashion, with a digital processing unit for high-precision metaplastic storage. We validated our approach using a combined software framework and a memristor-based crossbar array for in-memory computing fabricated in 130 nm CMOS technology. Our experimental results show that a two-layer perceptron achieves 97% and 86% accuracy on consecutive training of MNIST and Fashion-MNIST, equal to the software baseline. This result demonstrates that the proposed solution is immune to catastrophic forgetting and resilient to analog device imperfections. Moreover, our architecture is compatible with the limited endurance of memristors and has a 15x reduction in memory
