Modeling Memristor-Based Neural Networks with Manhattan Update: Trade-offs in Learning Performance and Energy Consumption
Walter Quiñonez, María José Sánchez, Diego Rubi
TL;DR
This work tackles online training of memristor-based neural networks using the hardware-friendly Manhattan update, under realistic device non-idealities such as nonlinear potentiation/depression (P/D) curves, finite conductance windows, and limited multilevel resolution. By simulating single-perceptron (SP) and deep neural network (DNN) architectures trained on MNIST, the authors quantify how the non-linearity index $NLI$, the conductance range, and the level count $L$ affect convergence and accuracy. They find that the SP tolerates $NLI \leq 10^{-2}$ and the DNN tolerates $NLI \leq 10^{-3}$, with accuracy improving as $L$ increases. A key contribution is the $G_{fix}$ strategy, which fixes one memristor in each differential pair and cuts training energy by up to about 45% in the DNN (and ~20% in the SP) with minimal accuracy loss, demonstrating effective device–algorithm co-design. Overall, the results show that Manhattan-rule-based memristive learning can achieve scalable, low-power online training suitable for edge AI, provided that non-idealities and conductance budgets are carefully controlled.
Abstract
We present a systematic study of memristor-based neural networks trained with the hardware-friendly Manhattan update rule, focusing on the trade-offs between learning performance and energy consumption. Using realistic models of potentiation/depression (P/D) curves, we evaluate the impact of nonlinearity (NLI), conductance range, and number of accessible levels on both a single perceptron (SP) and a deep neural network (DNN) trained on the MNIST dataset. Our results show that SPs tolerate P/D nonlinearity up to NLI $\leq 0.01$, while DNNs require the stricter condition NLI $\leq 0.001$ to preserve accuracy. Increasing the number of discrete conductance states improves convergence, effectively acting as a finer learning rate. We further propose a strategy in which one memristor of each differential pair is fixed, reducing redundant memristor conductance updates. This approach lowers training energy by nearly 50% in the DNN with little to no loss in accuracy. Our findings highlight the importance of device-algorithm co-design in enabling scalable, low-power neuromorphic hardware for edge AI applications.
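To make the update rule concrete, the following is a minimal sketch of a Manhattan step on differential memristor pairs, not the authors' simulation code. It assumes an idealized device with equally spaced levels (so NLI = 0), a weight proportional to the difference of the two level indices, and a count of reprogrammed devices as a crude stand-in for energy. The names `manhattan_step`, `level_pos`, `level_neg`, and `fix_neg` are hypothetical.

```python
import numpy as np

def manhattan_step(level_pos, level_neg, grad, n_levels, fix_neg=False):
    """One Manhattan update on differential pairs stored as integer level indices.

    Assumption: weight is proportional to (level_pos - level_neg) and the
    levels are equally spaced (ideal, linear P/D curve).
    """
    # Manhattan rule: only the sign of the gradient is used, so every
    # updated device moves by exactly one conductance level.
    step = -np.sign(grad).astype(int)
    new_pos = np.clip(level_pos + step, 0, n_levels - 1)
    if fix_neg:
        # Fixed-memristor variant: the second device of each pair is never programmed.
        new_neg = level_neg.copy()
    else:
        new_neg = np.clip(level_neg - step, 0, n_levels - 1)
    # Number of devices whose level actually changed (crude energy proxy).
    pulses = int(np.sum(new_pos != level_pos) + np.sum(new_neg != level_neg))
    return new_pos, new_neg, pulses

level_pos = np.array([5, 0, 9])
level_neg = np.array([5, 5, 5])
grad = np.array([1.0, 2.0, -3.0])

pos_a, neg_a, pulses_both = manhattan_step(level_pos, level_neg, grad, n_levels=10)
pos_b, neg_b, pulses_fixed = manhattan_step(level_pos, level_neg, grad, n_levels=10, fix_neg=True)
print(pulses_both, pulses_fixed)
```

In this toy setting, a larger `n_levels` makes each one-level step a smaller fraction of the conductance window, which is the sense in which more levels behave like a finer learning rate. Fixing one device per pair removes its programming events entirely, at the cost of halving the weight change per step.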
