Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing

Nanyang Ye, Qiao Sun, Yifei Wang, Liujia Yang, Jundong Zhou, Lei Wang, Guang-Zhong Yang, Xinbing Wang, Chenghu Zhou, Wei Ren, Leilei Gu, Huaqiang Wu, Qinying Gu

TL;DR

An integrated approach enables the use of analog computing in much deeper and wider networks and significantly outperforms existing methods on diverse tasks such as image classification, autonomous driving, species identification, and large vision-language models, achieving up to 100-fold improvements.

Abstract

Analog computing using non-volatile memristors has emerged as a promising solution for energy-efficient deep learning. New materials, such as perovskite-based memristors, have recently attracted attention owing to their cost-effectiveness, energy efficiency, and flexibility. Yet material diversity and immature fabrication processes mean that device development requires extensive experimentation. Moreover, significant non-idealities in these memristors often impede their use in computing. Here, we propose a synergistic methodology that concurrently optimizes perovskite memristor fabrication and develops robust analog DNNs that effectively address the inherent non-idealities of these memristors. Employing Bayesian optimization (BO) with a focus on usability, we efficiently identify optimal materials and fabrication conditions for perovskite memristors. Meanwhile, we develop "BayesMulti", a DNN training strategy that uses BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections. Our approach theoretically ensures that, within a certain range of parameter perturbations due to memristor non-idealities, the prediction outcomes remain consistent. Our integrated approach enables the use of analog computing in much deeper and wider networks and significantly outperforms existing methods on diverse tasks such as image classification, autonomous driving, species identification, and large vision-language models, achieving up to 100-fold improvements. We further validate our methodology on a 10$\times$10 optimized perovskite memristor crossbar, demonstrating high accuracy in a classification task and low energy consumption. This study offers a versatile solution for the efficient optimization of various analog computing systems, encompassing both devices and algorithms.

Paper Structure

This paper contains 2 sections, 3 theorems, 28 equations, 17 figures.

Key Result

Theorem 1

(Robustness guarantee of noise injection). Given the analog DNN model $f$ trained with induced multinomial noises $\pi_{0}$ following the PDF in Equation eq:multinomial, $f_{\pi_{0}}(\theta_{0}) := \mathbb{E}_{\eta \sim \pi_{0}}[f(\theta_{0}*\eta)]$, where $\theta_{0}$ denotes the DNN's parameters. The ma where $\Theta$ is the dimension of $\theta_{0}$, i.e., the number of parameters.
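
The smoothed model $f_{\pi_{0}}(\theta_{0})$ in the theorem statement is an expectation over noise, so it can be approximated by Monte Carlo sampling. The following is a minimal sketch under stated assumptions, not the authors' implementation: the multinomial PDF (Equation eq:multinomial) is not reproduced in this summary, so the noise levels `levels` and their probabilities `probs` are hypothetical placeholders, `*` is assumed to be an elementwise product, and `toy_classifier` is an illustrative stand-in for the analog DNN $f$.

```python
import numpy as np


def smoothed_predict(f, theta, levels, probs, n_samples=1000, seed=0):
    """Monte Carlo estimate of E_{eta ~ pi_0}[ f(theta * eta) ].

    Each entry of eta is drawn independently from a discrete distribution
    over `levels` with probabilities `probs` (a hypothetical stand-in for
    the multinomial noise pi_0 referenced in the theorem).
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        # One multiplicative noise factor per parameter.
        eta = rng.choice(levels, size=theta.shape, p=probs)
        total = total + f(theta * eta)
    return total / n_samples


def toy_classifier(theta, x):
    """Illustrative linear softmax classifier standing in for f."""
    logits = theta @ x
    z = np.exp(logits - logits.max())  # shift logits for numerical stability
    return z / z.sum()


# Hypothetical parameters and input (3 classes, 2 features).
theta0 = np.array([[1.0, -0.5], [0.2, 0.8], [-0.3, 0.1]])
x = np.array([1.0, 2.0])
f = lambda th: toy_classifier(th, x)

# Hypothetical noise: each weight is scaled by 0.9, 1.0, or 1.1.
levels = np.array([0.9, 1.0, 1.1])
probs = np.array([0.25, 0.5, 0.25])

p_clean = f(theta0)
p_smooth = smoothed_predict(f, theta0, levels, probs)
print("clean prediction:   ", p_clean.argmax())
print("smoothed prediction:", p_smooth.argmax())
```

In this toy setting the perturbations are small enough that the predicted class never changes, which is the kind of consistency the theorem is concerned with; the actual bound on admissible perturbations is given in the paper and is not reconstructed here.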

Figures (17)

  • Figure 1: (a) Illustration of the Bayesian fabrication optimization process. (b) SEM image of a well-performing memristor; (c)-(e) SEM images of poorly performing memristors with incomplete (c), discontinuous (d), and excessive (e) perovskite growth. (f) Uniform and complete growth of perovskite inside the nanowires ensures the formation of Ag filaments, whereas in the irregular cases filament formation and rupture are disrupted. (g) Schematic showing differences in the number of electron movement pathways in NWs and QWs.
  • Figure 2: (a) Experimental results on the MNIST dataset. Left: a schematic demonstration of the task. Right: the curve charts compare the prediction accuracy of BayesMulti and ERM at different usability levels on MLP and LeNet. (b) Experimental results on the CIFAR-10 dataset. Left: a schematic demonstration of the task. Right: the curve charts compare the prediction accuracy of BayesMulti and ERM at different usability levels on AlexNet, MobileNet, ResNeXt, and ResNet. Each method was run 10 times; the curve charts show the mean (dot) and standard deviation (shaded areas) of accuracy at each usability level.
  • Figure 3: (a) The experimental setting of the object detection task on KITTI. Three types of objects were detected: cars, pedestrians, and cyclists. (b) Detection accuracy of BayesMulti and ERM on all KITTI dataset subsets (Easy, Moderate, and Hard), together with the average performance. Each method was run 10 times; the curve charts show the mean (dot) and standard deviation (shaded areas) of accuracy at each usability level. (c) Visualization of object detection results. The top figure shows the 3D and BEV views of the ground-truth detection result. The bottom-left figure is BayesMulti's result and the bottom-right figure is ERM's result under different analog noise levels.
  • Figure 4: (a) A schematic demonstration of the task of predicting neutralization effects of Abs, with BayesMulti applied to Mason's CNN. (b) The accuracy, AUC, and MCC of BayesMulti and ERM at different hardware non-idealities (usability levels) on the HIV dataset and (c) the CoVAbDab SARS-CoV-2 dataset. (d) A schematic demonstration of the species prediction task from glycan representations, with BayesMulti applied to SweetNet. (e) Taxonomic glycan representations learned by SweetNet trained with ERM and BayesMulti. The representations are shown via t-SNE and are colored by taxonomic kingdom. (f) The accuracy, F1 score, and ROC of BayesMulti and ERM at different hardware non-idealities (usability levels) on the species prediction dataset. Each method was run 10 times; the curve charts show the mean (dot) and standard deviation (shaded areas) of accuracy at each usability level.
  • Figure 5: (a) A schematic demonstration of the structure of miniGPT-4, with BayesMulti applied to the linear layers. (b) The performance of BayesMulti and ERM at different hardware non-idealities ($\sigma$ values) on a CIFAR-10 classification task, an MSCOCO object counting task, and an MSCOCO visual question answering task. Each method was run 30 times; the curve charts show the mean (dot) and standard deviation (shaded areas) of accuracy at each usability level. (c) Performance of ERM-equipped and BayesMulti-equipped miniGPT-4 on a practical dialogue task under different hardware non-idealities (usability = 0.74, 0.69, and 0.57). ERM fails to generate correct image descriptions at usability 0.69 and below, while BayesMulti yields coherent answers aligned with the visual content in all three cases.
  • ...and 12 more figures

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Lemma 1
  • proof : Proof of Lemma 3