Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

Jichang Yang, Hegan Chen, Jia Chen, Songqi Wang, Shaocong Wang, Yifei Yu, Xi Chen, Bo Wang, Xinyuan Zhang, Binbin Cui, Yi Li, Ning Lin, Meng Xu, Yi Li, Xiaoxin Xu, Xiaojuan Qi, Zhongrui Wang, Xumeng Zhang, Dashan Shang, Han Wang, Qi Liu, Kwang-Ting Cheng, Ming Liu

TL;DR

Inspired by the brain, this work proposes a time-continuous and analog in-memory neural differential equation solver for score-based diffusion, employing emerging resistive memory, and achieves remarkable enhancements in generative speed for both unconditional and conditional generation tasks.

Abstract

Human brains image complicated scenes when reading a novel. Replicating this imagination is one of the ultimate goals of AI-Generated Content (AIGC). However, current AIGC methods, such as score-based diffusion, are still deficient in terms of rapidity and efficiency. This deficiency is rooted in the difference between the brain and digital computers. Digital computers have physically separated storage and processing units, resulting in frequent data transfers during iterative calculations, incurring large time and energy overheads. This issue is further intensified by the conversion of inherently continuous and analog generation dynamics, which can be formulated by neural differential equations, into discrete and digital operations. Inspired by the brain, we propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion, employing emerging resistive memory. The integration of storage and computation within resistive memory synapses surmounts the von Neumann bottleneck, benefiting the generative speed and energy efficiency. The closed-loop feedback integrator is time-continuous, analog, and compact, physically implementing an infinite-depth neural network. Moreover, the software-hardware co-design is intrinsically robust to analog noise. We experimentally validate our solution with 180 nm resistive memory in-memory computing macros. Demonstrating equivalent generative quality to the software baseline, our system achieved remarkable enhancements in generative speed for both unconditional and conditional generation tasks, by factors of 64.8 and 156.5, respectively. Moreover, it accomplished reductions in energy consumption by factors of 5.2 and 4.1. Our approach heralds a new horizon for hardware solutions in edge computing for generative AI applications.
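
The closed-loop idea in the abstract (network output is integrated, and the integrator state is fed back as the network input) can be illustrated with a minimal digital sketch. Everything named here is an assumption for illustration: `toy_vector_field` is a hand-written field that pulls 2-D points toward the unit circle, not the trained score network, and `closed_loop_solve` uses forward-Euler steps, which is precisely the kind of discretization the analog hardware is described as avoiding.

```python
import numpy as np

def toy_vector_field(x):
    """Hypothetical stand-in for the trained network: radial pull toward the unit circle."""
    r = np.linalg.norm(x, axis=1, keepdims=True)
    r = np.maximum(r, 1e-9)  # guard against division by zero at the origin
    return (1.0 - r) * x / r

def closed_loop_solve(x0, field, t_end=5.0, n_steps=500):
    """Forward-Euler emulation of the loop: field output -> integrator -> field input."""
    x = x0.copy()
    dt = t_end / n_steps
    for _ in range(n_steps):
        x = x + dt * field(x)  # integrator accumulates the field output
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal((1000, 2))  # initial state drawn from a 2-D Gaussian
x_final = closed_loop_solve(x0, toy_vector_field)
radii = np.linalg.norm(x_final, axis=1)
```

Under this toy field each radius obeys dr/dt = 1 - r, so the samples relax exponentially onto the circle. A physical integrator would evolve the same state continuously in time rather than in `n_steps` discrete updates.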

Paper Structure (21 sections, 10 equations, 5 figures)


Figures (5)

  • Figure 1: Comparison of imagination model, computing architecture, and signal representation between the human brain, digital computer and our system. a-b, Imagination models. a, Human imagination. Human beings possess the innate ability to imagine (e.g. blue dove with olive branch) after learning relevant knowledge and experiences. b, Generative diffusion model. Both the digital computer and our system sample from a standard Gaussian distribution and transform the samples into samples from the target distribution through progressive transformation by a neural differential equation. c-e, Comparison of computing architectures. c, Biological neural network of the brain. Synapses store synaptic strength and modulate signal transmission between adjacent neurons according to the strength. d, Von Neumann architecture. Conventional digital computers feature physically separated storage and computing units, which suffer from high power consumption and low speed due to data shuttling. e, Resistive memory-based in-memory computing. Inspired by the brain, resistive memory cells emulate synaptic functionality by concurrently storing and processing information in an energy-efficient manner. f-h, Comparison of signal representations. f, Human brainwaves. The signals in the brain are time-continuous and analog. g, Signals of digital computers. Discretization and digitization introduce truncation and round-off errors. h, Signals of our system. Like the brain, our system operates with fully time-continuous and analog signals.
  • Figure 2: Resistive memory characterization and circuit design of neural differential equation solver. a-g, Physical and electrical characterization of resistive memory. a, Optical photos of the resistive memory array, cross-sectional view of a single 1-transistor-1-resistor cell (scale bar: 1000 nm), $\mathrm{TiN}/\mathrm{TaO}_x/\mathrm{Ta}_2\mathrm{O}_5/\mathrm{TiN}$ resistive memory cell (scale bar: 200 nm) and the series 180 nm transistor (scale bar: 500 nm). b, Line profile of a single resistive memory cell. c, Quasi-static I-V sweeps of a resistive memory cell, showing repeatable bipolar resistive switching behavior. d, More than 64 discernible linear conductance states of a resistive memory cell. e, More than $10^{6}$ s retention of different conductance states of resistive memory cells. f, 32$\times$32 resistive memory array programmed to display a moon and star conductance pattern. g, Conductance error distribution of the resistive memory arrays at different times. h-k, Circuit blocks of the neural differential equation solver. h, Diagram of a single-layer analog neural network, where resistive memory cells form differential pairs with row-shared negative weights circuit. Input voltages are applied to the BLs of a resistive memory array, with resultant output currents traversing through SLs. The activation is physically implemented by and inverting amplifiers. i, The analog neural network cascades three single-layer analog neural network modules. j, The output of the analog neural network is received by the feedback integrator circuit, consisting of an analog integrator, which in turn provides feedback to the input of the analog neural network, forming a closed-loop circuit to solve neural differential equations. k, In latent diffusion, the solution provided by the neural differential equation solver is translated into pixel space by a resistive memory-based deconvolution decoder.
  • Figure 3: Experimental demonstration of unconditional circular distribution generation. a, Experimental voltage waveforms of the analog neural network during a single sampling. b, Histogram of the offline optimized analog neural network weights and the experimental weights programmed into the resistive memory arrays. c, Histogram of the input voltages for each layer of the analog neural network. d, Schematic illustration of the two-dimensional gradient vector field output of the analog neural network. The axes represent the input voltages to the neural network, while the arrows depict the output voltage vectors from the analog neural network, with the isopotential curves denoted in red. e, Time slices of the two-dimensional distribution of the analog neural network input vector over 1000 sampling runs, along with waveforms showing example trajectories over the sampling time. Initial voltage vectors, drawn from a two-dimensional Gaussian distribution, evolve to achieve the intended circular distribution after a predefined resolution period. f, Comparison of the generation speed of our co-design with that of state-of-the-art digital hardware, showing a 64.8x improvement at the same generation quality. g, Comparison of the energy consumption of the analog system and state-of-the-art digital hardware, showing an 80.8% reduction.
  • Figure 4: Experimental demonstration of conditional generation of handwritten letters using latent diffusion. a, Software framework. An outer , trained on the dataset, encodes images into a two-dimensional latent space. After generation within the latent space via conditional score-based diffusion, the model's decoder transforms the latent vectors back into images. b, Condition embedding. Category labels are one-hot encoded and then transformed using random projection. The output, matching the intermediate layer dimension of the neural network, is summed with a time-encoded signal and fed into the analog neural network to guide the generation. c, Example feature maps of the decoder in mapping latent vectors back to pixel images. d, Experimental distribution of three categories of handwritten letters within the latent space, with 500 sampling runs per category. e, Time evolution of the three conditional distributions in the two-dimensional latent space. f, Experimental voltage waveforms showing different diffusion trajectories from the same initial latent voltage vector under different conditions, which are subsequently decoded into handwritten letters of the corresponding categories. g, Comparison of the generation speed between our co-design and state-of-the-art digital hardware. h, Comparison of the energy consumption between our co-design and state-of-the-art digital hardware.
  • Figure 5: Resistive memory noises and their impact on generation performance. a, Physical mechanisms of write and read noise in resistive memory. b, Experimental write noise in resistive memory programming. c, Experimental read noise over different time scales of resistive memory. d, Illustration of noise injection in score-based diffusion using . e, Impact of various degrees of read noise and write noise on the generation quality. f, Respective impacts of various levels of write and read noise on the generation quality using and score-based diffusion.
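
The single-layer network in Figure 2h (voltages applied to the array, currents summed on the output lines, signed weights formed by differential pairs) can be sketched numerically as follows. This is a rough model under stated assumptions, not the authors' circuit: the conductance window `G_MIN`/`G_MAX`, the use of exactly 64 evenly spaced levels, and the modelling of the row-shared negative-weight circuit as one ideal mid-range reference conductance `g_ref` are all hypothetical choices made for illustration.

```python
import numpy as np

G_MIN, G_MAX = 10e-6, 100e-6  # assumed conductance window in siemens (illustrative)
N_LEVELS = 64                 # assumed number of programmable levels

def weights_to_conductances(w):
    """Map signed weights to quantized conductances around a shared reference."""
    g_ref = 0.5 * (G_MIN + G_MAX)                       # assumed ideal shared reference
    scale = 0.5 * (G_MAX - G_MIN) / np.max(np.abs(w))   # siemens per unit weight
    levels = np.linspace(G_MIN, G_MAX, N_LEVELS)
    g_target = g_ref + scale * w
    # Snap each target conductance to the nearest programmable level.
    idx = np.abs(g_target[..., None] - levels).argmin(axis=-1)
    return levels[idx], g_ref, scale

def crossbar_matvec(v, g, g_ref):
    """Ohm's law per cell, Kirchhoff summation per output line, minus the reference current."""
    i_signal = v @ g          # summed currents on each output line
    i_ref = v.sum() * g_ref   # current drawn through the shared reference
    return i_signal - i_ref

rng = np.random.default_rng(0)
w = rng.uniform(-1, 1, size=(4, 3))   # signed software weights
v = rng.uniform(-0.2, 0.2, size=4)    # input voltages
g, g_ref, scale = weights_to_conductances(w)
y_analog = crossbar_matvec(v, g, g_ref) / scale  # rescale currents back to weight units
y_ideal = v @ w
```

Comparing `y_analog` with `y_ideal` shows the quantization error introduced by the finite number of levels; read and write noise of the kind characterized in Figure 5 is not modelled here.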