Dynamic Symmetric Point Tracking: Tackling Non-ideal Reference in Analog In-memory Training

Quan Xiao, Jindan Li, Zhaoxian Wu, Tayfun Gokmen, Tianyi Chen

TL;DR

This work presents the first theoretical characterization of the pulse complexity of SP calibration and of the resulting estimation error, proposes a dynamic SP estimation method that tracks the SP during model training, and establishes convergence guarantees for that method.

Abstract

Analog in-memory computing (AIMC) performs computation directly within resistive crossbar arrays, offering an energy-efficient platform to scale large vision and language models. However, non-ideal analog device properties make training on AIMC devices challenging. In particular, their update asymmetry can induce a systematic drift of weight updates towards a device-specific symmetric point (SP), which typically does not align with the optimum of the training objective. To mitigate this bias, most existing works assume the SP is known and pre-calibrate it to zero before training by setting the reference point to the SP. Nevertheless, calibrating AIMC devices requires costly pulse updates, and residual calibration error can directly degrade training accuracy. In this work, we present the first theoretical characterization of the pulse complexity of SP calibration and the resulting estimation error. We further propose a dynamic SP estimation method that tracks the SP during model training, and establish its convergence guarantees. In addition, we develop an enhanced variant based on chopping and filtering techniques from digital signal processing. Numerical experiments demonstrate both the efficiency and effectiveness of the proposed method.
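To make the calibration cost concrete, the following is a minimal sketch of SP calibration by alternating pulses. It is an illustration under stated assumptions, not the paper's algorithm: it assumes a hypothetical linear response model in which an up pulse moves the weight by `dw_min * (1 - gamma * (w - sp))` and a down pulse by `dw_min * (1 + gamma * (w - sp))`, so the two step sizes are equal exactly at `w = sp`. The names `apply_pulse`, `zero_shift`, `pulses_needed`, `gamma`, and all numeric values are illustrative choices that do not come from the source.

```python
def apply_pulse(w, sp, dw_min, gamma, up):
    """One pulse under the assumed linear response model (hypothetical)."""
    offset = w - sp
    if up:
        # Up step shrinks as the weight rises above the SP.
        return w + dw_min * (1.0 - gamma * offset)
    # Down step grows as the weight rises above the SP.
    return w - dw_min * (1.0 + gamma * offset)


def zero_shift(w0, sp, dw_min, gamma, n_pulses):
    """Apply n_pulses alternating up/down pulses and return the final weight."""
    w = w0
    for k in range(n_pulses):
        w = apply_pulse(w, sp, dw_min, gamma, up=(k % 2 == 0))
    return w


def pulses_needed(w0, sp, dw_min, gamma, tol, max_pulses=1_000_000):
    """Count alternating pulses until |w - sp| <= tol (None if never reached)."""
    w = w0
    for k in range(max_pulses):
        if abs(w - sp) <= tol:
            return k
        w = apply_pulse(w, sp, dw_min, gamma, up=(k % 2 == 0))
    return None


SP_TRUE = 0.3  # illustrative ground-truth symmetric point

# Estimation error after N pulses for a fixed step size.
errors = {
    n: abs(zero_shift(-1.0, SP_TRUE, 0.01, 1.0, n) - SP_TRUE)
    for n in (100, 1000, 4000)
}

# Pulses needed to reach the same tolerance with a coarse and a fine step size.
n_coarse = pulses_needed(-1.0, SP_TRUE, dw_min=0.01, gamma=1.0, tol=0.01)
n_fine = pulses_needed(-1.0, SP_TRUE, dw_min=0.001, gamma=1.0, tol=0.01)

print(errors, n_coarse, n_fine)
```

In this toy model the weight contracts towards the SP by a factor of roughly `(1 - dw_min * gamma)` per pulse, and the estimate settles at a residual offset on the order of `dw_min / 2`. A finer step therefore lowers the achievable error but needs more pulses to get there, which is the qualitative trade-off the abstract describes.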

Paper Structure (34 sections, 19 theorems, 119 equations, 5 figures, 6 tables, 4 algorithms)

Key Result

Theorem 2.2

Considering the response functions in Definition 2.1, the iterates produced by the zero-shifting (ZS) algorithm with $N$ pulses satisfy

Figures (5)

  • Figure 1: Trade-off between SP estimation accuracy and pulse cost for ZS algorithm. (a) For each $N$, we obtain per-cell SP estimates on a $512\times512$ array, and compute the mean and standard deviation across all cells. We plot the offsets of these statistics relative to the ground truth. (b) As $\Delta w_{\min}$ decreases, achieving a target accuracy (e.g., $\le 1\%$ relative mean error) needs substantially more pulses.
  • Figure 2: Training loss on MNIST (LeNet-5, TT-v1 [gokmen2020]) using the ground-truth SP and SPs estimated with different numbers of pulses $N$ via the zero-shifting (ZS) algorithm.
  • Figure 3: Chopping and filtering via moving average.
  • Figure 4: (Left) Total pulse cost to reach the target training loss of $0.2$ on LeNet-5 (MNIST) for different numbers of states. Solid bars indicate the number of pulses used by the ZS algorithm, while hatched bars indicate the training cost computed as $\text{epochs}\times\lceil \text{data size}/B\rceil\times \mathrm{BL}$, with batch size $B=64$ and an average update pulse length $\mathrm{BL}=5$. For $2000$ states, ZS ($N=4000$) fails to reach the target loss. (Middle & right) Training loss of E-RIDER and baselines under different reference std/mean on ResNet-18 (CIFAR-100) after 80 epochs.
  • Figure 5: Test accuracy of E-RIDER on MNIST-FCN after 50 epochs under different input chopper probabilities $p$.
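
The training-cost formula in the Figure 4 caption can be evaluated directly. The sketch below assumes the full standard MNIST training split of 60,000 samples as the data size; the caption does not state the data size, so that value and the function name `training_pulse_cost` are illustrative. The defaults $B=64$ and $\mathrm{BL}=5$ are taken from the caption.

```python
import math


def training_pulse_cost(epochs, data_size, batch_size=64, avg_pulse_length=5):
    """Training pulses = epochs * ceil(data_size / B) * BL (Figure 4 caption)."""
    updates_per_epoch = math.ceil(data_size / batch_size)
    return epochs * updates_per_epoch * avg_pulse_length


# Assumption: full standard MNIST training split (not stated in the caption).
MNIST_TRAIN_SIZE = 60_000

cost_one_epoch = training_pulse_cost(epochs=1, data_size=MNIST_TRAIN_SIZE)
cost_ten_epochs = training_pulse_cost(epochs=10, data_size=MNIST_TRAIN_SIZE)

print(cost_one_epoch, cost_ten_epochs)
```

Under these assumptions, one epoch has $\lceil 60000/64 \rceil = 938$ updates, so it costs $938 \times 5 = 4690$ training pulses, which is already comparable to the ZS budget of $N=4000$ pulses named in the caption.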

Theorems & Definitions (34)

  • Definition 1.1: Symmetric point
  • Definition 2.1: Training-friendly response functions
  • Theorem 2.2: Convergence rate of the ZS algorithm
  • Remark 2.3
  • Lemma 3.5
  • Theorem 3.7: Convergence of the DSPT algorithm
  • Remark 3.8
  • Corollary 3.9: Overall pulse complexity
  • Lemma 3.10
  • Lemma 1.1 [wu2025analog]
  • ...and 24 more