Energy-Efficient Sampling Using Stochastic Magnetic Tunnel Junctions

Nicolas Alder, Shivam Nitin Kajale, Milin Tunsiricharoengul, Deblina Sarkar, Ralf Herbrich

TL;DR

This work addresses the energy cost of sampling in probabilistic ML with a hardware-software framework that uses room-temperature stochastic magnetic tunnel junctions (s-MTJs) as a source of true randomness and maps their outputs directly to Float16 bit positions for uniform sampling. It gives a closed-form, bit-level parameterization that realizes Uniform(Float16) sampling without symbolic computation, and extends it with a mixture-of-uniforms representation that supports sampling from arbitrary 1D distributions via convolution and prior-likelihood transforms. The reported energy savings exceed $9721\times$ relative to Mersenne-Twister and $5649\times$ relative to PCG, and the measured approximation errors (KL divergences) remain small for convolution and prior-likelihood operations. The approach targets scalable uncertainty quantification in probabilistic ML and a path toward hardware-accelerated sampling; future work focuses on prototype development and robustness to hardware variations.
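The mixture-of-uniforms idea can be illustrated with a minimal software sketch. It is not the authors' implementation: the helper names `mixture_from_pdf` and `sample_mixture`, the equal-width binning, and the midpoint rule for bin masses are all assumptions made for illustration, and Python's pseudorandom generator stands in for the s-MTJ hardware.

```python
import math
import random


def mixture_from_pdf(pdf, lo, hi, n_bins):
    """Approximate a 1D density on [lo, hi] by non-overlapping uniform components.

    Hypothetical helper: equal-width bins, with each bin's weight taken from
    the midpoint rule and then normalized so the weights sum to 1.
    """
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    masses = [pdf(0.5 * (edges[i] + edges[i + 1])) * width for i in range(n_bins)]
    total = sum(masses)
    weights = [m / total for m in masses]
    return edges, weights


def sample_mixture(edges, weights, rng):
    """Draw one sample: pick a component by weight, then sample uniformly inside it."""
    i = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.uniform(edges[i], edges[i + 1])


def normal_pdf(x):
    """Standard normal density, used only as an example target."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    rng = random.Random(0)
    edges, weights = mixture_from_pdf(normal_pdf, -6.0, 6.0, 200)
    draws = [sample_mixture(edges, weights, rng) for _ in range(20000)]
    print(sum(draws) / len(draws))  # sample mean of the approximated normal
```

Under these assumptions, the only random primitives required are a weighted component choice and a uniform draw, which is why a cheap uniform sampler is the critical building block. The convolution and prior-likelihood operations described in the paper are not covered by this sketch.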

Abstract

(Pseudo)random sampling, a costly yet widely used method in (probabilistic) machine learning and Markov Chain Monte Carlo algorithms, remains unfeasible on a truly large scale due to unmet computational requirements. We introduce an energy-efficient algorithm for uniform Float16 sampling, utilizing a room-temperature stochastic magnetic tunnel junction device to generate truly random floating-point numbers. By avoiding expensive symbolic computation and mapping physical phenomena directly to the statistical properties of the floating-point format and uniform distribution, our approach achieves a higher level of energy efficiency than the state-of-the-art Mersenne-Twister algorithm by a minimum factor of 9721 and an improvement factor of 5649 compared to the more energy-efficient PCG algorithm. Building on this sampling technique and hardware framework, we decompose arbitrary distributions into many non-overlapping approximative uniform distributions along with convolution and prior-likelihood operations, which allows us to sample from any 1D distribution without closed-form solutions. We provide measurements of the potential accumulated approximation errors, demonstrating the effectiveness of our method.
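To make the idea of mapping random bits directly onto Float16 bit positions concrete, here is a minimal sketch for the interval $[0, 1)$. It is an illustration under stated assumptions, not the paper's parameterization: it uses only fair coin flips from a software generator in place of s-MTJ devices, and the function names `bernoulli` and `sample_uniform_float16` are hypothetical. The exponent field is chosen so that each binade is selected with probability proportional to its width, and the 10 mantissa bits are independent fair bits.

```python
import random
import struct


def bernoulli(p, rng):
    """Software stand-in for one stochastic bit that is 1 with probability p."""
    return 1 if rng.random() < p else 0


def sample_uniform_float16(rng):
    """Assemble a Float16 in [0, 1) bit by bit (1 sign, 5 exponent, 10 mantissa bits).

    Assumption: fair bits only. Counting leading zero flips (capped at 14)
    selects the stored exponent 14 - k, so the binade [0.5, 1) has probability
    1/2, [0.25, 0.5) has 1/4, and so on; 14 zeros select the subnormal range.
    """
    k = 0
    while k < 14 and bernoulli(0.5, rng) == 0:
        k += 1
    exponent = 14 - k  # stored exponent field; 0 means subnormal

    mantissa = 0
    for _ in range(10):
        mantissa = (mantissa << 1) | bernoulli(0.5, rng)

    bits = (exponent << 10) | mantissa  # sign bit stays 0
    # Reinterpret the 16-bit pattern as an IEEE 754 half-precision value.
    return struct.unpack("<e", struct.pack("<H", bits))[0]


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [sample_uniform_float16(rng) for _ in range(20000)]
    print(min(samples), max(samples), sum(samples) / len(samples))
```

No floating-point arithmetic is needed to produce a sample in this sketch: the value is defined entirely by which bits are set, which is the property that makes a direct hardware mapping plausible.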

Paper Structure

This paper contains 20 sections, 18 equations, 16 figures, 3 tables, and 1 algorithm.

Figures (16)

  • Figure 1: Hardware setup for sampling one value from a uniform Float16 distribution.
  • Figure 2: Possible Bernoulli resolutions for s-MTJ device with 4 control bits.
  • Figure 3: Physical approximation error comparison for the first three moments of the uniform distribution (s-MTJ-based approach vs. closed-form solution sampling). Second moment standard deviation omitted due to equivalence to the means.
  • Figure 4: (a) Schematic illustration of the self-energy ($E$) of a nanomagnet with respect to the polar angle ($\theta_M$) of its magnetization (indicated by thick arrows). (b) Natural frequency of stochastic switching for a nanomagnet of a particular diameter at different temperatures.
  • Figure 5: Dynamics of the normalized resistance of a stochastic MTJ for different bias current densities. (a) $I_{\text{bias}} = 0$ produces equal probability of observing the high or low state. (b) Histogram of the observed resistance state for $I_{\text{bias}} = 0$. (c, d) Trace and histogram of the observed resistance for a bias current of $2 \times 10^{11}$ A/m$^{2}$. (e, f) Trace and histogram of the observed resistance for a bias current of $-2 \times 10^{11}$ A/m$^{2}$.
  • ...and 11 more figures