Memristive tabular variational autoencoder for compression of analog data in high energy physics

Rajat Gupta, Yuvaraj Elangovan, Tae Min Hong, James Ignowski, John Moon, Aishwarya Natarajan, Stephen Roche, Luca Buonanno

TL;DR

An edge-AI implementation that compresses data on an in-memory analog content-addressable memory (ACAM) device, simulated with the open-source Structural Simulation Toolkit (SST), achieves a latency of 24 ns and a throughput of 330M compressions per second.

Abstract

We present an implementation of edge AI to compress data on an in-memory analog content-addressable memory (ACAM) device. A variational autoencoder is trained on a simulated sample of energy measurements from high-energy electrons incident on a generic three-layer scintillator-based calorimeter. The encoder is distilled into tabular format by regressing the latent-space variables with decision trees, and the resulting table is programmed onto a memristor-based ACAM. In real time, the ACAM compresses 48 continuously valued incoming energies measured by the calorimeter sensors into the latent space, a compression factor of 12x, and the latent values are transmitted off-detector for decompression. The ACAM performance, simulated with the open-source Structural Simulation Toolkit (SST), is a latency of 24 ns, a throughput of 330M compressions per second (i.e., 3 ns between successive inputs), and an average energy consumption of 4.1 nJ per compression.
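
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions that the abstract does not state: that the 12x compression factor counts values (so 48 inputs map to 4 latent variables), that the device runs continuously at full throughput, and that the ratio of latency to input interval approximates the number of compressions in flight. All variable names are hypothetical.

```python
# Figures quoted in the abstract.
N_INPUTS = 48                      # calorimeter energies per compression
COMPRESSION_FACTOR = 12            # quoted as 12x
LATENCY_NS = 24.0                  # latency of one compression
THROUGHPUT_PER_S = 330e6           # compressions per second
ENERGY_PER_COMPRESSION_NJ = 4.1    # average energy per compression

# Assumption: the compression factor counts values, not bits.
latent_dim = N_INPUTS // COMPRESSION_FACTOR

# Time between successive inputs implied by the throughput.
interval_ns = 1e9 / THROUGHPUT_PER_S

# Assumption: a fully pipelined device, so latency / interval
# approximates how many compressions are in flight at once.
in_flight = LATENCY_NS / interval_ns

# Assumption: continuous operation at full throughput.
avg_power_w = ENERGY_PER_COMPRESSION_NJ * 1e-9 * THROUGHPUT_PER_S

print(f"latent variables:       {latent_dim}")
print(f"input interval:         {interval_ns:.2f} ns")
print(f"compressions in flight: {in_flight:.2f}")
print(f"average power:          {avg_power_w:.3f} W")
```

Under these assumptions the input interval comes out at about 3 ns, consistent with the abstract, and the average power at full rate is the product of the quoted energy and throughput.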

Paper Structure

This paper contains 21 sections, 6 equations, 11 figures, and 2 tables.

Figures (11)

  • Figure 1: Schematic of an autoencoder for data compression that is distilled into tabular format. Top: The dataflow starts with an incident electron, here with $83\,$GeV of energy, traversing a three-layer calorimeter. The energy deposits are projected onto the transverse planes and then simplified by grouping the energies of nearby sensor elements; the grouped energies serve as input to the tabular AE. Bottom: Close-up of a memristor-based analog content-addressable memory, showing the crossbar structure in which the input data ($\mathbf{x}$) cross the match lines, which are read out into static RAM. A further close-up of each crossbar cell shows the memristor circuit architecture that produces a binary output. The latent data is transmitted and decompressed.
  • Figure 1: Residual distributions of the regressed latent variables for the scenario presented in the paper. Shown are the per-component differences between VAE and BDT. The vertical dashed line is at zero.
  • Figure 2: Physics observables before and after compression for the original electromagnetic showers (grey), the VAE encoder ($z=\mu$, gold), and the BDT-regressed encoder ($z=\hat{\mu}$, blue line). See text.
  • Figure 2: Correlation matrix between the VAE and BDT for the scenario presented in the paper. Pearson correlation coefficients are shown in each box. The diagonal values for VAE-BDT are all above $0.9$; the off-diagonal values for VAE-BDT are similar to the corresponding values for VAE-VAE and BDT-BDT.
  • Figure 3: Latency (top-left), throughput (top-right), total area (bottom-left), and energy per compression (bottom-right) of the encoding part simulated on the ACAM-based architecture. The computation latency of about 10 ns is due to the ACAM alone. See Discussion for the comparison to FPGA in the bottom-right plot.
  • ...and 6 more figures
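
The tabular encoder shown schematically in Figure 1 can be illustrated with a toy example. The sketch below is not the paper's mapping. It assumes the commonly described scheme in which each root-to-leaf path of a decision tree becomes one ACAM row of per-feature [lower, upper) ranges, and a row matches when every input falls inside its range. The tree, the thresholds, and the function names `tree_to_rows` and `acam_lookup` are hypothetical.

```python
import math

def tree_to_rows(node, bounds=None):
    """Flatten a decision tree into (bounds, leaf_value) rows.

    bounds maps a feature index to a half-open range [lo, hi).
    Features not mentioned in a row are "don't care".
    """
    bounds = bounds or {}
    if "value" in node:                       # leaf: emit one row
        return [(bounds, node["value"])]
    f, t = node["feature"], node["threshold"]
    lo, hi = bounds.get(f, (-math.inf, math.inf))
    left = {**bounds, f: (lo, min(hi, t))}    # branch taken when x[f] < t
    right = {**bounds, f: (max(lo, t), hi)}   # branch taken when x[f] >= t
    return tree_to_rows(node["left"], left) + tree_to_rows(node["right"], right)

def acam_lookup(rows, x):
    """Return the value of the row whose ranges all contain x."""
    for bounds, value in rows:
        if all(lo <= x[f] < hi for f, (lo, hi) in bounds.items()):
            return value
    raise ValueError("no row matched")

# Hypothetical two-feature tree regressing one latent variable.
tree = {
    "feature": 0, "threshold": 5.0,
    "left": {"value": -1.0},
    "right": {
        "feature": 1, "threshold": 2.0,
        "left": {"value": 0.5},
        "right": {"value": 1.5},
    },
}

rows = tree_to_rows(tree)
print(len(rows))                      # one row per leaf
print(acam_lookup(rows, [3.0, 9.0]))  # left branch at the root
print(acam_lookup(rows, [7.0, 1.0]))  # right, then left
```

In hardware the rows would be compared in parallel rather than scanned in a loop; the loop here only emulates the match logic in software.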