Enhancing Biologically Inspired Hierarchical Temporal Memory with Hardware-Accelerated Reflex Memory

Pavia Bera, Sabrina Hassan Moon, Jennifer Adorno, Dayane Alfenas Reis, Sanjukta Bhanja

TL;DR

The paper tackles the real-time processing bottlenecks of Hierarchical Temporal Memory (HTM) on streaming IoT data by introducing Reflex Memory (RM), a lightweight, dictionary-based first-order inference mechanism. Building on RM, the authors present Accelerated HTM (AHTM) and a hardware-accelerated, CAM-enabled variant (H-AHTM) that achieve substantial inference speedups without compromising anomaly-detection accuracy. RM offloads repetitive, low-order patterns from the computationally heavy Sequence Memory (SM), while a Control Unit (CU) dynamically balances RM and SM for robust online learning. Implementing RM in a content-addressable memory (CAM) block based on AFeCAM brings the reported per-event prediction time down to 0.094 s, with speedups of up to 7.55× (AHTM) and 10.10× (H-AHTM) over baseline HTM; results on financial datasets confirm maintained accuracy and improved latency. The work demonstrates a scalable pathway to real-time neuromorphic anomaly detection and forecasting with online adaptation, using hardware-accelerated HTM variants.
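The idea of a dictionary-based first-order memory with a fallback to the slower Sequence Memory can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class `ReflexMemory`, the function `predict_next`, the stand-in `sm_stub`, and the rule "use RM on a hit, otherwise fall back to SM" are all assumptions made here, since this summary does not describe the actual Control Unit policy.

```python
from typing import Callable, Dict, Hashable, Optional, Tuple


class ReflexMemory:
    """Hypothetical first-order memory: maps a pattern to the pattern that followed it."""

    def __init__(self) -> None:
        self.table: Dict[Hashable, Hashable] = {}

    def learn(self, previous: Hashable, current: Hashable) -> None:
        # Store (or overwrite) the most recently seen first-order transition.
        self.table[previous] = current

    def predict(self, current: Hashable) -> Optional[Hashable]:
        # Average-case O(1) lookup; None means the pattern has not been seen yet.
        return self.table.get(current)


def predict_next(
    rm: ReflexMemory,
    current: Hashable,
    slow_predict: Callable[[Hashable], Hashable],
) -> Tuple[Hashable, str]:
    """Assumed control rule: answer from RM on a hit, otherwise ask the slow path."""
    guess = rm.predict(current)
    if guess is not None:
        return guess, "RM"
    return slow_predict(current), "SM"


def sm_stub(pattern: Hashable) -> Hashable:
    # Stand-in for the Sequence Memory; a real SM would do multi-order inference.
    return "unknown"


rm = ReflexMemory()
stream = ["A", "B", "C", "A"]
for prev, cur in zip(stream, stream[1:]):
    rm.learn(prev, cur)  # learns A->B, B->C, C->A

print(predict_next(rm, "B", sm_stub))  # ('C', 'RM')
print(predict_next(rm, "Z", sm_stub))  # ('unknown', 'SM')
```

The demo uses strings as patterns for readability; with sparse binary patterns, a hashable key such as a `frozenset` of active bit indices would play the same role.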

Abstract

The rapid expansion of the Internet of Things (IoT) generates zettabytes of data that demand efficient unsupervised learning systems. Hierarchical Temporal Memory (HTM), a third-generation unsupervised AI algorithm, models the neocortex of the human brain by simulating columns of neurons to process and predict sequences. These neuron columns can memorize and infer sequences across multiple orders. While multiorder inferences offer robust predictive capabilities, they often come with significant computational overhead. The Sequence Memory (SM) component of HTM, which manages these inferences, encounters bottlenecks primarily due to its extensive programmable interconnects. In many cases, first-order temporal relationships have been observed to be sufficient, with no significant loss in efficiency. This paper introduces a Reflex Memory (RM) block, inspired by the working mechanisms of the spinal cord, designed to accelerate the processing of first-order inferences. The RM block performs these inferences significantly faster than the SM. The integration of RM with HTM forms a system called the Accelerated Hierarchical Temporal Memory (AHTM), which processes repetitive information more efficiently than the original HTM while still supporting multiorder inferences. The experimental results demonstrate that the HTM predicts an event in 0.945 s, whereas the AHTM module does so in 0.125 s. Additionally, the hardware implementation of RM in a content-addressable memory (CAM) block, known as Hardware-Accelerated Hierarchical Temporal Memory (H-AHTM), predicts an event in just 0.094 s, significantly improving inference speed. Compared to the original algorithm (bautista2020matlabhtm), AHTM accelerates inference by up to 7.55×, while H-AHTM further enhances performance with a 10.10× speedup.
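As a quick sanity check, the per-event times quoted in the abstract can be turned into speedup ratios. The helper `speedup` and the constant names are introduced here for illustration only. The abstract does not say which measurements the "up to" figures come from, so the ratios below are a rough cross-check rather than a reproduction of the reported 7.55× and 10.10×.

```python
def speedup(baseline_s: float, accelerated_s: float) -> float:
    """Ratio of baseline time to accelerated time (larger means faster)."""
    return baseline_s / accelerated_s


# Per-event prediction times quoted in the abstract, in seconds.
HTM_S = 0.945
AHTM_S = 0.125
H_AHTM_S = 0.094

ahtm_speedup = speedup(HTM_S, AHTM_S)
h_ahtm_speedup = speedup(HTM_S, H_AHTM_S)

print(f"AHTM:   {ahtm_speedup:.2f}x")    # AHTM:   7.56x
print(f"H-AHTM: {h_ahtm_speedup:.2f}x")  # H-AHTM: 10.05x
```

Both ratios land close to the reported maximum speedups.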

Paper Structure

This paper contains 19 sections, 7 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Comparison of biological and HTM neurons, highlighting feedback, feedforward, and contextual connections in both systems.
  • Figure 2: Generic Hierarchical Temporal Memory (HTM) Architecture. The architecture consists of a data flow pipeline starting from a database, which provides input to the Encoder. The Encoder translates raw input into a Symmetrically Encoded Representation (E), capturing cyclic features for processing by the Spatial Pooler (SP). The Control Unit coordinates operations, guiding the output to either the Sequence Memory (SM) for temporal learning or Reflex Memory (RM) for hardware-accelerated first-order computations. This structure enables time-efficient spatio-temporal learning and accurate prediction.
  • Figure 3: The SP transforms the encoded input vector $E$ (blue bits) into a sparse distributed representation (SDR) $S$, which is then processed by the SM. Each bit in the input/encoding space can potentially connect to any minicolumn in the Spatial Pooler. In the visualization, colors indicate different states: green circles represent active bits in the input space that overlap with the encoded input, while grey circles denote inactive bits that fall outside the encoded space. Green synapses represent active connections within the encoded vectors, whereas red synapses indicate connections that are inactive in the current iteration. An accumulator $(\Sigma)$ counts the number of active synapses contributing to the overlap score. If the overlap score exceeds the threshold of 43, the corresponding minicolumn in the SP is activated (ON).
  • Figure 4: Illustration of SM operations. Each mini-column (MC) contains multiple cells, where only one cell becomes active after learning. The diagram shows the transition from the state before learning (with multiple active cells in a column) to the state after learning (with a single predictive cell per column). Predictive cells represent anticipated sequences, while active cells correspond to the current input. This mechanism enables the Sequence Memory to capture temporal patterns and make accurate predictions.
  • Figure 5: Architecture of (a) an AFeCAM subarray, (b) the AFeCAM cell design, and (c) an output register (adapted from moon2024afecam).
  • ...and 5 more figures
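
The overlap test described in the Figure 3 caption can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the list-of-sets layout for `connected`, and the toy input are hypothetical. Only the accumulate-then-threshold step and the threshold value of 43 come from the caption, and other steps that a full Spatial Pooler typically includes (such as synapse permanence updates and inhibition between minicolumns) are left out.

```python
from typing import List, Set

THRESHOLD = 43  # overlap threshold quoted in the Figure 3 caption


def overlap_scores(encoded_input: List[int], connected: List[Set[int]]) -> List[int]:
    """For each minicolumn, count its connected synapses that land on active input bits."""
    return [sum(encoded_input[i] for i in synapses) for synapses in connected]


def active_minicolumns(
    encoded_input: List[int],
    connected: List[Set[int]],
    threshold: int = THRESHOLD,
) -> List[int]:
    """Return indices of minicolumns whose overlap score exceeds the threshold."""
    scores = overlap_scores(encoded_input, connected)
    return [col for col, score in enumerate(scores) if score > threshold]


# Toy encoded input E: 100 bits, the first 50 active.
E = [1] * 50 + [0] * 50

# Hypothetical connectivity: input indices each minicolumn has synapses to.
connected = [
    set(range(0, 60)),    # overlaps 50 active bits: above threshold
    set(range(0, 43)),    # overlaps exactly 43 active bits: does not exceed it
    set(range(40, 100)),  # overlaps 10 active bits: below threshold
]

print(overlap_scores(E, connected))      # [50, 43, 10]
print(active_minicolumns(E, connected))  # [0]
```

The second minicolumn shows the boundary case: a score equal to 43 does not activate it, because the caption says the score must exceed the threshold.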