Hyperdimensional Vector Tsetlin Machines with Applications to Sequence Learning and Generation

Christian D. Blakely

TL;DR

This work introduces a hybrid Hyperdimensional Vector Tsetlin Machine (HVTM) architecture that combines hyperdimensional vector computing (HVC) with Tsetlin machines to learn, classify, forecast, and generate sequences. By encoding sequences into high-dimensional vectors using interval embeddings and N-Grams, and by leveraging associative memory for rapid retrieval, the approach achieves competitive time-series classification on the UCR Archive while retaining a lightweight memory footprint. The paper also demonstrates forecasting and generation with a two-step HV-based pipeline and reports numerical experiments on harmonic and stochastic sequences that highlight the benefits of 5-Gram encoding. Overall, the HVTM framework offers a fast, interpretable, and memory-efficient alternative to deep learning sequence models, well suited to online and embedded settings.
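The encoding idea summarized above (interval embedding, N-Gram binding, and associative-memory lookup) can be illustrated with a minimal sketch. This is not the paper's implementation: the binary hypervectors, XOR binding, cyclic-shift permutation, majority bundling, the constants `D`, `Q`, `N`, and every function name are assumptions chosen for illustration, and the Tsetlin machine layer is omitted entirely.

```python
import math
import random

D = 2048  # hypervector dimension (assumed)
Q = 16    # number of quantization intervals (assumed)
N = 3     # N-Gram size (assumed; the summary highlights 5-Grams)

def level_hypervectors(q, d, rng):
    """Build q binary hypervectors in which nearby levels share most bits."""
    base = [rng.randint(0, 1) for _ in range(d)]
    order = list(range(d))
    rng.shuffle(order)
    step = d // (2 * (q - 1))  # bits flipped between adjacent levels
    levels = [base]
    for k in range(1, q):
        hv = levels[-1][:]
        for idx in order[(k - 1) * step : k * step]:
            hv[idx] ^= 1
        levels.append(hv)
    return levels

def quantize(x, lo, hi, q):
    """Map a real value in [lo, hi] to an interval index in 0..q-1."""
    x = min(max(x, lo), hi)
    return min(int((x - lo) / (hi - lo) * q), q - 1)

def permute(hv, shift):
    """Cyclic shift, used here to mark the position inside an N-Gram."""
    shift %= len(hv)
    return hv[-shift:] + hv[:-shift] if shift else hv[:]

def bind(a, b):
    """Elementwise XOR; binding twice with the same vector undoes it."""
    return [x ^ y for x, y in zip(a, b)]

def bundle(hvs):
    """Bitwise majority vote over a list of hypervectors (ties go to 0)."""
    half = len(hvs) / 2
    return [1 if sum(bits) > half else 0 for bits in zip(*hvs)]

def encode_sequence(seq, levels, lo, hi, n=N):
    """Bundle the N-Gram hypervectors of every window of length n."""
    grams = []
    for t in range(len(seq) - n + 1):
        gram = [0] * len(levels[0])
        for i in range(n):
            hv = levels[quantize(seq[t + i], lo, hi, len(levels))]
            gram = bind(gram, permute(hv, i))
        grams.append(gram)
    return bundle(grams)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def classify(seq, memory, levels, lo=-1.0, hi=1.0):
    """Return the label of the stored prototype closest to the query."""
    query = encode_sequence(seq, levels, lo, hi)
    return min(memory, key=lambda label: hamming(query, memory[label]))

# Toy associative memory with one prototype per class (illustrative data).
rng = random.Random(0)
levels = level_hypervectors(Q, D, rng)
sine = [math.sin(2 * math.pi * t / 32) for t in range(128)]
ramp = [-1.0 + 2.0 * (t % 32) / 31 for t in range(128)]
memory = {
    "sine": encode_sequence(sine, levels, -1.0, 1.0),
    "ramp": encode_sequence(ramp, levels, -1.0, 1.0),
}
print(classify(sine, memory, levels))
```

In this sketch the level vectors are built by flipping a disjoint slice of bits per step, so values in neighboring intervals map to similar hypervectors while distant intervals approach orthogonality. In the HVTM setting described in the summary, the encoded hypervector would then serve as input to the Tsetlin machine rather than being compared directly against prototypes.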

Abstract

We construct a two-layered model for learning and generating sequential data that is both computationally fast and competitive with vanilla Tsetlin machines, while adding numerous advantages. Using hyperdimensional vector computing (HVC) algebras and Tsetlin machine clause structures, we demonstrate that the combination inherits the generality of HVC data encoding and decoding together with the fast, interpretable nature of Tsetlin machines, yielding a powerful machine learning model. We apply the approach in two areas: forecasting and generating new sequences, and classification. For the latter, we derive results for the entire UCR Time Series Archive and compare them with the standard benchmarks to see how well the method competes in time series classification.

Paper Structure (18 sections, 4 equations, 13 figures, 1 table, 2 algorithms)

Figures (13)

  • Figure 1: The procedure for encoding a sequence into a hypervector.
  • Figure 2: A collection of clauses and their relationship with the weights learned during TM training. Each clause contains a set of literals that are also learned during feedback.
  • Figure 3: The input to the TM is the N-Gram encoding of the latest $N$ values of the sequence. This input is used to predict the next value with the $Q$-class TM.
  • Figure 4: Scatter plot of the accuracy on each data set versus the benchmark. The X-axis is the benchmark accuracy and the Y-axis is the HVTM accuracy.
  • Figure 5: Mean accuracy and (out)performance results grouped into four categories: data type, length of time series
  • ...and 8 more figures