Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines

Vojtech Halenka, Ahmed K. Kadhim, Paul F. A. Clarke, Bimal Bhattarai, Rupsa Saha, Ole-Christoffer Granmo, Lei Jiao, Per-Arne Andersen

TL;DR

The paper tackles the challenge of Booleanizing complex data for Tsetlin Machines (TMs) by introducing the Hypervector Tsetlin Machine (HVTM), which operates in a high-dimensional hyperspace using sparse hypervectors. It describes a workflow of tokenization, binding, bundling, hyperautomata, and hyperclauses that yields interpretable Yes/No decisions, with learning guided by standard TM feedback. Empirical results across NLP, cheminformatics, and image domains show HVTM achieving higher accuracy and faster learning than a standard TM under various hyperparameters, with improvements on IMDB (up to 89.67% with RbE) and MNIST-like tasks, and competitive results on HIV data. The work highlights the potential of hyperdimensional representations to expand Booleanization strategies while supporting interpretability, and suggests hardware acceleration as a path toward scalable HVTM deployment and broader TM applications.

Abstract

Tsetlin machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is not trivial. In this paper, we propose a hypervector (HV) based method for expressing arbitrarily large sets of concepts associated with any input data. Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM. We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting HV-powered TM can achieve significantly higher accuracy and faster learning on well-known benchmarks. Our results open up a new research direction for TMs, namely how to expand and exploit the benefits of operating in hyperspace, including new Booleanization strategies, optimization of TM inference and learning, as well as new TM applications.
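
To make the encoding idea concrete, the following is a minimal sketch of how tokens might be turned into a sparse Boolean hypervector that a TM could consume. It is not the paper's implementation. The dimensionality HV_SIZE, the number of active bits HV_BITS, the hash-based token mapping, binding by cyclic shift, and bundling by bitwise OR are all assumptions taken from common hyperdimensional computing practice, and every function name is hypothetical.

```python
import hashlib

HV_SIZE = 2048  # assumed hypervector dimensionality (not from the paper)
HV_BITS = 8     # assumed number of active bits per token (not from the paper)


def token_hv(token, size=HV_SIZE, bits=HV_BITS):
    """Map a token to a sparse hypervector, stored as a set of active indices.

    Indices come from repeated hashing, so the mapping is deterministic.
    """
    active = set()
    counter = 0
    while len(active) < bits:
        digest = hashlib.sha256(f"{token}:{counter}".encode()).digest()
        active.add(int.from_bytes(digest[:8], "big") % size)
        counter += 1
    return frozenset(active)


def bind(hv, shift, size=HV_SIZE):
    """Bind by cyclic shift of the active indices (assumed binding operator)."""
    return frozenset((i + shift) % size for i in hv)


def bundle(*hvs):
    """Bundle by bitwise OR, i.e. the union of active indices (assumed operator)."""
    return frozenset().union(*hvs)


def to_boolean(hv, size=HV_SIZE):
    """Expand the sparse index set into a dense 0/1 list usable as TM input."""
    return [1 if i in hv else 0 for i in range(size)]


# Illustrative tokens: an image patch together with its row and column.
patch = token_hv("patch:0110")
row = token_hv("row:3")
col = token_hv("col:7")

sample = bundle(patch, row, col)
x = to_boolean(sample)

# Binding example: mark a word with a position by shifting its hypervector.
word_at_pos1 = bind(token_hv("good"), 1)
```

Under these assumptions, each bundled sample sets at most 3 * HV_BITS of the HV_SIZE bits, so the input stays sparse however many distinct tokens exist in the vocabulary.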

Paper Structure (15 sections, 4 equations, 18 figures, 1 table)

Figures (18)

  • Figure 1: Tsetlin machine operating in hyperspace
  • Figure 2: Creation of a Hypervector from 3 different hypervector tokens, bundling Patch with its respective Row and Column
  • Figure 3: Interpretation of exported clauses after training, with hyperliterals representing rows, columns and different patches
  • Figure 4: The learning of a Tsetlin Machine for a sample of the XOR gate
  • Figure 5: Sparse HV with varying HVSize for IMDB
  • ...and 13 more figures