Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines

Vojtech Halenka; Ahmed K. Kadhim; Paul F. A. Clarke; Bimal Bhattarai; Rupsa Saha; Ole-Christoffer Granmo; Lei Jiao; Per-Arne Andersen

TL;DR

The paper tackles the challenge of Booleanizing complex data for Tsetlin Machines (TMs) by introducing the Hypervector Tsetlin Machine (HVTM), which operates in a high-dimensional hyperspace using sparse hypervectors. It details a workflow of tokenization, binding, bundling, hyperautomata, and hyperclauses that yield interpretable Yes/No decisions, with learning guided by standard TM feedback. Empirical results across NLP, cheminformatics, and image domains show HVTM achieving higher accuracy and faster learning than the standard TM under various hyperparameters, including improvements on IMDB (up to 89.67% with RbE) and MNIST-like tasks, and competitive results on HIV data. The work highlights the potential of hyperdimensional representations to expand Booleanization strategies, supports interpretability, and suggests hardware acceleration as a path toward scalable HVTM deployment and broader TM applications.

Abstract

Tsetlin machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is not trivial. In this paper, we propose a hypervector (HV) based method for expressing arbitrarily large sets of concepts associated with any input data. Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM. We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting HV-powered TM can achieve significantly higher accuracy and faster learning on well-known benchmarks. Our results open up a new research direction for TMs, namely how to expand and exploit the benefits of operating in hyperspace, including new Booleanization strategies, optimization of TM inference and learning, and new TM applications.
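
To make the tokenization, binding, and bundling steps more concrete, here is a minimal Python sketch of encoding a short text as a sparse Boolean hypervector. It is an illustration under stated assumptions, not the paper's implementation: the constants `HV_SIZE` and `HV_BITS`, the hash-based token mapping, binding by cyclic shift, and bundling by bitwise OR are all hypothetical choices picked because they are common in sparse hyperdimensional encodings.

```python
import hashlib

HV_SIZE = 2048  # hypothetical hypervector dimensionality
HV_BITS = 8     # hypothetical number of active bits per token


def token_hv(token, size=HV_SIZE, bits=HV_BITS):
    """Map a token to a sparse hypervector, stored as its active bit indices.

    Assumption: indices are derived from a hash so the mapping is repeatable.
    """
    indices = set()
    counter = 0
    while len(indices) < bits:
        digest = hashlib.sha256(f"{token}:{counter}".encode()).digest()
        indices.add(int.from_bytes(digest[:8], "big") % size)
        counter += 1
    return frozenset(indices)


def bind(hv, shift, size=HV_SIZE):
    """Bind a hypervector to a position by cyclically shifting its indices.

    Assumption: cyclic shift is one common binding choice, used here for
    illustration only.
    """
    return frozenset((i + shift) % size for i in hv)


def bundle(*hvs):
    """Bundle hypervectors with bitwise OR (set union of active indices)."""
    return frozenset().union(*hvs)


def to_boolean(hv, size=HV_SIZE):
    """Expand the sparse index set into a dense 0/1 feature list."""
    return [1 if i in hv else 0 for i in range(size)]


# Tokenize, bind each word to its position, then bundle into one hypervector.
words = ["good", "movie"]
doc = bundle(*(bind(token_hv(w), pos) for pos, w in enumerate(words)))
features = to_boolean(doc)
```

The resulting `features` list is a fixed-length Boolean vector regardless of how many tokens were bundled, which is the property that would let a Boolean learner such as a TM consume inputs of varying size.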

Paper Structure (15 sections, 4 equations, 18 figures, 1 table)

Introduction
Hypervector Tsetlin Machine
Workflow of Hypervector Tsetlin Machine
Hypervectors
Creation of a hypervector
Explainability
Dimensionality and Storage Limitations
Projection Overlaps and Robustness
Scalability and Computational Complexity
Internal mechanism of HVTM
Empirical Results
Categorization of Natural Language Texts
Classification of Compounds in Cheminformatics
Image Classification
Conclusion

Figures (18)

Figure 1: Tsetlin machine operating in hyperspace
Figure 2: Creation of a Hypervector from 3 different hypervector tokens, bundling Patch with its respective Row and Column
Figure 3: Interpretation of exported clauses after training, with hyperliterals representing rows, columns and different patches
Figure 4: The learning of Tsetlin Machine for a sample of XOR gate.
Figure 5: Sparse HV with varying HVSize for IMDB
...and 13 more figures
