Exploring State Space and Reasoning by Elimination in Tsetlin Machines
Ahmed K. Kadhim, Ole-Christoffer Granmo, Lei Jiao, Rishad Shafik
TL;DR
This work uses the Tsetlin Machine Auto-Encoder (TM-AE) to produce dense, context-aware embeddings for individual words, training in token-centric rounds over a doubled literal space (each feature plus its negation). It applies Reasoning by Elimination (RbE) through feature negations and tunes the hyperparameters $s$ (specificity) and $T$ (voting margin) to control how features are distributed in the state space and how clauses form, yielding more informative per-word representations. Empirical results on artificial data, IMDB, and 20 Newsgroups show robust accuracy, reaching 90.62\% on IMDB, and indicate that careful RbE configuration improves predictive performance. The findings point to an interpretable, logic-based alternative to conventional embeddings, with potential for better context capture and explainability in NLP tasks.
Abstract
The Tsetlin Machine (TM) has gained significant attention in Machine Learning (ML). By building on logical foundations, it facilitates pattern learning and representation, offering an alternative approach to developing comprehensible Artificial Intelligence (AI), with a specific focus on pattern classification in the form of conjunctive clauses. In the domain of Natural Language Processing (NLP), the TM is utilised to construct word embeddings and to describe target words using clauses. To enhance the descriptive capacity of these clauses, we study the concept of Reasoning by Elimination (RbE) in clause formulation, which involves incorporating feature negations to provide a more comprehensive representation. In more detail, this paper employs the Tsetlin Machine Auto-Encoder (TM-AE) architecture to generate dense word vectors, aiming to capture contextual information by extracting feature-dense vectors for a given vocabulary. Thereafter, the principle of RbE is explored to improve descriptivity and optimise the performance of the TM. Specifically, the specificity parameter $s$ and the voting margin parameter $T$ are leveraged to regulate feature distribution in the state space, resulting in a dense representation of information for each clause. In addition, we investigate the state spaces of the TM-AE, especially for the forgotten/excluded features. Empirical investigations on artificially generated data, the IMDB dataset, and the 20 Newsgroups dataset showcase the robustness of the TM, with accuracy reaching 90.62\% on the IMDB dataset.
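To make the notion of a conjunctive clause with feature negations concrete, the following is a minimal illustrative sketch, not the TM-AE implementation. The vocabulary, the clause, and the helper names `literals` and `clause_output` are all hypothetical. It assumes a binary bag-of-words input and a literal vector that concatenates each feature with its negation; a clause is then an AND over a chosen subset of those literals.

```python
from typing import Sequence


def literals(x: Sequence[int]) -> list[int]:
    """Doubled literal space: the original features followed by their negations."""
    return list(x) + [1 - v for v in x]


def clause_output(x: Sequence[int], included: Sequence[int]) -> int:
    """AND over the included literal indices (1 if all included literals are 1)."""
    lits = literals(x)
    return int(all(lits[i] == 1 for i in included))


# Hypothetical four-word vocabulary; index i is the word, index n + i its negation.
vocab = ["good", "great", "bad", "boring"]
n = len(vocab)

# Illustrative clause: good AND NOT bad AND NOT boring.
mixed_clause = [0, n + 2, n + 3]

# Illustrative elimination-style clause built only from negations: NOT bad AND NOT boring.
negation_only_clause = [n + 2, n + 3]

doc_a = [1, 0, 0, 0]  # contains "good" only
doc_b = [1, 0, 1, 0]  # contains "good" and "bad"
doc_c = [0, 0, 0, 0]  # contains none of the vocabulary words

print(clause_output(doc_a, mixed_clause))
print(clause_output(doc_b, mixed_clause))
print(clause_output(doc_c, negation_only_clause))
```

The negation-only clause fires on any input that lacks the excluded words, which is the sense in which negated features describe a target by ruling things out rather than by listing what is present.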
