HD-CB: The First Exploration of Hyperdimensional Computing for Contextual Bandits Problems

Marco Angioli, Antonello Rosato, Marcello Barbirotta, Rocco Martino, Francesco Menichelli, Mauro Olivieri

TL;DR

This work introduces Hyperdimensional Contextual Bandits (HD-CB), the first integration of Hyperdimensional Computing with contextual bandit problems. HD-CB represents contexts and actions as high-dimensional hypervectors and replaces ridge regression with parallel vector operations, which speeds up convergence and reduces computational load. Four variants are proposed: HD-CB_EPS and three uncertainty-driven variants with memory- and update-control optimizations. They are validated on synthetic data, off-policy OBP evaluation, and MovieLens-100k. The results show competitive or superior rewards at lower complexity, positioning HD-CB as a scalable, hardware-friendly framework for resource-constrained sequential decision tasks.

Abstract

Hyperdimensional Computing (HDC), also known as Vector Symbolic Architectures, is a computing paradigm that combines the strengths of symbolic reasoning with the efficiency and scalability of distributed connectionist models in artificial intelligence. HDC has recently emerged as a promising alternative for performing learning tasks in resource-constrained environments thanks to its energy and computational efficiency, inherent parallelism, and resilience to noise and hardware faults. This work introduces Hyperdimensional Contextual Bandits (HD-CB), the first exploration of HDC to model and automate sequential decision-making in Contextual Bandits (CB) problems. The proposed approach maps environmental states into a high-dimensional space and represents each action with dedicated hypervectors (HVs). At each iteration, these HVs are used to select the optimal action for the given context and are updated based on the received reward, replacing the computationally expensive ridge regression procedures required by traditional linear CB algorithms with simple, highly parallel vector operations. We propose four HD-CB variants, demonstrating their flexibility in implementing different exploration strategies, as well as techniques to reduce memory overhead and the number of hyperparameters. Extensive simulations on synthetic datasets and a real-world benchmark reveal that HD-CB consistently achieves competitive or superior performance compared to traditional linear CB algorithms, while offering faster convergence, lower computational complexity, improved scalability, and high parallelism.
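The select-then-update loop described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the class name `HDCBEpsGreedy`, bipolar (+1/-1) context hypervectors, a normalized dot product as the estimated payoff, and an update that adds the reward-weighted context hypervector to the chosen action's hypervector.

```python
import random


def random_hv(rng, d):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(d)]


def dot(a, b):
    """Dot product, used here as the similarity measure."""
    return sum(x * y for x, y in zip(a, b))


class HDCBEpsGreedy:
    """Hypothetical epsilon-greedy bandit with one hypervector per action."""

    def __init__(self, n_actions, d, epsilon, rng):
        self.d = d
        self.epsilon = epsilon
        self.rng = rng
        # One accumulator hypervector per action, initially all zeros.
        self.models = [[0.0] * d for _ in range(n_actions)]

    def estimate(self, action, ctx_hv):
        # Assumed payoff estimate: similarity normalized by dimensionality.
        return dot(self.models[action], ctx_hv) / self.d

    def select(self, ctx_hv):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.models))
        scores = [self.estimate(a, ctx_hv) for a in range(len(self.models))]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, action, ctx_hv, reward):
        # Assumed update rule: bundle the reward-weighted context into the
        # hypervector of the action that was played. No matrix inversion.
        model = self.models[action]
        for k, v in enumerate(ctx_hv):
            model[k] += reward * v
```

In use, each round would call `select` on the encoded context, observe the reward, and call `update` for the chosen action only. Both steps are element-wise operations over the hypervector, which is what makes them easy to parallelize.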

Paper Structure

This paper contains 19 sections, 14 equations, 9 figures, 2 tables, 2 algorithms.

Figures (9)

  • Figure 1: Schematic representation of contextual bandits problems. At each iteration, the algorithm observes a context, picks an action, and receives a reward for the chosen action.
  • Figure 2: Schematic representation of the encoding unit. Each feature index of $x$ is represented by a dedicated random base-vector, while the corresponding values are discretized and mapped to level-vectors. The binding creates feature id-value pairs, while bundling produces the final encoded HV, $\mathcal{X}$.
  • Figure 3: Schematic illustration of the fundamental working principle of HD-CB. The collection of encoded hypervectors $\mathcal{X}_{t,a}$ for all actions $a$ at time step $t$ is denoted as $\mathbf{\mathcal{X}_{t}}$, while the set of estimated payoffs $e_{t,a}$ for all actions is represented as $\mathbf{e_{t}}$.
  • Figure 4: Schematic of HD-CB$_{\text{UNC1}}$. Each action $a$ is associated with a confidence vector $\mathcal{B}_a$. The vectors $\mathbf{\mathcal{X}_{t}}$, $\mathbf{e_t}$, $\mathbf{u_t}$, and $\mathbf{p_t}$ represent the context, estimated payoffs, uncertainties, and potential for all actions at time step $t$.
  • Figure 5: Schematic of HD-CB$_{\text{UNC3}}$. Memory overhead is reduced by using only two global HVs and leveraging the mathematical properties of permutation.
  • ...and 4 more figures
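The encoding described in the Figure 2 caption (base-vectors for feature indices, level-vectors for discretized values, binding, then bundling) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's encoder: hypervectors are bipolar, binding is element-wise multiplication, bundling is an element-wise sum followed by a sign threshold, and level-vectors are built by flipping disjoint slices of a random seed vector so that neighbouring levels stay similar. The function names, the `[lo, hi]` feature range, and the number of levels are all hypothetical.

```python
import random


def random_hv(rng, d):
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(d)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bind(a, b):
    """Binding as element-wise multiplication (assumed operator)."""
    return [x * y for x, y in zip(a, b)]


def level_hvs(rng, n_levels, d):
    """Level-vectors: each level flips a fresh slice of positions, so that
    similarity decays with distance and d/2 positions are flipped in total."""
    hv = random_hv(rng, d)
    order = list(range(d))
    rng.shuffle(order)
    step = d // (2 * (n_levels - 1))
    levels = [hv[:]]
    for i in range(1, n_levels):
        hv = hv[:]
        for j in order[(i - 1) * step:i * step]:
            hv[j] = -hv[j]
        levels.append(hv)
    return levels


def encode(x, bases, levels, lo=0.0, hi=1.0):
    """Encode a feature vector x into one bipolar hypervector."""
    n_levels = len(levels)
    acc = [0] * len(bases[0])
    for value, base in zip(x, bases):
        # Discretize the feature value into a level index, clamped to range.
        idx = int((value - lo) / (hi - lo) * n_levels)
        idx = min(n_levels - 1, max(0, idx))
        # Bind the feature id with its level, then bundle by accumulation.
        for k, v in enumerate(bind(base, levels[idx])):
            acc[k] += v
    # Sign threshold back to bipolar; ties are mapped to +1.
    return [1 if v >= 0 else -1 for v in acc]
```

A caller would draw one base-vector per feature with `random_hv`, build a shared set of level-vectors once with `level_hvs`, and then call `encode` on every incoming context. The flip-based level construction is one common choice in the HDC literature; other schemes would fit the same interface.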