Hyperdimensional computing: a fast, robust and interpretable paradigm for biological data
Michiel Stock, Dimitri Boeckaerts, Pieter Dewulf, Steff Taelman, Maxime Van Haeverbeke, Wim Van Criekinge, Bernard De Baets
TL;DR
This paper addresses two limitations of deep learning in bioinformatics, data hunger and limited interpretability, by advocating hyperdimensional computing (HDC) as a fast, interpretable alternative. It introduces hypervectors and a small set of operations (generating, bundling, binding, permutation) to encode and manipulate complex biological concepts, surveys encoding strategies for sequences, graphs, and omics data, and outlines learning workflows. The authors identify four major opportunities for bioinformatics: fast, efficient computation; explainability through reversible operations; seamless multimodal data fusion; and symbolic, hierarchical representations that support structured reasoning, with potential applications in phylogeny and genetic engineering. They argue that HDC can complement deep learning by enabling scalable, explainable analyses across diverse data types, and they point to hardware-aware implementations and hybrid neuro-symbolic models as promising directions. Overall, HDC is presented as a lightweight, versatile paradigm that can augment current deep learning approaches in omics, biosignal, and health applications.
Abstract
Advances in bioinformatics are primarily due to new algorithms for processing diverse biological data sources. While sophisticated alignment algorithms have been pivotal in analyzing biological sequences, deep learning has substantially transformed bioinformatics, addressing sequence, structure, and functional analyses. However, these methods are data-hungry, compute-intensive, and hard to interpret. Hyperdimensional computing (HDC) has recently emerged as an intriguing alternative. The key idea is that random vectors of high dimensionality can represent concepts such as sequence identity or phylogeny. These vectors can then be combined using simple operators for learning, reasoning, or querying by exploiting the peculiar properties of high-dimensional spaces. Our work reviews and explores the potential of HDC for bioinformatics, emphasizing its efficiency, interpretability, and adeptness in handling multimodal and structured data. HDC holds considerable potential for searching omics data, analyzing biosignals, and supporting health applications.
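
To make the key idea concrete, the following is a minimal sketch of random hypervectors and the basic operators (binding, bundling, permutation) applied to short DNA strings. It is an illustration under stated assumptions, not the authors' implementation: the bipolar {-1, +1} representation, the dimensionality D = 10,000, the k-mer length of 5, the use of a cyclic shift to mark position, and all function names and example sequences are choices made here for demonstration only.

```python
import numpy as np

D = 10_000  # assumed dimensionality; any sufficiently large value behaves similarly
rng = np.random.default_rng(0)

def random_hv():
    """Generate a random bipolar hypervector with entries in {-1, +1}."""
    return rng.choice(np.array([-1, 1]), size=D)

def bind(a, b):
    """Binding: elementwise product. The result is dissimilar to both inputs."""
    return a * b

def bundle(*hvs):
    """Bundling: elementwise majority vote (ties become 0). The result stays
    similar to each of its inputs."""
    return np.sign(np.sum(hvs, axis=0))

def permute(a, shift=1):
    """Permutation: cyclic shift, used here to encode position."""
    return np.roll(a, shift)

def similarity(a, b):
    """Cosine similarity between two hypervectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# One random hypervector per nucleotide (hypothetical item memory).
NUCLEOTIDES = {c: random_hv() for c in "ACGT"}

def encode_kmer(kmer):
    """Bind position-shifted nucleotide hypervectors into one k-mer hypervector."""
    hv = np.ones(D, dtype=int)
    for i, c in enumerate(kmer):
        hv = bind(hv, permute(NUCLEOTIDES[c], i))
    return hv

def encode_sequence(seq, k=5):
    """Bundle the hypervectors of all overlapping k-mers of a sequence."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return bundle(*[encode_kmer(km) for km in kmers])

# Basic properties of the operators.
a, b, c = random_hv(), random_hv(), random_hv()
print("random pair:      ", similarity(a, b))                # close to 0
print("bundle vs. member:", similarity(bundle(a, b, c), a))  # clearly above 0
print("unbinding exact:  ", np.array_equal(bind(bind(a, b), b), a))

# Made-up example sequences: a reference, a single-substitution mutant,
# and an unrelated sequence of the same length.
reference = "ACGTTGCAAGCTTAGGCTAACGTTAGCCATGA"
mutant = reference[:15] + "T" + reference[16:]
unrelated = "TGGACCTTGAACCGTATCGGATACTTCGGAAC"

sim_mutant = similarity(encode_sequence(reference), encode_sequence(mutant))
sim_unrelated = similarity(encode_sequence(reference), encode_sequence(unrelated))
print("reference vs. mutant:   ", sim_mutant)
print("reference vs. unrelated:", sim_unrelated)
```

In this sketch, binding a bipolar hypervector twice with the same partner recovers the original exactly, which is one way to read the reversibility that underlies the interpretability claim. The mutant stays much closer to the reference than the unrelated sequence does because the two share most of their k-mers.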
