HDReason: Algorithm-Hardware Codesign for Hyperdimensional Knowledge Graph Reasoning

Hanning Chen, Yang Ni, Ali Zakeri, Zhuowen Zou, Sanggeon Yun, Fei Wen, Behnam Khaleghi, Narayan Srinivasa, Hugo Latapie, Mohsen Imani

TL;DR

HDReason is an algorithm-hardware co-design for Knowledge Graph Completion (KGC) that uses Hyperdimensional Computing (HDC) to replace heavyweight GCN-based reasoning with efficient, interpretable hypervector operations. Its HDL-based FPGA framework combines a hyperspace encoder, memorization-based neighbor aggregation, and a TransE-like scoring backend, supported by a density-aware CPU scheduler and forward/backward co-optimization. Across large KG datasets, HDReason achieves an average 10.6x speedup and 65x energy efficiency improvement over an NVIDIA RTX 4090 GPU, with accuracy competitive with state-of-the-art GCN training platforms and robust performance under dimension reduction and quantization. The work shows that co-designing HDC algorithms and hardware can deliver end-to-end, energy-efficient KGC on programmable accelerators, supporting scalable reasoning for real-world KG applications.
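
To make the three stages named above concrete (hyperspace encoding, memorization-based neighbor aggregation, TransE-like scoring), here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the random projection with `tanh`, binding by element-wise product, bundling by summation, the L1 distance, and all names and sizes (`encode`, `memorize`, `transe_score`, `d`, `D`) are hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; the paper's actual dimensions are not given here.
d, D = 16, 2048                      # embedding size, hypervector size
num_vertices, num_relations = 5, 2

# Low-dimensional embeddings (random here; trainable in a real model).
vertex_emb = rng.standard_normal((num_vertices, d))
relation_emb = rng.standard_normal((num_relations, d))

# Assumption: the encoder is a fixed random projection followed by tanh.
projection = rng.standard_normal((d, D)) / np.sqrt(d)


def encode(emb):
    """Map embeddings from the original space into hyperspace."""
    return np.tanh(emb @ projection)


H_v = encode(vertex_emb)             # vertex hypervectors, shape (5, D)
H_r = encode(relation_emb)           # relation hypervectors, shape (2, D)

# Toy knowledge graph as (head, relation, tail) triples.
triples = [(0, 0, 1), (0, 1, 2), (3, 0, 1), (4, 1, 2)]


def memorize(H_v, H_r, triples):
    """For each head, bundle (sum) the bound (element-wise product)
    relation and tail hypervectors of its outgoing edges."""
    memory = np.zeros_like(H_v)
    for h, r, t in triples:
        memory[h] += H_r[r] * H_v[t]
    return memory


M = memorize(H_v, H_r, triples)      # one memory hypervector per vertex


def transe_score(head_hv, rel_hv, tail_hv):
    """TransE-style score: negative L1 distance between head + relation and tail."""
    return -np.abs(head_hv + rel_hv - tail_hv).sum(axis=-1)


# Score every vertex as a candidate tail for the query (vertex 0, relation 0, ?).
scores = transe_score(M[0], H_r[0], H_v)
```

In this sketch a higher (less negative) score marks a more plausible tail. Every step is an element-wise operation or a matrix product, which is the kind of regular computation that maps well to hardware.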

Abstract

Recently, many hardware accelerators have been proposed for graph learning applications such as vertex classification and graph classification. However, previous works have paid little attention to Knowledge Graph Completion (KGC), a task that is well known for its significantly higher algorithmic complexity. The state-of-the-art KGC solutions based on graph convolutional networks (GCNs) involve extensive vertex/relation embedding updates and complicated score functions, which are inherently cumbersome to accelerate. As a result, existing accelerator designs are no longer optimal, and a novel algorithm-hardware co-design for KG reasoning is needed. Recently, brain-inspired HyperDimensional Computing (HDC) has been introduced as a promising solution for lightweight machine learning, particularly for graph learning applications. In this paper, we leverage HDC for an intrinsically more efficient and acceleration-friendly KGC algorithm. We also co-design an acceleration framework named HDReason targeting FPGA platforms. On the algorithm level, HDReason achieves a balance between high reasoning accuracy, strong model interpretability, and lower computational complexity. In terms of architecture, HDReason offers reconfigurability, high training throughput, and low energy consumption. Compared with an NVIDIA RTX 4090 GPU, the proposed accelerator achieves an average 10.6x speedup and 65x energy efficiency improvement. In cross-model and cross-platform comparisons, HDReason yields an average 4.2x higher performance and 3.4x better energy efficiency with similar accuracy versus the state-of-the-art FPGA-based GCN training platform.

Paper Structure (26 sections, 15 equations, 11 figures, 6 tables)


Figures (11)

  • Figure 1: (a) HDC encoding example. (b) HDC memorization in graph learning. (c) Vertex neighbor reconstruction. (d) Score function example, TransE.
  • Figure 2: (a) KG example. (b) Overview of HDReason.
  • Figure 3: CPU-FPGA acceleration platform overview.
  • Figure 4: Balanced computation scheduling example. CSR means compressed sparse row and OoO means out of order.
  • Figure 5: The encoder architecture design. The Systolic Array encodes embedding vectors from normal space into hyperspace. The Dispatcher IP dynamically loads encoded hypervectors from off-chip memory into on-chip memory. The Memorization Computing IP simultaneously executes the forward memorization and backward gradient computation.
  • ...and 6 more figures
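
The caption of Figure 4 mentions compressed sparse row (CSR) storage and out-of-order (OoO) scheduling. The sketch below shows how CSR exposes each vertex's neighbor count and how that count could drive a balanced, out-of-order assignment of vertices to compute units. The scheduling policy itself is not described in this summary, so the greedy longest-first heuristic here is a substituted, commonly used technique. The names `degrees` and `balanced_schedule` and the toy graph are hypothetical.

```python
import heapq

# Toy adjacency in CSR form: the neighbors of vertex v are
# col_idx[row_ptr[v]:row_ptr[v + 1]].
row_ptr = [0, 4, 5, 5, 8, 9]
col_idx = [1, 2, 3, 4, 0, 0, 1, 4, 2]


def degrees(row_ptr):
    """Neighbor count per vertex, read directly from the CSR row pointers."""
    return [row_ptr[v + 1] - row_ptr[v] for v in range(len(row_ptr) - 1)]


def balanced_schedule(row_ptr, num_units):
    """Assumed policy (not taken from the paper): greedy longest-first.
    Vertices are issued out of their original order, densest first,
    each to the compute unit with the smallest accumulated load."""
    deg = degrees(row_ptr)
    order = sorted(range(len(deg)), key=lambda v: deg[v], reverse=True)
    heap = [(0, u) for u in range(num_units)]      # (current load, unit id)
    heapq.heapify(heap)
    assignment = {u: [] for u in range(num_units)}
    for v in order:
        load, u = heapq.heappop(heap)              # least-loaded unit
        assignment[u].append(v)
        heapq.heappush(heap, (load + deg[v], u))
    loads = {u: sum(deg[v] for v in vs) for u, vs in assignment.items()}
    return assignment, loads


assignment, loads = balanced_schedule(row_ptr, num_units=2)
```

Assigning vertices in their original order would put the dense vertex 0 and the sparse vertices on units without regard to load, whereas sorting by degree first keeps the per-unit edge counts close to each other.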