Are Hypervectors Enough? Single-Call LLM Reasoning over Knowledge Graphs

Yezi Liu, William Youngwoo Chung, Hanning Chen, Calvin Yeung, Mohsen Imani

TL;DR

The paper tackles inefficiencies in knowledge-graph question answering (KGQA) pipelines that rely on heavy neural encoders or numerous LLM calls for path scoring. It introduces PathHD, an encoder-free KG reasoning framework that encodes relation paths as Generalized Holographic Reduced Representation (GHRR) hypervectors, retrieves a small Top-K set via simple vector operations, and makes a single one-shot LLM adjudication over the shortlisted paths. Across WebQSP, CWQ, and GrailQA, PathHD yields competitive Hits@1 and F1 while substantially reducing end-to-end latency and GPU memory, and it provides path-grounded rationales for interpretability. The results demonstrate that carefully designed hyperdimensional representations can serve as an efficient, scalable substrate for KG-LLM reasoning with favorable accuracy-efficiency-interpretability trade-offs, and point to broader applicability in domain-specific graphs and real-time decision support.

Abstract

Recent advances in large language models (LLMs) have enabled strong reasoning over both structured and unstructured knowledge. When grounded on knowledge graphs (KGs), however, prevailing pipelines rely on heavy neural encoders to embed and score symbolic paths or on repeated LLM calls to rank candidates, leading to high latency, GPU cost, and opaque decisions that hinder faithful, scalable deployment. We propose PathHD, a lightweight and encoder-free KG reasoning framework that replaces neural path scoring with hyperdimensional computing (HDC) and uses only a single LLM call per query. PathHD encodes relation paths into block-diagonal GHRR hypervectors, ranks candidates with blockwise cosine similarity and Top-K pruning, and then performs a one-shot LLM adjudication to produce the final answer together with cited supporting paths. Technically, PathHD is built on three ingredients: (i) an order-aware, non-commutative binding operator for path composition, (ii) a calibrated similarity for robust hypervector-based retrieval, and (iii) a one-shot adjudication step that preserves interpretability while eliminating per-path LLM scoring. On WebQSP, CWQ, and the GrailQA split, PathHD (i) attains comparable or better Hits@1 than strong neural baselines while using one LLM call per query; (ii) reduces end-to-end latency by $40$–$60\%$ and GPU memory by $3$–$5\times$ thanks to encoder-free retrieval; and (iii) delivers faithful, path-grounded rationales that improve error diagnosis and controllability. These results indicate that carefully designed HDC representations provide a practical substrate for efficient KG-LLM reasoning, offering a favorable accuracy-efficiency-interpretability trade-off.
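The encode-and-rank step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several choices are assumptions made for illustration: each hypervector is stored as a stack of small complex blocks, blocks are sampled as zero-mean Gaussians scaled to unit Frobenius norm, binding is a blockwise matrix product, and the blockwise cosine is the mean of per-block normalized inner products. The function names, the relation names other than "acquired_by", and the sizes (256 blocks of 4×4) are hypothetical.

```python
import numpy as np

def random_hypervector(rng, num_blocks=256, block_size=4):
    """Assumed sampling: zero-mean complex Gaussian blocks, each scaled to unit Frobenius norm."""
    shape = (num_blocks, block_size, block_size)
    g = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return g / np.linalg.norm(g, axis=(1, 2), keepdims=True)

def bind(a, b):
    """Blockwise matrix product; order matters because matrix products do not commute."""
    return a @ b

def encode_path(relations, codebook):
    """Fold the relation hypervectors left to right so the result depends on their order."""
    hv = codebook[relations[0]]
    for rel in relations[1:]:
        hv = bind(hv, codebook[rel])
    return hv

def block_cosine(a, b):
    """Mean over blocks of Re<a_j, b_j> / (||a_j||_F * ||b_j||_F)."""
    inner = np.real(np.einsum("jkl,jkl->j", a.conj(), b))
    norms = np.linalg.norm(a, axis=(1, 2)) * np.linalg.norm(b, axis=(1, 2))
    return float(np.mean(inner / norms))

def top_k(query_hv, candidate_paths, codebook, k=2):
    """Score every candidate path against the query and keep the k best."""
    scored = [(block_cosine(query_hv, encode_path(p, codebook)), p) for p in candidate_paths]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

rng = np.random.default_rng(0)
relation_names = ["founded_by", "acquired_by", "headquartered_in", "born_in"]
codebook = {name: random_hypervector(rng) for name in relation_names}

query_plan = ("acquired_by", "headquartered_in")
candidate_paths = [
    ("headquartered_in", "acquired_by"),  # same relations, wrong order
    ("founded_by", "born_in"),
    ("acquired_by", "headquartered_in"),  # exact match
    ("acquired_by", "born_in"),
]
shortlist = top_k(encode_path(query_plan, codebook), candidate_paths, codebook, k=2)
for score, path in shortlist:
    print(f"{score:+.3f}  {' -> '.join(path)}")
```

In this sketch the exact-match path scores 1 and the reversed path scores close to 0. With `block_size=1` the blocks become scalars, binding commutes, and the reversed path would score the same as the exact match, which is why an order-aware operator needs blocks larger than 1×1.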

Paper Structure

This paper contains 57 sections, 5 theorems, 24 equations, 6 figures, 15 tables, 1 algorithm.

Key Result

Proposition 1

Let $\{\mathbf{v}_r\}$ be i.i.d. GHRR hypervectors with zero-mean, unit Frobenius-norm blocks. For a query path $z_q$ and any distractor $z\neq z_q$ encoded via non-commutative binding, the cosine similarity $X=\mathrm{sim}(\mathbf{v}_{z_q},\mathbf{v}_{z})$ (eq:block-cos) satisfies, for any $\epsilon>0$, … for an absolute constant $c>0$ depending only on the sub-Gaussian proxy of the entries.
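The qualitative content of the proposition, that the similarity between a query path and an unrelated distractor concentrates near zero as the hypervector grows, can be checked numerically. The sketch below is an empirical illustration, not a verification of the stated bound. It assumes complex Gaussian blocks scaled to unit Frobenius norm, blockwise matrix product as the binding, and two-hop paths built from independent relations; the helper names and the block size of 4 are hypothetical.

```python
import numpy as np

def random_hypervector(rng, num_blocks, block_size=4):
    """Zero-mean complex Gaussian blocks, each scaled to unit Frobenius norm (assumed sampling)."""
    shape = (num_blocks, block_size, block_size)
    g = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return g / np.linalg.norm(g, axis=(1, 2), keepdims=True)

def block_cosine(a, b):
    """Mean over blocks of the normalized real inner product."""
    inner = np.real(np.einsum("jkl,jkl->j", a.conj(), b))
    norms = np.linalg.norm(a, axis=(1, 2)) * np.linalg.norm(b, axis=(1, 2))
    return float(np.mean(inner / norms))

def distractor_similarity_stats(num_blocks, trials=200, seed=0):
    """Similarity between two 2-hop paths built from four independent relations."""
    rng = np.random.default_rng(seed)
    sims = []
    for _ in range(trials):
        r1, r2, r3, r4 = (random_hypervector(rng, num_blocks) for _ in range(4))
        query_path = r1 @ r2       # non-commutative binding (blockwise matmul)
        distractor = r3 @ r4
        sims.append(block_cosine(query_path, distractor))
    return float(np.mean(sims)), float(np.std(sims))

stats = {d: distractor_similarity_stats(d) for d in (16, 64, 256, 1024)}
for d, (mean, std) in stats.items():
    print(f"blocks={d:5d}  mean={mean:+.4f}  std={std:.4f}")
```

Under these assumptions the mean stays near zero and the standard deviation shrinks as the number of blocks grows, which is the behaviour a sub-Gaussian tail bound would predict.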

Figures (6)

  • Figure 1: Two major pain points in KG-LLM reasoning. Left: A path-based KGQA system selects a candidate path whose relation sequence does not match the query relation ("acquired_by"), leading to an incorrect answer even though the KG contains the correct evidence. Right: LLM-based scoring evaluates each candidate path in a separate LLM call, which is sequential, hard to parallelize, and incurs high latency and token cost as the candidate set grows.
  • Figure 2: Overview of PathHD: a Plan$\rightarrow$Encode$\rightarrow$Retrieve$\rightarrow$Reason pipeline. A schema-based planner first generates relation plans over the KG; PathHD encodes these plans and instantiates candidate paths into order-aware GHRR hypervectors, ranks candidates with blockwise cosine similarity and Top-$K$ selection, and then issues a single LLM adjudication call to answer with cited paths, with most computation handled by vector operations and modest LLM use.
  • Figure 3: Visualization of performance and latency. The x-axis is Hits@$1$ (%), the y-axis is per-query latency in seconds (median, log scale). Bubble size indicates the average number of LLM calls; marker shape denotes the method family. PathHD gives strong accuracy with lower latency than multi-call LLMs+KG baselines.
  • Figure 4: Hypervector dimension study. Each panel reports F1 (%) of PathHD on WebQSP, CWQ, and GrailQA as a function of the hypervector dimension. Overall, performance rises from $512$ to the mid-range and then tapers off: WebQSP and GrailQA peak around 3k–4k, while CWQ prefers a slightly larger size (6k), after which F1 decreases mildly.
  • Figure 5: Scoring measurement ablation. We evaluate F1 (%) on WebQSP, CWQ, and GrailQA using different scoring strategies in our model. PathHD achieves the best or competitive results when using blockwise cosine similarity, highlighting its effectiveness in capturing fine-grained matching signals across vector blocks.
  • ...and 1 more figure
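The final Reason stage of the pipeline in Figure 2, where all shortlisted paths go into one adjudication call instead of one call per path as in Figure 1 (right), might look like the following sketch. The prompt wording, the `ANSWER: ... | CITE: ...` reply format, the stub model, and all names and entities are hypothetical; the source does not specify the actual prompt or parsing.

```python
from typing import Callable, List, Sequence, Tuple

Path = Tuple[str, ...]            # e.g. ("Company X", "acquired_by", "Company Y")
ScoredPath = Tuple[float, Path]   # (retrieval score, path)

def build_prompt(question: str, shortlisted: Sequence[ScoredPath]) -> str:
    """Put the question and every shortlisted path into one prompt."""
    lines = [f"Question: {question}", "Candidate paths:"]
    for i, (score, path) in enumerate(shortlisted, start=1):
        lines.append(f"[{i}] {' -> '.join(path)} (score={score:.2f})")
    lines.append("Reply as 'ANSWER: <entity> | CITE: <path numbers>'.")
    return "\n".join(lines)

def adjudicate(question: str, shortlisted: Sequence[ScoredPath],
               llm: Callable[[str], str]) -> Tuple[str, List[Path]]:
    """Issue exactly one LLM call and parse the answer plus the cited paths."""
    reply = llm(build_prompt(question, shortlisted))
    answer_part, _, cite_part = reply.partition("| CITE:")
    answer = answer_part.replace("ANSWER:", "").strip()
    cited = []
    for token in cite_part.split(","):
        token = token.strip()
        if token.isdigit() and 1 <= int(token) <= len(shortlisted):
            cited.append(shortlisted[int(token) - 1][1])
    return answer, cited

class CountingStubLLM:
    """Stand-in for a real model; returns a canned reply and counts calls."""
    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return "ANSWER: Company Y | CITE: 1"

shortlisted = [
    (0.97, ("Company X", "acquired_by", "Company Y")),
    (0.41, ("Company X", "founded_by", "Person Z")),
]
llm = CountingStubLLM()
answer, cited = adjudicate("Who acquired Company X?", shortlisted, llm)
print(answer, cited, llm.calls)
```

Because the shortlist is numbered inside the prompt, the cited indices map directly back to KG paths, which is one simple way to obtain path-grounded rationales from a single call.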

Theorems & Definitions (10)

  • Proposition 1: Near-orthogonality and distractor bound
  • proof: Proof sketch
  • Corollary 1: Capacity with union bound
  • Proposition 2: Near-orthogonality of random hypervectors
  • proof
  • Lemma 1: Closure under binding
  • proof
  • Theorem 1: Separation and error bound for hypervector retrieval
  • proof
  • proof: Details for the near-orthogonality proposition (prop:near-orth)
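The title of Corollary 1 points to a union-bound argument. The generic step is sketched below under stated assumptions: $\delta(\epsilon)$ is a placeholder for whatever per-distractor tail probability Proposition 1 provides (its exact form is not reproduced here), and $\mathcal{Z}$ is a hypothetical symbol for a candidate set of $N$ paths containing the query path $z_q$.

```latex
% Assumption: for every distractor z != z_q,
%   Pr( sim(v_{z_q}, v_z) >= epsilon ) <= delta(epsilon).
\Pr\Big(\max_{z\in\mathcal{Z}\setminus\{z_q\}}
        \mathrm{sim}(\mathbf{v}_{z_q},\mathbf{v}_{z}) \ge \epsilon\Big)
  \;\le\; \sum_{z\in\mathcal{Z}\setminus\{z_q\}}
        \Pr\big(\mathrm{sim}(\mathbf{v}_{z_q},\mathbf{v}_{z}) \ge \epsilon\big)
  \;\le\; (N-1)\,\delta(\epsilon).
```

If $\delta(\epsilon)$ decays exponentially in the hypervector dimension, the number of candidates that can be separated at a fixed failure probability grows exponentially with that dimension.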