Are Hypervectors Enough? Single-Call LLM Reasoning over Knowledge Graphs
Yezi Liu, William Youngwoo Chung, Hanning Chen, Calvin Yeung, Mohsen Imani
TL;DR
The paper tackles inefficiencies in knowledge-graph question answering (KGQA) pipelines that rely on heavy neural encoders or numerous LLM calls for path scoring. It introduces PathHD, an encoder-free KG reasoning framework that encodes relation paths as Generalized Holographic Reduced Representation (GHRR) hypervectors, retrieves a small Top-K set via simple vector operations, and makes a single one-shot LLM adjudication over the shortlisted paths. Across WebQSP, CWQ, and GrailQA, PathHD yields competitive Hits@1 and F1 while substantially reducing end-to-end latency and GPU memory, and it provides path-grounded rationales for interpretability. The results demonstrate that carefully designed hyperdimensional representations can serve as an efficient, scalable substrate for KG-LLM reasoning with favorable accuracy-efficiency-interpretability trade-offs, and point to broader applicability in domain-specific graphs and real-time decision support.
Abstract
Recent advances in large language models (LLMs) have enabled strong reasoning over both structured and unstructured knowledge. When grounded on knowledge graphs (KGs), however, prevailing pipelines rely on heavy neural encoders to embed and score symbolic paths or on repeated LLM calls to rank candidates, leading to high latency, GPU cost, and opaque decisions that hinder faithful, scalable deployment. We propose PathHD, a lightweight and encoder-free KG reasoning framework that replaces neural path scoring with hyperdimensional computing (HDC) and uses only a single LLM call per query. PathHD encodes relation paths into block-diagonal GHRR hypervectors, ranks candidates with blockwise cosine similarity and Top-K pruning, and then performs a one-shot LLM adjudication to produce the final answer together with cited supporting paths. Technically, PathHD is built on three ingredients: (i) an order-aware, non-commutative binding operator for path composition, (ii) a calibrated similarity for robust hypervector-based retrieval, and (iii) a one-shot adjudication step that preserves interpretability while eliminating per-path LLM scoring. On WebQSP, CWQ, and the GrailQA split, PathHD (i) attains comparable or better Hits@1 than strong neural baselines while using one LLM call per query; (ii) reduces end-to-end latency by $40$–$60\%$ and GPU memory by $3$–$5\times$ thanks to encoder-free retrieval; and (iii) delivers faithful, path-grounded rationales that improve error diagnosis and controllability. These results indicate that carefully designed HDC representations provide a practical substrate for efficient KG-LLM reasoning, offering a favorable accuracy-efficiency-interpretability trade-off.
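To make the retrieval stage concrete, the following is a minimal sketch of order-aware path encoding and Top-K pruning in the spirit of the abstract. It is not the authors' implementation. It assumes that each hypervector is a list of small unitary blocks (2×2 here, for brevity), that binding is blockwise matrix multiplication, and that the query is already available as a relation sequence. The calibration of the similarity and the final LLM call are omitted. All function names and the relation names in the usage example are hypothetical.

```python
import math
import random

# A block is a 2x2 complex matrix stored row-major as (a, b, c, d).
# A hypervector is a list of such blocks (the "block-diagonal" form).

def random_unitary(rng):
    """Sample a random 2x2 unitary block (an SU(2) element)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(v * v for v in x))
    a = complex(x[0], x[1]) / n
    b = complex(x[2], x[3]) / n
    return (a, -b.conjugate(), b, a.conjugate())

def make_codebook(relations, dim, rng):
    """Assign each relation an independent random hypervector of `dim` blocks."""
    return {r: [random_unitary(rng) for _ in range(dim)] for r in relations}

def matmul(p, q):
    """Multiply two 2x2 blocks."""
    a, b, c, d = p
    e, f, g, h = q
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def bind(u, v):
    """Blockwise matrix product; non-commutative, so relation order matters."""
    return [matmul(p, q) for p, q in zip(u, v)]

def encode_path(relations, codebook):
    """Encode a relation path by binding its relation hypervectors left to right."""
    hv = codebook[relations[0]]
    for r in relations[1:]:
        hv = bind(hv, codebook[r])
    return hv

def similarity(u, v):
    """Mean over blocks of the cosine between corresponding blocks
    (real part of the Frobenius inner product over the product of norms)."""
    total = 0.0
    for p, q in zip(u, v):
        inner = sum((x.conjugate() * y).real for x, y in zip(p, q))
        norm_p = math.sqrt(sum(abs(x) ** 2 for x in p))
        norm_q = math.sqrt(sum(abs(y) ** 2 for y in q))
        total += inner / (norm_p * norm_q)
    return total / len(u)

def top_k(query_hv, candidates, codebook, k):
    """Score every candidate path against the query and keep the best k."""
    scored = [(name, similarity(query_hv, encode_path(path, codebook)))
              for name, path in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical relation names, for illustration only.
    rels = ["film.directed_by", "person.place_of_birth", "person.spouse"]
    codebook = make_codebook(rels, dim=256, rng=rng)
    query = encode_path(["film.directed_by", "person.place_of_birth"], codebook)
    candidates = {
        "same order": ["film.directed_by", "person.place_of_birth"],
        "reversed": ["person.place_of_birth", "film.directed_by"],
        "unrelated": ["person.spouse"],
    }
    for name, score in top_k(query, candidates, codebook, k=2):
        print(f"{name}: {score:.3f}")
```

Because the binding is a matrix product, a path and its reversal receive different encodings, so the reversed candidate scores well below the exact match. In a full pipeline, only the shortlisted paths would then be passed to the single LLM adjudication call.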
