Knowledge Fusion via Bidirectional Information Aggregation

Songlin Zhai, Guilin Qi, Yue Wang, Yuan Meng

TL;DR

KGA is a neuroscience-inspired, parameter-free framework that rewires Transformer self-attention to fuse external knowledge graph (KG) knowledge into large language models (LLMs) at inference time. Two synergistic pathways, bottom-up knowledge fusion and top-down attention guidance, let KGA integrate KG triples adaptively and in real time without modifying model parameters. The approach performs strongly on KGQA, KG reasoning, and KG-based model editing, with improved efficiency and robustness to noise, while preserving the LLM’s general capabilities. This makes knowledge-aware generation practical and scalable for dynamic, web-scale KG environments, enabling up-to-date knowledge integration in real-time applications without expensive retraining or complex retrieval stacks.

Abstract

Knowledge graphs (KGs) are the cornerstone of the semantic web, offering up-to-date representations of real-world entities and relations. Yet large language models (LLMs) remain largely static after pre-training, causing their internal knowledge to become outdated and limiting their utility in time-sensitive web applications. To bridge this gap between dynamic knowledge and static models, a prevalent approach is to enhance LLMs with KGs. However, prevailing methods typically rely on parameter-invasive fine-tuning, which risks catastrophic forgetting and often degrades LLMs' general capabilities. Moreover, their static integration frameworks cannot keep pace with the continuous evolution of real-world KGs, hindering their deployment in dynamic web environments. To address these limitations, we introduce KGA (Knowledge Graph-guided Attention), a novel framework that dynamically integrates external KGs into LLMs exclusively at inference time, without any parameter modification. Inspired by neuroscience research, we rewire the self-attention module by introducing two synergistic pathways: a bottom-up knowledge fusion pathway and a top-down attention guidance pathway. The bottom-up pathway dynamically integrates external knowledge into input representations via input-driven KG fusion, akin to the stimulus-driven attention process in the human brain. Complementarily, the top-down pathway assesses the contextual relevance of each triple through a goal-directed verification process, thereby suppressing task-irrelevant signals and amplifying knowledge-relevant patterns. By synergistically combining these two pathways, our method supports real-time knowledge fusion. Extensive experiments on four benchmarks verify KGA's strong fusion performance and efficiency.
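
To make the two pathways in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that tokens and KG triples are already embedded as vectors of the same dimension, uses identity query/key/value projections so that no parameters are involved, and models the top-down step as a temperature-scaled softmax over triple relevance. All function names (`bottom_up_fusion`, `top_down_relevance`, `guided_fusion_layer`) and the gating rule are hypothetical.

```python
import math


def softmax(scores, temperature=1.0):
    # Numerically stable softmax with an optional temperature.
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


def bottom_up_fusion(tokens, triples, gate=None):
    # Assumed bottom-up step: every token attends over tokens AND triples.
    # gate[j], if given, rescales the attention mass placed on triple j.
    scale = math.sqrt(len(tokens[0]))
    memory = tokens + triples
    n = len(tokens)
    fused, attention = [], []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in memory])
        if gate is not None:
            weights = [w if i < n else w * gate[i - n] for i, w in enumerate(weights)]
            total = sum(weights)
            weights = [w / total for w in weights]  # renormalize each row
        attention.append(weights)
        fused.append(weighted_sum(weights, memory))
    return fused, attention


def top_down_relevance(fused, triples, temperature=1.0):
    # Assumed top-down step: score each triple against the mean fused context.
    n = len(fused)
    context = [sum(column) / n for column in zip(*fused)]
    return softmax([dot(context, t) for t in triples], temperature)


def guided_fusion_layer(tokens, triples, temperature=1.0):
    # 1) fuse knowledge, 2) verify triple relevance, 3) re-fuse with gating.
    fused, _ = bottom_up_fusion(tokens, triples)
    relevance = top_down_relevance(fused, triples, temperature)
    peak = max(relevance)
    gate = [r / peak for r in relevance]  # most relevant triple keeps full weight
    guided, attention = bottom_up_fusion(tokens, triples, gate)
    return guided, relevance, attention
```

In this toy setup, a triple whose embedding points away from the fused context receives a small gate value, so its share of every token's attention shrinks after the second pass. Lowering `temperature` sharpens that suppression. The actual scoring and gating functions used by KGA may differ.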

Paper Structure

This paper contains 29 sections, 6 equations, 10 figures, 3 tables, 1 algorithm.

Figures (10)

  • Figure 1: Motivation for rewiring the "self"-attention, which expands only part of the token-token interaction.
  • Figure 2: Hidden Transformer layer equipped with KGA.
  • Figure 3: Framework of the proposed KGA, which introduces the bottom-up and top-down pathways.
  • Figure 4: Effects of adaptive weighting on four datasets, where KGA* denotes the version of KGA without the top-down pathway.
  • Figure 5: Analysis of Top-Down Pathway: effects of temperature on four datasets for Llama3.2 (1B).
  • ...and 5 more figures