DySK-Attn: A Framework for Efficient, Real-Time Knowledge Updating in Large Language Models via Dynamic Sparse Knowledge Attention

Kabir Khan, Priya Sharma, Arjun Mehta, Neha Gupta, Ravi Narayanan

TL;DR

DySK-Attn is a framework that pairs an LLM with a dynamic Knowledge Graph (KG) that can be updated instantaneously, enabling the model to efficiently integrate real-time knowledge from this external source.

Abstract

Large Language Models (LLMs) suffer from a critical limitation: their knowledge is static and quickly becomes outdated. Retraining these massive models is computationally prohibitive, while existing knowledge editing techniques can be slow and may introduce unforeseen side effects. To address this, we propose DySK-Attn, a novel framework that enables LLMs to efficiently integrate real-time knowledge from a dynamic external source. Our approach couples an LLM with a dynamic Knowledge Graph (KG) that can be updated instantaneously. The core of our framework is a sparse knowledge attention mechanism, which allows the LLM to perform a coarse-to-fine search, efficiently identifying and focusing on a small, highly relevant subset of facts from the vast KG. This mechanism avoids the high computational cost of dense attention over the entire knowledge base and mitigates noise from irrelevant information. We demonstrate through extensive experiments on time-sensitive question-answering tasks that DySK-Attn significantly outperforms strong baselines, including standard Retrieval-Augmented Generation (RAG) and model editing techniques, in both factual accuracy for updated knowledge and computational efficiency. Our framework offers a scalable and effective solution for building LLMs that can stay current with the ever-changing world.

Paper Structure

This paper contains 31 sections, 8 equations, 4 figures, 4 tables, 1 algorithm.

Figures (4)

  • Figure 1: Detailed pipeline of DySK-Attn. A query triggers coarse retrieval to form a candidate subgraph $\mathcal{G}_{sub}$ via ANN over entity/text embeddings. Sparse Knowledge Attention selects top-$k$ facts to produce a knowledge vector, which is fused into the LLM with a learnable gate $\lambda$ across selected layers, yielding the final generation. The dynamic KG supports real-time updates via API, maintains RotatE embeddings, and performs periodic retraining.
  • Figure 2: Main results on TemporalWiki. DySK-Attn achieves the best F1 on both seen and unseen knowledge.
  • Figure 3: Ablation on unseen split. Removing sparse attention or the dynamic KG hurts performance most.
  • Figure 4: Efficiency comparison across models. Bars show inference latency (ms/token); annotations summarize knowledge update cost.
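
The pipeline in the Figure 1 caption (coarse retrieval, top-$k$ sparse attention, gated fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the exact cosine search standing in for the ANN index, the scaled dot-product scoring, and the convex-combination form of the gate $\lambda$ are all assumptions made for illustration, and the embeddings are random placeholders rather than RotatE vectors.

```python
import numpy as np


def coarse_retrieve(query, fact_keys, m):
    """Coarse stage: indices of the m facts most similar to the query.

    Exact cosine search is used here as a stand-in for an ANN index (assumption).
    """
    q = query / np.linalg.norm(query)
    keys = fact_keys / np.linalg.norm(fact_keys, axis=1, keepdims=True)
    sims = keys @ q
    return np.argsort(-sims)[:m]


def sparse_knowledge_attention(query, keys, values, k):
    """Fine stage: softmax attention restricted to the top-k scoring candidates."""
    d = query.shape[0]
    scores = keys @ query / np.sqrt(d)   # scaled dot-product scores (assumption)
    top = np.argsort(-scores)[:k]        # keep only the k best candidates
    s = scores[top]
    weights = np.exp(s - s.max())        # numerically stable softmax over k facts
    weights /= weights.sum()
    return weights @ values[top], top, weights


def gated_fusion(hidden, knowledge, lam):
    """Blend the knowledge vector into a hidden state.

    The convex combination is a hypothetical form of the gate; lam is in [0, 1].
    """
    return (1.0 - lam) * hidden + lam * knowledge


# Toy data: random vectors standing in for fact and query embeddings.
rng = np.random.default_rng(0)
d, n_facts = 16, 1000
fact_keys = rng.normal(size=(n_facts, d))
fact_values = rng.normal(size=(n_facts, d))
query = rng.normal(size=d)
hidden = rng.normal(size=d)

cand = coarse_retrieve(query, fact_keys, m=50)            # candidate subgraph
knowledge, top, weights = sparse_knowledge_attention(
    query, fact_keys[cand], fact_values[cand], k=5
)
fused = gated_fusion(hidden, knowledge, lam=0.3)
print("selected fact ids:", cand[top])
print("attention weights:", weights.round(3))
```

In this sketch the attention cost depends on the candidate count `m` and the selection size `k` rather than on the total number of facts, which is the intuition behind avoiding dense attention over the whole knowledge base. Updating a fact amounts to overwriting a row of `fact_keys` and `fact_values`, with no change to model weights.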