KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning

Peng Yu, Cheng Deng, Beiya Dai, Xinbing Wang, Ying Wen

TL;DR

KaLM tackles knowledge deficiencies in autoregressive LLMs by jointly aligning them with knowledge graphs through explicit dual-view contrastive learning and implicit triple-completion language modeling. The final objective, $\mathcal{L}_{KaLM} = \mathcal{L}_{exp} + \lambda \cdot \mathcal{L}_{imp}$, enables robust knowledge representation while preserving generation. Theoretical results show that dual-view contrastive learning promotes knowledge alignment and mitigates representation anisotropy, and experiments demonstrate significant gains in embedding-based KG completion and generation-based KGQA across multiple LLMs when fine-tuned with KaLM. The approach yields more discriminative, uniform knowledge representations and improves KG reasoning without sacrificing language modeling capabilities, suggesting practical benefits for retrieval-augmented and cross-domain knowledge tasks.

Abstract

Autoregressive large language models (LLMs) pre-trained by next token prediction are inherently proficient in generative tasks. However, their performance on knowledge-driven tasks such as factual knowledge querying remains unsatisfactory. Knowledge graphs (KGs), as high-quality structured knowledge bases, can provide reliable knowledge for LLMs, potentially compensating for their knowledge deficiencies. Aligning LLMs with explicit, structured knowledge from KGs has been a challenge; previous attempts either failed to effectively align knowledge representations or compromised the generative capabilities of LLMs, leading to suboptimal outcomes. This paper proposes KaLM, a Knowledge-aligned Language Modeling approach, which fine-tunes autoregressive LLMs to align with KG knowledge via the joint objective of explicit knowledge alignment and implicit knowledge alignment. The explicit knowledge alignment objective aims to directly optimize the knowledge representation of LLMs through dual-view knowledge graph contrastive learning. The implicit knowledge alignment objective focuses on incorporating textual patterns of knowledge into LLMs through triple completion language modeling. Notably, our method achieves a significant performance boost in evaluations of knowledge-driven tasks, specifically embedding-based knowledge graph completion and generation-based knowledge graph question answering.
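The joint objective $\mathcal{L}_{KaLM} = \mathcal{L}_{exp} + \lambda \cdot \mathcal{L}_{imp}$ can be illustrated with a minimal sketch. The exact form of $\mathcal{L}_{exp}$ is not given in this summary, so the code below assumes a symmetric InfoNCE loss with in-batch negatives over cosine similarities, applied in both directions (head-relation to tail, and tail to head-relation). The function names, the temperature `tau`, the weight `lam`, and the scalar `lm_nll` standing in for the triple-completion language modeling loss are all hypothetical choices made for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = [normalize(v) for v in a]
    b = [normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]

def info_nce(queries, keys, tau):
    """InfoNCE with in-batch negatives: queries[i] is paired with keys[i]."""
    sims = cosine_matrix(queries, keys)
    total = 0.0
    for i, row in enumerate(sims):
        logits = [s / tau for s in row]
        m = max(logits)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]  # negative log-probability of the positive
    return total / len(sims)

def dual_view_contrastive_loss(hr_emb, t_emb, tau=0.05):
    """Assumed dual-view form: average of the hr->t and t->hr directions."""
    return 0.5 * (info_nce(hr_emb, t_emb, tau) + info_nce(t_emb, hr_emb, tau))

def kalm_loss(hr_emb, t_emb, lm_nll, lam=0.1, tau=0.05):
    """L_KaLM = L_exp + lam * L_imp, with L_imp supplied as a scalar NLL."""
    return dual_view_contrastive_loss(hr_emb, t_emb, tau) + lam * lm_nll
```

In an actual fine-tuning setup both terms would be computed from the same LLM and backpropagated together; here the language modeling term is passed in as a number so that only the way the two losses combine is shown.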

Paper Structure

This paper contains 37 sections, 2 theorems, 22 equations, 10 figures, 8 tables.

Key Result

Theorem 1

For temperature $\tau > 0$, as the number of negative samples $\mathcal{N} \rightarrow \infty$, the normalized dual-view knowledge graph contrastive loss (Equation eq:exp-reform) converges to a limiting expression, from which the theorem's conclusions follow.

Figures (10)

  • Figure 1: Similarity matrix of knowledge representations of (a) Llama-2-7B (Touvron et al., 2023) and (b) KaLM. The values denote the cosine similarity between the head-relation and tail embeddings. The diagonal elements represent positive <head-relation, tail> pairs from the same KG triple, which should maintain high similarity (darker color); off-diagonal elements represent negative <head-relation, tail> pairs from different KG triples, which should have lower similarity (lighter color). In an ideal setting, knowledge representations should be able to distinguish between different triples while maintaining alignment and uniformity of the representation, as shown in Figure 1(b).
  • Figure 2: The overall framework of KaLM. Top: The explicit knowledge alignment objective ($\mathcal{L}_{exp}$) aims to directly optimize the knowledge representation of LLMs via dual-view knowledge graph contrastive learning. Bottom: The implicit knowledge alignment objective ($\mathcal{L}_{imp}$) focuses on incorporating textual patterns of knowledge into LLMs via triple completion language modeling. The final training objective is the weighted sum of $\mathcal{L}_{exp}$ and $\mathcal{L}_{imp}$.
  • Figure 3: Comparison of generative knowledge inference performance between Llama-2-7B and KaLM. $\uparrow$ means higher is better and $\downarrow$ means lower is better.
  • Figure 4: Similarity matrix on the Wikitext-103 test set. From top-left to bottom-right, element $(i, j)$ denotes the cosine similarity between the $i$-th and the $j$-th sentence.
  • Figure 5: Case studies of Llama-2-7B and KaLM on KGQA tasks. The head entity, relation, and tail entity are denoted by different colors. The 🗹 mark indicates a correct answer, while 🗵 indicates an incorrect answer.
  • ...and 5 more figures
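The similarity matrix described in the Figure 1 caption can be sketched as follows. This is an illustrative reconstruction, not the authors' plotting code: it assumes the head-relation and tail embeddings are already available as plain lists of floats, and the helper names `similarity_matrix` and `diagonal_gap` are hypothetical. The gap between the mean diagonal and mean off-diagonal similarity is one simple way to summarize how well positive pairs are separated from negative pairs.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_matrix(hr_embs, t_embs):
    """Entry (i, j) compares head-relation i with tail j.

    Diagonal entries are positive pairs from the same triple;
    off-diagonal entries are negative pairs from different triples.
    """
    return [[cosine(hr, t) for t in t_embs] for hr in hr_embs]

def diagonal_gap(sim):
    """Mean diagonal similarity minus mean off-diagonal similarity."""
    n = len(sim)
    diag = sum(sim[i][i] for i in range(n)) / n
    off = sum(sim[i][j] for i in range(n) for j in range(n) if i != j)
    return diag - off / (n * (n - 1))
```

A gap close to zero would correspond to a matrix in which positive and negative pairs look alike, while a large gap would correspond to a dark diagonal on a light background.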

Theorems & Definitions (4)

  • Theorem 1: Asymptotics of $\mathcal{L}_{\texttt{exp}}$
  • proof
  • Theorem 2: Alleviation of Anisotropy
  • proof
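Representation anisotropy, the subject of Theorem 2, is often quantified by the average cosine similarity between embeddings of unrelated inputs: a value near 1 means the embeddings are crowded into a narrow cone, while a value near 0 means they are spread more uniformly. The sketch below uses that common proxy. It is an assumption that this matches the measure used in the paper, and the function name `mean_pairwise_cosine` is hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_pairwise_cosine(embs):
    """Average cosine similarity over all distinct pairs of embeddings.

    Used here as a rough anisotropy proxy; requires at least two vectors.
    """
    pairs = list(combinations(embs, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)
```

Comparing this value for the same set of sentences before and after fine-tuning gives a single-number counterpart to a sentence-level similarity matrix such as the one in Figure 4.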