GenTKG: Generative Forecasting on Temporal Knowledge Graph with Large Language Models

Ruotong Liao; Xu Jia; Yangzhe Li; Yunpu Ma; Volker Tresp

TL;DR

GenTKG introduces a retrieval-augmented generative framework for temporal knowledge graph (tKG) forecasting. A temporal logical rule-based retrieval (TLR) phase assembles relevant histories, and a few-shot parameter-efficient instruction-tuning (FIT) phase aligns the LLM for autoregressive prediction. By reframing data-centric learning as task-centric alignment, GenTKG achieves state-of-the-art performance with exceptionally little training data (as few as 16 shots) and shows strong in-domain generalization as well as cross-domain generalization to unseen datasets without re-training. The approach relies on concrete mechanisms (TLR with first-order temporal logic, a rule bank, and LoRA-based fine-tuning within a prompt-driven generation setup) to map complex tKG structure into natural-language prompts that LLMs can process. Empirical results on ICEWS14, ICEWS18, GDELT, and YAGO show substantial improvements over embedding-based, rule-based, and ICL baselines, underscoring the potential of generative forecasting for temporal relational data and offering a scalable, generalizable paradigm for tKG reasoning.

Abstract

The rapid advancements in large language models (LLMs) have ignited interest in the temporal knowledge graph (tKG) domain, where conventional embedding-based and rule-based methods dominate. It remains an open question whether pre-trained LLMs can understand structured temporal relational data and replace these methods as the foundation model for temporal relational forecasting. We therefore bring temporal knowledge forecasting into the generative setting. Two gaps stand in the way: one between the complex temporal graph data structure and the sequential natural language that LLMs can handle, and one between the enormous data sizes of tKGs and the heavy computation costs of fine-tuning LLMs. To address them, we propose a novel retrieval-augmented generation framework named GenTKG, which combines a temporal logical rule-based retrieval strategy for the first gap with few-shot parameter-efficient instruction tuning for the second. Extensive experiments show that GenTKG outperforms conventional methods of temporal relational forecasting with low computation resources and extremely limited training data, as few as 16 samples. GenTKG also shows remarkable cross-domain generalizability, with superior performance on unseen datasets without re-training, and in-domain generalizability regardless of the time split within the same dataset. Our work reveals the huge potential of LLMs in the tKG domain and opens a new frontier for generative forecasting on tKGs. Code and data are released here: https://github.com/mayhugotong/GenTKG.
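The retrieve-then-prompt idea described above can be illustrated with a minimal Python sketch. It is not the released GenTKG code, and everything in it is an assumption made for illustration: the `Fact` type, the contents of `RULE_BANK`, the relation names, the confidence values, the ranking by confidence and recency, and the prompt layout are all hypothetical. It also assumes the rules are already given and have a single body relation, whereas the paper learns its rules from the graph.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """One tKG quadruple (subject, relation, object, timestamp)."""
    subject: str
    relation: str
    obj: str
    time: int


# Hypothetical rule bank: head relation -> [(body relation, confidence)].
# Read as: "if body relation held earlier, head relation tends to follow".
RULE_BANK = {
    "make_visit": [("express_intent_to_meet", 0.8), ("consult", 0.5)],
}


def retrieve_history(facts, subject, relation, query_time, rule_bank, max_facts=3):
    """Select past facts about `subject` whose relation is the query relation
    or a body relation of one of its rules, keeping the highest-ranked ones."""
    # The query relation itself is always allowed, with top priority.
    allowed = {relation: 1.0}
    for body_rel, conf in rule_bank.get(relation, []):
        allowed[body_rel] = max(conf, allowed.get(body_rel, 0.0))

    # Only facts strictly before the query time may be used (no leakage).
    candidates = [
        f for f in facts
        if f.subject == subject and f.relation in allowed and f.time < query_time
    ]
    # Rank by rule confidence first, then by recency.
    candidates.sort(key=lambda f: (allowed[f.relation], f.time), reverse=True)
    # Present the kept facts in chronological order.
    return sorted(candidates[:max_facts], key=lambda f: f.time)


def build_prompt(history, subject, relation, query_time):
    """Serialize the retrieved history and the open query as plain text."""
    lines = [f"{f.time}: [{f.subject}, {f.relation}, {f.obj}]" for f in history]
    lines.append(f"{query_time}: [{subject}, {relation},")
    return "\n".join(lines)


facts = [
    Fact("A", "consult", "B", 1),
    Fact("A", "express_intent_to_meet", "C", 2),
    Fact("A", "make_visit", "C", 3),
    Fact("A", "criticize", "D", 4),   # relation not covered by any rule
    Fact("A", "make_visit", "E", 6),  # lies after the query time
    Fact("X", "make_visit", "C", 2),  # different subject
]

history = retrieve_history(facts, "A", "make_visit", 5, RULE_BANK, max_facts=2)
prompt = build_prompt(history, "A", "make_visit", 5)
print(prompt)
```

In this sketch the prompt ends with an open quadruple, so a language model would be asked to continue the text with the missing object. Capping the history with `max_facts` is one simple way to keep prompts short; the actual selection criteria used by TLR may differ.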

Paper Structure (37 sections, 1 equation, 6 figures, 4 tables, 1 algorithm)

Introduction
Generative Forecasting on Temporal Knowledge Graph
Temporal Logic Rule-based Retrieval
Definition I (Temporal Random Walk)
Definition II (Temporal Logical Rule)
Rule Learning
Temporal Logic Rule-based Retrieval
Align LLM to Generative tKG Forecasting
Instruction Prompt Design
Parameter-efficient Instruction Tuning
Efficient Alignment with Few-shot Tuning
Generalization Ability of GenTKG
Experimental Setup
Experimental Results
Main Results
...and 22 more sections

Figures (6)

Figure 1: Framework of GenTKG. GenTKG first retrieves relevant facts with a temporal logical rule-based retrieval strategy (TLR), then samples $K$ prompts for few-shot parameter-efficient instruction tuning (FIT), which aligns the LLM to the task of generative temporal knowledge graph forecasting.
Figure 2: Instruction Prompt Design
Figure 3: Cross-Domain Generalization Setting. (a) Single-dataset evaluation. All training and evaluation are on GDELT, except for the generalized GenTKG, which is trained on ICEWS14. (b) Cross-checking. We evaluate the LLaMA2 trained in GenTKG across different combinations of training and evaluation datasets. The performance drop relative to the original training setting is only a small percentage, and in some cases performance is even higher than ICL. Appendix \ref{sss: ap4} discusses the experiment settings and analysis further and explains that the large relative difference on GDELT is due to its poor baseline performance.
Figure 4: In-domain generalizability. GenTKG exceeds conventional methods on all partitions of the training data on ICEWS14. Values are given in Appendix Table \ref{tab:few-shot-ind}.
Figure 5: (a) Both TLR and FIT phases contribute to GenTKG. (b) Increasing the few-shot training parameter $K$ improves performance.
...and 1 more figure
