GenTKG: Generative Forecasting on Temporal Knowledge Graph with Large Language Models
Ruotong Liao, Xu Jia, Yangzhe Li, Yunpu Ma, Volker Tresp
TL;DR
GenTKG introduces a retrieval-augmented generative framework for temporal knowledge graph (tKG) forecasting. It has two phases: a temporal logic rule-based retrieval (TLR) phase that assembles relevant histories, and a few-shot parameter-efficient instruction-tuning phase that aligns LLMs for autoregressive prediction. By reframing data-centric learning as task-centric alignment, GenTKG achieves state-of-the-art performance with exceptionally little training data (as few as 16 shots) and demonstrates strong cross-domain and in-domain generalization without retraining the LLM in phase two. The approach relies on concrete mechanisms (TLR with first-order temporal logic, a rule bank, and LoRA-based fine-tuning within a prompt-driven generation setup) to map complex tKG structure into natural-language prompts that LLMs can process effectively. Empirical results on ICEWS14, ICEWS18, GDELT, and YAGO show substantial improvements over embedding-based, rule-based, and in-context learning (ICL) baselines, underscoring the potential of generative forecasting for temporal relational data and offering a scalable, generalizable paradigm for tKG reasoning.
Abstract
The rapid advancements in large language models (LLMs) have ignited interest in the temporal knowledge graph (tKG) domain, where conventional embedding-based and rule-based methods dominate. It remains an open question whether pre-trained LLMs can understand structured temporal relational data and replace these methods as the foundation model for temporal relational forecasting. We therefore bring temporal knowledge forecasting into the generative setting. Two gaps stand in the way: one between the complex temporal graph data structure and the sequential natural-language expressions LLMs can handle, and one between the enormous data sizes of tKGs and the heavy computation costs of fine-tuning LLMs. We propose a novel retrieval-augmented generation framework named GenTKG, which combines a temporal logical rule-based retrieval strategy and few-shot parameter-efficient instruction tuning to address these two challenges, respectively. Extensive experiments show that GenTKG outperforms conventional methods of temporal relational forecasting with low computation resources and extremely limited training data, as few as 16 samples. GenTKG also exhibits remarkable cross-domain generalizability, with superior performance on unseen datasets without re-training, and in-domain generalizability regardless of the time split within the same dataset. Our work reveals the huge potential of LLMs in the tKG domain and opens a new frontier for generative forecasting on tKGs. Code and data are released here: https://github.com/mayhugotong/GenTKG.
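To make the retrieval-then-generation idea concrete, the following is a minimal sketch of how retrieved temporal facts might be serialized into a prompt for an LLM. It is an illustration under stated assumptions, not the authors' implementation: the function names `retrieve_history` and `build_prompt`, the `time: [subject, relation, object]` prompt layout, and the reduction of a rule bank to a plain set of relation names are all hypothetical, and the toy facts are invented for the example.

```python
from typing import List, Set, Tuple

# A temporal fact as (subject, relation, object, timestamp).
Quad = Tuple[str, str, str, int]


def retrieve_history(facts: List[Quad], subject: str, relations: Set[str],
                     query_time: int, max_facts: int = 5) -> List[Quad]:
    """Keep past facts about `subject` whose relation is in `relations`.

    `relations` stands in for what a rule bank would supply: relations that
    mined temporal rules associate with the query relation (an assumption).
    """
    hits = [f for f in facts
            if f[0] == subject and f[1] in relations and f[3] < query_time]
    hits.sort(key=lambda f: f[3])   # chronological order
    return hits[-max_facts:]        # keep the most recent ones


def build_prompt(history: List[Quad], subject: str, relation: str,
                 query_time: int) -> str:
    """Serialize the history and leave the query object open for generation."""
    lines = [f"{t}: [{s}, {r}, {o}]" for s, r, o, t in history]
    lines.append(f"{query_time}: [{subject}, {relation},")
    return "\n".join(lines)


# Toy data, invented for illustration only.
facts = [
    ("Germany", "consult", "France", 1),
    ("France", "consult", "Italy", 2),
    ("Germany", "sign_agreement", "France", 3),
    ("Germany", "consult", "Italy", 4),
    ("Germany", "consult", "Spain", 6),   # after the query time, excluded
]
rule_relations = {"consult", "sign_agreement"}  # hypothetical rule-bank output

history = retrieve_history(facts, "Germany", rule_relations, query_time=5)
prompt = build_prompt(history, "Germany", "sign_agreement", query_time=5)
print(prompt)
```

The prompt ends with an incomplete fact, so an autoregressive model would be asked to continue it with the missing object. In this sketch, facts at or after the query time are filtered out so that no future information reaches the model.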
