Large Language Models as Interpolated and Extrapolated Event Predictors

Libo Zhang, Yue Ning

TL;DR

LEAP reframes sociopolitical event prediction as a language understanding and reasoning problem, using quintuple-based data and fine-tuned language models. It offers two object prediction (OP) pathways, a ranking-based encoder structure with a ConvTransE decoder and a QA-based generative approach using FLAN-T5$_{\text{BASE}}$, plus a multi-event forecasting (MEF) pathway that aggregates RoBERTa$_{\text{LARGE}}$ encodings with self-attention to predict future relations. On the ICEWS DVN/28075_2015 datasets, LEAP$_{\text{OP1}}$, LEAP$_{\text{OP2}}$, and LEAP$_{\text{MEF}}$ are reported to improve accuracy and recall over baselines, with open-source LLMs offering favorable cost-performance trade-offs relative to commercial APIs. Limitations include the dataset scope and the need for retrieval-augmented or chain-of-thought prompting to further enhance temporal reasoning, both of which point to directions for future work.

Abstract

Salient facts of sociopolitical events are distilled into quadruples following a format of subject, relation, object, and timestamp. Machine learning methods, such as graph neural networks (GNNs) and recurrent neural networks (RNNs), have been built to make predictions and infer relations on the quadruple-based knowledge graphs (KGs). In many applications, quadruples are extended to quintuples with auxiliary attributes such as text summaries that describe the quadruple events. In this paper, we comprehensively investigate how large language models (LLMs) streamline the design of event prediction frameworks using quadruple-based or quintuple-based data while maintaining competitive accuracy. We propose LEAP, a unified framework that leverages large language models as event predictors. Specifically, we develop multiple prompt templates to frame the object prediction (OP) task as a standard question-answering (QA) task, suitable for instruction fine-tuning with an encoder-decoder LLM. For the multi-event forecasting (MEF) task, we design a simple yet effective prompt template for each event quintuple. This approach removes the need for GNNs and RNNs, instead using an encoder-only LLM to generate fixed intermediate embeddings, which are processed by a customized downstream head with a self-attention mechanism to predict potential relation occurrences in the future. Extensive experiments on multiple real-world datasets using various evaluation metrics validate the effectiveness of our approach.
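
To make the QA framing concrete, the following is a minimal sketch of how an event quintuple and an object prediction prompt could be represented. The `Quintuple` class, the `format_event` and `build_op_prompt` helpers, the prompt wording, and the example entities are all hypothetical illustrations, not the templates used in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Quintuple:
    """An event quadruple plus one auxiliary text attribute (hypothetical layout)."""
    subject: str
    relation: str
    obj: str
    timestamp: str
    summary: str  # text summary describing the event


def format_event(q: Quintuple) -> str:
    # Render one historical event as a single line of natural language.
    return f"On {q.timestamp}, {q.subject} [{q.relation}] {q.obj}. Summary: {q.summary}"


def build_op_prompt(history: List[Quintuple], subject: str,
                    relation: str, timestamp: str) -> str:
    # Assumed template: historical events as context, then a question
    # whose answer is the missing object entity.
    lines = ["Context:"]
    lines += [f"- {format_event(q)}" for q in history]
    lines.append(f"Question: On {timestamp}, {subject} [{relation}] which entity?")
    lines.append("Answer:")
    return "\n".join(lines)


# Placeholder data, invented for illustration only.
history = [
    Quintuple("Country A", "Make a visit", "Country B", "2015-01-01",
              "Officials of Country A visited Country B."),
]
prompt = build_op_prompt(history, "Country A", "Consult", "2015-01-02")
print(prompt)
```

A prompt of this shape would be paired with the true object entity as the target text when fine-tuning an encoder-decoder model.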

Paper Structure (17 sections, 6 equations, 3 figures, 11 tables)

Figures (3)

  • Figure 1: An overview of LEAP$_{\text{OP1}}$ for ranking object prediction. RGCN takes historical temporal knowledge graphs (TKGs) to update entity embeddings, while GRU updates relation embeddings following sequential timestamps. A fine-tuned RoBERTa$_{\text{BASE}}$ encodes text summaries and outputs sentence embeddings after mean pooling. These embeddings, along with the manually located query, are fed into the ConvTransE decoder to rank all object candidates.
  • Figure 2: An overview of LEAP$_{\text{OP2}}$ for generative object prediction. Historical quintuples are concatenated as in-context learning examples during prompt engineering, and a generative LLM, either FLAN-T5$_{\text{BASE}}$ or GPT-3.5-Turbo-Instruct, is instructed to generate a textual prediction for the missing object entity in the query.
  • Figure 3: An overview of LEAP$_{\text{MEF}}$ for multi-event forecasting. Historical quintuples are fed one by one into the simple prompt template and the pre-trained RoBERTa$_{\text{LARGE}}$ encoder, resulting in multiple quintuple-level embeddings, which are aggregated through the self-attention mechanism. Finally, a fully connected layer with element-wise $\text{Sigmoid}(\cdot)$ activation and a threshold of 0.5 produces predictions of relation occurrences in the future.
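
The multi-event forecasting head in Figure 3 can be sketched roughly as follows, in dependency-free Python. Several details here are assumptions rather than the paper's design: the self-attention uses identity query/key/value projections, the attended vectors are mean-pooled into one vector, and the toy embeddings, `weight`, and `bias` values are invented. Only the overall flow (self-attention aggregation, a fully connected layer, element-wise sigmoid, threshold of 0.5) comes from the caption.

```python
import math
from typing import List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(embeddings: List[Vector]) -> Vector:
    """Scaled dot-product self-attention with identity projections
    (an assumption), followed by mean pooling over the events."""
    d = len(embeddings[0])
    attended = []
    for q in embeddings:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in embeddings])
        attended.append([sum(w * v[j] for w, v in zip(weights, embeddings))
                         for j in range(d)])
    n = len(attended)
    return [sum(row[j] for row in attended) / n for j in range(d)]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_relations(embeddings: List[Vector], weight: List[Vector],
                      bias: Vector, threshold: float = 0.5
                      ) -> Tuple[List[int], List[float]]:
    # One logit per candidate relation: fully connected layer on the pooled vector.
    pooled = self_attention_pool(embeddings)
    probs = [sigmoid(dot(w_row, pooled) + b) for w_row, b in zip(weight, bias)]
    # Multi-label decision: a relation is predicted if its probability exceeds the threshold.
    return [int(p > threshold) for p in probs], probs


# Toy stand-ins for quintuple-level embeddings (real ones would come from the encoder).
embeddings = [[1.0, 0.0], [0.0, 1.0]]
weight = [[2.0, 2.0], [-2.0, -2.0]]  # two hypothetical relation types
bias = [0.0, 0.0]
labels, probs = predict_relations(embeddings, weight, bias)
print(labels, probs)
```

Because each relation receives its own sigmoid probability, several relations can be predicted for the same future time step, which matches the multi-label nature of the task.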