ESQA: Event Sequences Question Answering

Irina Abdullaeva; Andrei Filatov; Mikhail Orlov; Ivan Karpukhin; Viacheslav Vasilev; Denis Dimitrov; Andrey Kuznetsov; Ivan Kireev; Andrey Savchenko

TL;DR

ESQA (Event Sequences Question Answering) is a multimodal architecture that pairs a frozen FLAN-T5 LLM with parameter-efficient fine-tuning to model irregular, time-stamped event sequences. It frames downstream tasks as natural-language questions, encodes events with a trainable event-embedding encoder, and passes the resulting sequence representation to the LLM through a Q-Former-based connector, which enables accurate extractive and predictive reasoning over long sequences without extensive fine-tuning. Across five public datasets, ESQA is competitive with or superior to strong baselines, particularly on categorical and temporal next-event prediction, and it shows notable zero-shot generalization to unseen tasks. Limitations include discretization-induced errors for numerical features and difficulty with highly unbalanced or regression-heavy zero-shot settings; future work targets improved temporal processing and better handling of unbalanced classes.
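
To make the idea of framing downstream tasks as natural-language questions concrete, here is a minimal sketch in Python. The question templates, the helper names `extractive_qa` and `predictive_qa`, and the event fields (`mcc`, `amount`) are hypothetical illustrations and are not taken from the paper, whose actual question wording is not shown in this summary.

```python
from collections import Counter


def extractive_qa(events, field):
    """Build a question whose answer can be read off the observed sequence."""
    values = [event[field] for event in events]
    answer = Counter(values).most_common(1)[0][0]
    # Hypothetical template, used only for illustration.
    question = f"What is the most frequent value of {field} in the sequence?"
    return question, str(answer)


def predictive_qa(events, field):
    """Hold out the last event and ask about its value (next-event prediction)."""
    history, target = events[:-1], events[-1]
    # Hypothetical template, used only for illustration.
    question = f"What will be the value of {field} in the next event?"
    return history, question, str(target[field])


# Toy sequence with made-up field names and values.
events = [
    {"mcc": "grocery", "amount": 12.5},
    {"mcc": "fuel", "amount": 40.0},
    {"mcc": "grocery", "amount": 8.2},
    {"mcc": "pharmacy", "amount": 15.0},
]

ext_question, ext_answer = extractive_qa(events, "mcc")
history, pred_question, pred_answer = predictive_qa(events, "mcc")
```

In this sketch the extractive answer is computed from the full sequence, while the predictive answer comes from an event the model would not see; both targets are plain strings, so a single text-generating model can serve both task types.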

Abstract

Event sequences (ESs) arise in many practical domains, including finance, retail, social networks, and healthcare. In the context of machine learning, event sequences can be seen as a special type of tabular data with annotated timestamps. Despite the importance of ES modeling and analysis, little effort has been made to adapt large language models (LLMs) to the ES domain. In this paper, we highlight the common difficulties of ES processing and propose a novel solution capable of solving multiple downstream tasks with little or no fine-tuning. In particular, we address the problem of working with long sequences and improve the processing of time and numeric features. The resulting method, called ESQA, effectively utilizes the power of LLMs and, according to extensive experiments, achieves state-of-the-art results in the ES domain.

Paper Structure (24 sections, 5 equations, 3 figures, 9 tables)

This paper contains 24 sections, 5 equations, 3 figures, 9 tables.

Introduction
Background
Event Sequences Question Answering
Questions and answers construction
Events embeddings
Encoder
Connector
Language Model
Experiments
Experimental setup
Experimental results
Main results
Predictive tasks
Generalization abilities
Related work
...and 9 more sections

Figures (3)

Figure 1: Model architecture. The components of the approach that do not require training are colored in blue. Components whose weights are optimised during training are colored in orange. The trainable embeddings and associated tokens are colored in red.
Figure 2: a) Event sequences features encoding; in the example, there are $N$ numerical and $C$ categorical features, which are concatenated into a tensor $e_i^{emb}$ of dimension $dim(e_i^{emb})$. b) The event sequence encoder model processes the concatenated feature embedding vectors $S_n^{emb}$ for all events within a sequence, ultimately producing a comprehensive embedding $\tilde{S_n}^{emb}$ for the entire event sequence.
Figure 3: The Q-Former model's architecture is designed to extract the most relevant event sequence representations. It produces $q$ query embeddings for each event sequence, which are then linearly projected to the size of the language model embedding and appended to the embedded question tokens. Subsequently, the joint sequence is transmitted to the LLM.
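
The data flow described in the captions of Figures 2 and 3 can be sketched with NumPy as a shape-level illustration. This is a simplified sketch under stated assumptions, not the authors' implementation: numerical features are discretized with fixed bin edges (the binning scheme is an assumption), the sequence encoder of Figure 2b is omitted, the Q-Former is reduced to one single-head cross-attention step, and all weights are random. Every dimension and name below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def embed_events(numerical, categorical, num_tables, cat_tables, bin_edges):
    """Embed each feature separately and concatenate per event.

    numerical: (T, N) floats, categorical: (T, C) integer codes.
    Returns an array of shape (T, (N + C) * emb_dim).
    """
    parts = []
    for j, (table, edges) in enumerate(zip(num_tables, bin_edges)):
        # Assumed discretization: np.digitize gives a bin index in [0, len(edges)].
        parts.append(table[np.digitize(numerical[:, j], edges)])
    for j, table in enumerate(cat_tables):
        parts.append(table[categorical[:, j]])
    return np.concatenate(parts, axis=1)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def connector(seq_emb, queries, w_k, w_v, w_proj):
    """q learnable queries cross-attend over T event embeddings,
    then a linear projection maps them to the LLM embedding size."""
    keys, values = seq_emb @ w_k, seq_emb @ w_v              # (T, d_q) each
    scores = queries @ keys.T / np.sqrt(queries.shape[1])    # (q, T)
    attended = softmax(scores, axis=-1) @ values             # (q, d_q)
    return attended @ w_proj                                 # (q, d_llm)


# Hypothetical sizes: T events, N numerical and C categorical features.
T, N, C, emb_dim = 50, 2, 3, 8
n_queries, d_q, d_llm, n_question_tokens = 4, 16, 32, 6
cardinalities = [5, 7, 4]

bin_edges = [np.linspace(-2.0, 2.0, 9) for _ in range(N)]    # 9 edges -> 10 bins
num_tables = [rng.normal(size=(10, emb_dim)) for _ in range(N)]
cat_tables = [rng.normal(size=(k, emb_dim)) for k in cardinalities]

numerical = rng.normal(size=(T, N))
categorical = np.stack([rng.integers(0, k, size=T) for k in cardinalities], axis=1)

seq_emb = embed_events(numerical, categorical, num_tables, cat_tables, bin_edges)

d_event = seq_emb.shape[1]
queries = rng.normal(size=(n_queries, d_q))
w_k = rng.normal(size=(d_event, d_q))
w_v = rng.normal(size=(d_event, d_q))
w_proj = rng.normal(size=(d_q, d_llm))
projected = connector(seq_emb, queries, w_k, w_v, w_proj)

# Stand-in for the embedded question tokens; the projected queries are appended.
question_emb = rng.normal(size=(n_question_tokens, d_llm))
joint = np.concatenate([question_emb, projected], axis=0)
```

The point of the sketch is the compression step: however long the event sequence is, the LLM receives only `n_queries` extra vectors on top of the question tokens, which is what makes long sequences tractable in this design.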
