Pretext Training Algorithms for Event Sequence Data
Yimu Wang, He Zhao, Ruizhi Deng, Frederick Tung, Greg Mori
TL;DR
This work addresses learning from unlabeled event sequence data by introducing a self-supervised framework with three complementary pretext tasks: masked reconstruction with density-preserving masking, contrastive learning using multiple views of event sequences, and alignment verification to enforce correct time–type coupling. The methods are architecture-agnostic and apply to downstream tasks such as next-event prediction in temporal point processes, sequence-level classification, and missing-event interpolation. Empirical results on StackOverflow, MIMIC-II, Mooc, and Reddit show that pretext training improves NLL, RMSE, accuracy, and AUC across tasks, and ablations confirm that the three tasks provide complementary benefits. The study also compares zero-shot LLM predictions for event time and type, finding LLMs competitive for timing yet weaker for event-type assignment. This result underscores the value of specialized pretraining for event sequences and suggests avenues for few-shot and synthetic-data scaling.
Abstract
Pretext training followed by task-specific fine-tuning has been a successful approach in vision and language domains. This paper proposes a self-supervised pretext training framework tailored to event sequence data. We introduce a novel alignment verification task that is specialized to event sequences, building on good practices in masked reconstruction and contrastive learning. Our pretext tasks unlock foundational representations that generalize across different downstream tasks, including next-event prediction for temporal point process models, event sequence classification, and missing event interpolation. Experiments on popular public benchmarks demonstrate the potential of the proposed method across different tasks and data domains.
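To make the alignment verification idea concrete, the following is a minimal sketch of how training examples for such a task might be constructed. It is an illustration under stated assumptions, not the authors' procedure: the summary above says only that the task enforces correct time–type coupling, so the corruption strategy here (permuting event types while keeping timestamps fixed) and the names `misalign_types`, `make_alignment_example`, and `p_negative` are all hypothetical.

```python
import numpy as np


def misalign_types(types, rng):
    """Permute event types until the result differs from the input.

    Assumption for illustration only: a misaligned sequence keeps the
    original timestamps but reorders the event types.
    """
    types = np.asarray(types)
    if len(np.unique(types)) < 2:
        # Every permutation is identical, so no misaligned version exists.
        return None
    while True:
        shuffled = types[rng.permutation(len(types))]
        if not np.array_equal(shuffled, types):
            return shuffled


def make_alignment_example(times, types, rng, p_negative=0.5):
    """Return (times, types, label) for a binary alignment classifier.

    label == 1: times and types are correctly paired (original sequence).
    label == 0: types were permuted, breaking the time-type pairing.
    """
    times = np.asarray(times, dtype=float)
    types = np.asarray(types)
    if rng.random() < p_negative:
        corrupted = misalign_types(types, rng)
        if corrupted is not None:
            return times, corrupted, 0
    return times, types.copy(), 1


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    event_times = [0.5, 1.2, 3.0, 4.1]
    event_types = [0, 1, 2, 1]
    t, k, y = make_alignment_example(event_times, event_types, rng)
    print(t, k, y)
```

In this sketch an encoder would receive the (possibly corrupted) sequence and be trained with a binary loss on the label. Other corruption schemes, such as replacing only a subset of types, would fit the same interface.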
