Towards Unified Approaches in Self-Supervised Event Stream Modeling: Progress and Prospects
Levente Zólyomi, Tianze Wang, Sofiane Ennadir, Oleg Smirnov, Lele Cao
TL;DR
This survey addresses the fragmentation of self-supervised learning (SSL) methods for event streams (ES) by proposing a unified, cross-domain perspective that treats ES as a common structural object across healthcare, finance, gaming, and e-commerce. It categorizes SSL approaches into predictive (masked modeling, autoregressive modeling, temporal point processes) and contrastive (instance-based, distillation, decorrelation, multimodal) families, and discusses their domain-specific adaptations, strengths, and limitations. The authors highlight the need for domain-agnostic learning, richer benchmarks, and the integration of timing and multimodal information, arguing that future work should combine paradigms and draw on foundation-model ideas to obtain scalable, generalizable ES representations. By synthesizing datasets, downstream tasks, and benchmarking practices, the paper provides a roadmap for reproducible evaluation and cross-domain progress, with practical implications for decision-making in real-world ES-driven systems.
Abstract
The proliferation of digital interactions across diverse domains, such as healthcare, e-commerce, gaming, and finance, has generated vast volumes of event stream (ES) data. ES data comprises continuous sequences of timestamped events that carry detailed, domain-specific contextual information. While ES data holds significant potential for extracting actionable insights and enhancing decision-making, its effective use is hindered by the scarcity of labeled data and the fragmented nature of existing research. Self-Supervised Learning (SSL) has emerged as a promising paradigm to address these challenges by enabling the extraction of meaningful representations from unlabeled ES data. In this survey, we systematically review and synthesize SSL methodologies tailored for ES modeling across multiple domains, bridging the gaps between domain-specific approaches that have traditionally operated in isolation. We present a comprehensive taxonomy of SSL techniques, encompassing both predictive and contrastive paradigms, and analyze their applicability and effectiveness in different application contexts. Furthermore, we identify critical gaps in current research and propose a future research agenda aimed at developing scalable, domain-agnostic SSL frameworks for ES modeling. By unifying disparate research efforts and highlighting cross-domain synergies, this survey aims to accelerate innovation, improve reproducibility, and expand the applicability of SSL to diverse real-world ES challenges.
