Abacus: Self-Supervised Event Counting-Aligned Distributional Pretraining for Sequential User Modeling
Sullivan Castro, Artem Betlei, Thomas Di Martino, Nadir El Manouzi
TL;DR
This work tackles the sparsity and irregular timing of purchase events in display advertising by introducing Abacus, a counting-aligned self-supervised pretraining objective that predicts the empirical distribution of event types. It further proposes a hybrid multitask framework that combines Abacus with Masked Sequence Modeling and Barlow Twins to fuse stable counting statistics with sequence-sensitive learning. Across Taobao and a private dataset, Abacus-based pretraining improves downstream AUC and accelerates convergence, with the mixed-task hybrid achieving the strongest gains. The findings highlight the value of distributional pretraining for sequential user modeling and point to directions for scaling to longer sequences and more efficient time encoding.
Abstract
Modeling user purchase behavior is a critical challenge in display advertising systems and is necessary for real-time bidding. The difficulty arises from the sparsity of positive user events and the stochasticity of user actions, which lead to severe class imbalance and irregular event timing. Predictive systems usually rely on hand-crafted "counter" features, overlooking the fine-grained temporal evolution of user intent. Meanwhile, current sequential models extract sequential signal directly but miss useful event-counting statistics. We enhance deep sequential models with self-supervised pretraining strategies for display advertising. In particular, we introduce Abacus, a novel approach that predicts the empirical frequency distribution of user events. We further propose a hybrid objective that unifies Abacus with sequential learning objectives, combining the stability of aggregated statistics with the sensitivity of sequence modeling. Experiments on two real-world datasets show that Abacus pretraining outperforms existing methods and accelerates downstream task convergence, while the hybrid approach yields up to +6.1% AUC compared to the baselines.
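To make the counting-aligned target concrete, the following is a minimal sketch of how an empirical event-type distribution could be built from one user sequence and compared against a model's predicted distribution. The abstract does not specify the loss, so the soft cross-entropy used here, the function names, the event-type encoding, and the uniform fallback for empty sequences are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def empirical_distribution(event_types, num_types):
    """Normalized count of each event type in one user sequence."""
    counts = np.bincount(event_types, minlength=num_types).astype(np.float64)
    total = counts.sum()
    # Assumption: fall back to a uniform target when the sequence is empty.
    if total == 0:
        return np.full(num_types, 1.0 / num_types)
    return counts / total

def soft_cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-(target * log_probs).sum())

# Hypothetical encoding: 0 = view, 1 = add-to-cart, 2 = purchase.
events = np.array([0, 0, 1, 0, 2, 0, 1, 0])
target = empirical_distribution(events, num_types=3)

# Hypothetical encoder output for this user (one logit per event type).
logits = np.array([1.5, 0.2, -1.0])
loss = soft_cross_entropy(logits, target)
```

Because the target depends only on counts, it is unchanged by reordering the events, which is one way to read the "stability of aggregated statistics" that the hybrid objective pairs with order-sensitive objectives.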
