Cumulative Distribution Function based General Temporal Point Processes

Maolin Wang; Yu Pan; Zenglin Xu; Ruocheng Guo; Xiangyu Zhao; Wanyu Wang; Yiqi Wang; Zitao Liu; Langming Liu

TL;DR

This work tackles the challenge of forecasting events in temporal sequences by moving from intensity-based to CDF-based modeling. The authors introduce CuFun, a Temporal Point Process that uses a monotonic neural network to represent the CDF $F^{(\tau)}$, with past-event information acting as a scaling factor to predict future events. By deriving the density from the CDF via automatic differentiation, CuFun achieves stable log-likelihood optimization without explicit integral computations, and it demonstrates superior performance on both synthetic and real-world datasets, including long-range and high-frequency event patterns. The approach improves applicability to recommendation and information retrieval tasks by more accurately capturing complex temporal dynamics and rare/extreme events. CuFun's contributions include a novel CDF-based TPP, a principled fusion of history through scaling, and extensive empirical validation across diverse domains.

Abstract

Temporal Point Processes (TPPs) hold a pivotal role in modeling event sequences across diverse domains, including social networking and e-commerce, and have significantly contributed to the advancement of recommendation systems and information retrieval strategies. Through the analysis of events such as user interactions and transactions, TPPs offer valuable insights into behavioral patterns, facilitating the prediction of future trends. However, accurately forecasting future events remains a formidable challenge due to the intricate nature of these patterns. The integration of Neural Networks with TPPs has ushered in the development of advanced deep TPP models. While these models excel at processing complex and nonlinear temporal data, they encounter limitations in modeling intensity functions, grapple with computational complexities in integral computations, and struggle to capture long-range temporal dependencies effectively. In this study, we introduce the CuFun model, representing a novel approach to TPPs that revolves around the Cumulative Distribution Function (CDF). CuFun stands out by uniquely employing a monotonic neural network for CDF representation, utilizing past events as a scaling factor. This innovation significantly bolsters the model's adaptability and precision across a wide range of data scenarios. Our approach addresses several critical issues inherent in traditional TPP modeling: it simplifies log-likelihood calculations, extends applicability beyond predefined density function forms, and adeptly captures long-range temporal patterns. Our contributions encompass the introduction of a pioneering CDF-based TPP model, the development of a methodology for incorporating past event information into future event prediction, and empirical validation of CuFun's effectiveness through extensive experimentation on synthetic and real-world datasets.
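
To make the claim about simplified log-likelihood calculations concrete, the following is a short worked sketch, not the paper's exact loss. It uses only the relations named on this page (density as the derivative of the CDF, survivor function $S = 1 - F$, intensity $\lambda = p/S$); the summation index $N$, the observation horizon $T$, and the omission of any end-of-window survival term are assumptions made for illustration.

$$
p^*(\tau) = \frac{\partial F^*(\tau)}{\partial \tau}, \qquad
S^*(\tau) = 1 - F^*(\tau), \qquad
\lambda^*(t_{i-1} + \tau) = \frac{p^*(\tau)}{S^*(\tau)}
$$

$$
\text{CDF-based NLL:} \quad
-\sum_{i=1}^{N} \log p^*(\tau_i)
= -\sum_{i=1}^{N} \log \left. \frac{\partial F^*(\tau)}{\partial \tau} \right|_{\tau = \tau_i}
$$

$$
\text{Intensity-based NLL:} \quad
-\sum_{i=1}^{N} \log \lambda^*(t_i) + \int_{0}^{T} \lambda^*(t)\, dt
$$

The CDF-based form needs only a derivative per observed interval, which automatic differentiation supplies, whereas the standard intensity-based form contains an integral that generally has to be approximated numerically.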

Paper Structure (26 sections, 16 equations, 8 figures)

Introduction
Background
Temporal Point Processes Modeling
Predicting Time of Next Event
Methodology
Relationships among Functions in TPP
Parameterizing CDF for TPP
Loss Function
Experiment
Dataset Description
Synthetic Datasets
Real-world Datasets
Baselines
Implementation Details
NLL Comparison on Synthetic Datasets
...and 11 more sections

Figures (8)

Figure 1: In Temporal Point Processes (TPPs), the Cumulative Distribution Function (CDF) $F^{(\tau)}$ is central. From it follow the Survivor Function $S^{(\tau)} = 1-F^{(\tau)}$, the probability that no event has occurred by time $\tau$, and the Intensity Function $\lambda^{(t)} = p^{(\tau)}/S^{(\tau)}$, the event rate at time $t$, where $p^{(\tau)}$ is the density function. The Hazard Function $\phi^{(\tau)}$, which equals $\lambda^{(t)}$, reflects the immediate risk of an event occurring. Knowing the CDF is therefore enough to recover every other key TPP function.
Figure 2: Model architecture. $t_{i}$ is the time at which an event happened, and the time interval is $\tau_{i}=t_{i}-t_{i-1}$. The density function, CDF, and survivor function are denoted $p^*$, $F^*$, and $S^*$ respectively, where the $*$ symbol indicates dependence on past events. The time intervals of past events are fed into an RNN, which returns the hidden vector $\boldsymbol{h}$. The future time interval $\tau$ and the hidden vector $\boldsymbol{h}$ are fed into the monotonic neural network (MNN), whose output is the CDF. The density function is then obtained via automatic differentiation.
Figure 3: Details of our Monotonic Neural Network (MNN). All connections within the MNN are constrained to be positive, ensuring that the derivative of the CDF with respect to $\tau$ is positive. The activation function of the final unit is a sigmoid, ensuring that the output lies in the interval $(0,1)$. Together, these two constraints guarantee that the output is a valid CDF. The piece-wise multiplication operation represents the scaling effect of past events on the variable $\tau$.
Figure 4: NLL comparison on synthetic datasets. Lower is better. All improvements of CuFun over the baselines are statistically significant (two-sided t-test, $p < 0.05$).
Figure 5: NLL comparison on real-world datasets. Each score is obtained by subtracting the absolute score of our model. Lower is better. These results illustrate the robustness and versatility of CuFun in handling various types of temporal point process data. All improvements of CuFun over the baselines, except on Wikipedia and Music, are statistically significant (two-sided t-test, $p < 0.05$).
...and 3 more figures
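
The captions for Figures 2 and 3 describe a monotonic network whose output is the CDF and whose derivative is the density. The block below is a minimal toy sketch of that idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the class name `TinyMonotonicCDF`, the single hidden layer with `tanh` units, the use of softplus to keep weights positive, and a fixed history vector `h` passed in directly instead of being produced by an RNN. The derivative is written out by hand with the chain rule so that the example runs with the standard library only; an automatic differentiation framework would compute the same quantity.

```python
import math
import random


def softplus(x):
    # Numerically stable map from the reals to (0, inf); keeps weights positive.
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMonotonicCDF:
    """Toy one-hidden-layer monotonic network F(tau | h). Hypothetical sketch."""

    def __init__(self, hidden=8, hist_dim=4, seed=0):
        rng = random.Random(seed)
        self.w_raw = [rng.gauss(0.0, 1.0) for _ in range(hidden)]  # tau -> hidden
        self.v_raw = [rng.gauss(0.0, 1.0) for _ in range(hidden)]  # hidden -> output
        self.b = [rng.gauss(0.0, 1.0) for _ in range(hidden)]      # free biases
        self.a = [rng.gauss(0.0, 1.0) for _ in range(hist_dim)]    # history -> scale
        self.c = rng.gauss(0.0, 1.0)                               # output bias

    def scale(self, h):
        # Positive scaling factor derived from the history vector (assumed form).
        return softplus(sum(ai * hi for ai, hi in zip(self.a, h)))

    def cdf_and_density(self, tau, h):
        s = self.scale(h)
        z, dz = self.c, 0.0
        for w_raw, v_raw, b in zip(self.w_raw, self.v_raw, self.b):
            w, v = softplus(w_raw), softplus(v_raw)  # positive connections
            t = math.tanh(w * s * tau + b)
            z += v * t
            dz += v * (1.0 - t * t) * w * s          # d z / d tau, always > 0
        F = sigmoid(z)                               # output in (0, 1)
        p = F * (1.0 - F) * dz                       # density = dF / d tau
        return F, p


def nll(model, intervals, h):
    # Negative log-likelihood of observed intervals; no integral is evaluated.
    return -sum(math.log(model.cdf_and_density(tau, h)[1]) for tau in intervals)


if __name__ == "__main__":
    model = TinyMonotonicCDF(seed=0)
    h = [0.1, -0.2, 0.3, 0.0]          # stand-in for an RNN hidden vector
    for tau in (0.5, 1.0, 1.5):
        F, p = model.cdf_and_density(tau, h)
        print(f"tau={tau:.1f}  F={F:.4f}  p={p:.4f}")
    print("NLL:", nll(model, [0.5, 1.0, 1.5], h))
```

Positive weights combined with increasing activations make the output non-decreasing in $\tau$, and the final sigmoid keeps it in $(0,1)$, which mirrors the two constraints named in the Figure 3 caption. The sketch does not enforce boundary values such as $F = 0$ at $\tau = 0$.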
