An Empirical Study: Extensive Deep Temporal Point Process

Haitao Lin, Cheng Tan, Lirong Wu, Zhangyang Gao, Zicheng Liu, Stan. Z. Li

TL;DR

This paper addresses the challenge of modeling asynchronous event sequences with deep temporal point processes by proposing EDTPP, a modular framework that separates history encoding, conditional intensity, relational discovery, and learning strategies. It extends history encoders (recurrent, attention, and Fourier-based) and CIF families (including neural, mixture, and flow-based approaches) and introduces a variational framework for discovering Granger causality graphs among event types. Through extensive experiments on MOOC and Stack Overflow, the authors show that latent graph discovery can improve both goodness-of-fit and predictive performance while providing interpretable relations among event types. The work advances interpretability in deep temporal point processes and outlines future directions for capturing long-range dependencies, improving type-wise modeling, and applying these methods to spatio-temporal and real-world datasets.

Abstract

A temporal point process, a stochastic process on a continuous time domain, is commonly used to model asynchronous event sequences characterized by occurrence timestamps. Thanks to their strong expressivity, deep neural networks are emerging as a promising choice for capturing the patterns in asynchronous sequences in the context of temporal point processes. In this paper, we first review recent research emphases and difficulties in modeling asynchronous event sequences with deep temporal point processes, which can be grouped into four fields: encoding of the history sequence, formulation of the conditional intensity function, relational discovery of events, and learning approaches for optimization. We introduce most of the recently proposed models by dismantling them into these four parts, and conduct experiments by remodularizing the first three parts under the same learning strategy for a fair empirical evaluation. In addition, we extend the history encoder and conditional intensity function families, and propose a Granger causality discovery framework for exploiting the relations among multiple event types. Because Granger causality can be represented by a Granger causality graph, discrete graph structure learning within the framework of variational inference is employed to reveal the latent structure of the graph. Further experiments show that the proposed framework with latent graph discovery can both capture the relations and achieve improved fitting and predictive performance.
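To make the terms "conditional intensity function" and "learning approach" concrete, here is a minimal sketch of maximum-likelihood scoring for a temporal point process. It uses a classical univariate Hawkes process with an exponential kernel as a stand-in for the deep models surveyed in the paper; the function names and the parameters `mu`, `alpha`, and `beta` are assumptions made for illustration and do not come from the paper's code.

```python
import math


def hawkes_intensity(t, history, mu, alpha, beta):
    """Conditional intensity lambda*(t) given events strictly before t.

    lambda*(t) = mu + alpha * sum_{t_j < t} beta * exp(-beta * (t - t_j))
    (illustrative Hawkes form, not one of the paper's neural CIFs).
    """
    excitation = sum(
        beta * math.exp(-beta * (t - tj)) for tj in history if tj < t
    )
    return mu + alpha * excitation


def hawkes_nll(events, horizon, mu, alpha, beta):
    """Negative log-likelihood on [0, horizon].

    NLL = integral_0^T lambda*(s) ds - sum_i log lambda*(t_i).
    For the exponential kernel the integral has a closed form.
    Assumes every event time lies in [0, horizon].
    """
    log_term = sum(
        math.log(hawkes_intensity(ti, events, mu, alpha, beta))
        for ti in events
    )
    compensator = mu * horizon + alpha * sum(
        1.0 - math.exp(-beta * (horizon - tj)) for tj in events
    )
    return compensator - log_term


nll = hawkes_nll([1.0, 2.0, 3.0], horizon=4.0, mu=0.5, alpha=0.8, beta=1.0)
```

With `alpha=0` the model reduces to a homogeneous Poisson process, which gives a quick sanity check on the likelihood. In a deep temporal point process, a history encoder would replace the explicit sum over past events, but the likelihood objective keeps the same two terms.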

Paper Structure

This paper contains 45 sections, 1 theorem, 61 equations, 8 figures, 8 tables.

Key Result

Theorem 1

[Universal Approximation Theorem of Mixture (Theorem 33.2 in Asymptotictheory).] Let $p(x)$ be a continuous density on $\mathbb{R}$. If $q(x)$ is any density on $\mathbb{R}$ and is also continuous, then given $\epsilon > 0$ and a compact set $\mathcal{S} \subset \mathbb{R}$, there exist number of compone
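The idea behind this kind of result can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the theorem's construction: the target $p$ is taken to be the logistic density, the base density $q$ is taken to be a standard Gaussian, and the mixture places shifted and scaled copies of $q$ on a uniform grid with weights proportional to $p$ at the grid points. All function names and the grid settings are hypothetical.

```python
import math


def target_density(x):
    """Logistic density: a continuous, non-Gaussian density on R."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_approximation(num_components, half_width=10.0):
    """Mixture of Gaussians centred on a uniform grid over [-half_width, half_width].

    Weights are proportional to the target density at each centre, and the
    common standard deviation equals the grid spacing (an illustrative choice).
    """
    step = 2.0 * half_width / (num_components - 1)
    centers = [-half_width + k * step for k in range(num_components)]
    raw = [target_density(c) for c in centers]
    total = sum(raw)
    weights = [r / total for r in raw]

    def density(x):
        return sum(w * gaussian_pdf(x, c, step) for w, c in zip(weights, centers))

    return density


def sup_error(density, lo=-5.0, hi=5.0, grid=401):
    """Largest absolute gap to the target on a grid over the compact set [lo, hi]."""
    xs = [lo + i * (hi - lo) / (grid - 1) for i in range(grid)]
    return max(abs(density(x) - target_density(x)) for x in xs)


errors = {k: sup_error(mixture_approximation(k)) for k in (5, 20, 80)}
for k, err in errors.items():
    print(f"{k:3d} components: sup error on [-5, 5] = {err:.4f}")
```

The error on the compact set shrinks as the number of components grows, which is the behaviour that motivates mixture-based conditional intensity or density families.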

Figures (8)

  • Figure 1: The workflow of a deep temporal point process is divided into four parts: encoding of the history sequence, relational discovery of events, formulation of the conditional intensity function, and learning approaches.
  • Figure 2: An example of Granger causality graph representing the Granger causality of events. Events of type 1 are affected by type 3 and itself according to the Granger causality graph, so the CIF of type 1 events before $t_5$ is augmented only when type 1 and 3 events happen.
  • Figure 3: The four main parts of EDTPP after dismantling. In the final row, a red box means the component is implemented in our code for a fair empirical study, while an orange box means it will be added later for completeness.
  • Figure 4: An illustration of the framework: The sequences of three different event types after padding are represented by $\{\bm{Z}_m\}_{1\leq m \leq 3}$. After $g_e(\cdot)$, the discrete distribution of each element in $\bm{A} \in \mathbb{R}^{3 \times 3}$ is formulated. The sampled adjacency matrix determines the message passing from intra-type history encoding to the type-wise CIF.
  • Figure 5: An example of intra-type history encoding and type-wise intensity: The whole sequence is first split into multivariate series according to event type. The history encoder operates on each series to obtain the intra-type history encoding. The mask generated by the latent graph structure governs the message passing from the intra-type history encoding to the type-wise CIFs.
  • ...and 3 more figures
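
The captions of Figures 2 and 5 describe splitting a sequence by event type and using a graph-derived mask to decide which types may influence each type-wise intensity. The following is a minimal sketch of that mechanism, assuming a fixed binary adjacency matrix and a Hawkes-style exponential kernel in place of the paper's learned graph and neural encoders. The names `split_by_type` and `masked_intensity`, and the parameters `mu`, `alpha`, and `beta`, are hypothetical.

```python
import math
from collections import defaultdict


def split_by_type(times, types):
    """Split one marked sequence into per-type series of timestamps."""
    series = defaultdict(list)
    for t, m in zip(times, types):
        series[m].append(t)
    return dict(series)


def masked_intensity(t, target, series, adjacency, mu=0.1, alpha=0.5, beta=1.0):
    """Intensity of `target` at time t, excited only by permitted source types.

    adjacency[target][source] == 1 means `source` may influence `target`;
    a 0 entry masks out that source's history entirely.
    """
    total = mu
    for source, stamps in series.items():
        if adjacency[target][source] == 0:
            continue  # masked: no message passes from this type
        total += alpha * sum(
            beta * math.exp(-beta * (t - s)) for s in stamps if s < t
        )
    return total


# Graph in the spirit of Figure 2: type 1 is affected by type 3 and by itself.
adjacency = {
    1: {1: 1, 2: 0, 3: 1},
    2: {1: 0, 2: 1, 3: 0},
    3: {1: 0, 2: 0, 3: 1},
}
series = split_by_type([1.0, 2.0, 3.0], [1, 2, 3])
lam_type1 = masked_intensity(2.5, 1, series, adjacency)
```

Here the type 2 event at time 2.0 leaves the type 1 intensity unchanged because the corresponding adjacency entry is 0. In the paper's framework the adjacency matrix is sampled from a learned discrete distribution rather than fixed by hand.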

Theorems & Definitions (1)

  • Theorem 1