A Bayesian Mixture Model of Temporal Point Processes with Determinantal Point Process Prior

Yiwei Dong, Shaoxin Ye, Yuwen Cao, Qiyu Han, Hongteng Xu, Hanfang Yang

TL;DR

This work proposes a Bayesian mixture model of Temporal Point Processes with a Determinantal Point Process prior (TP$^2$DP$^2$), together with an efficient posterior inference algorithm based on conditional Gibbs sampling. Experiments suggest that the framework could produce moderately fewer yet more diverse mixture components and achieve outstanding results across multiple evaluation metrics.

Abstract

Asynchronous event sequence clustering aims to group similar event sequences in an unsupervised manner. Mixture models of temporal point processes have been proposed to solve this problem, but they often suffer from overfitting, leading to excessive cluster generation with a lack of diversity. To overcome these limitations, we propose a Bayesian mixture model of Temporal Point Processes with Determinantal Point Process prior (TP$^2$DP$^2$) and accordingly an efficient posterior inference algorithm based on conditional Gibbs sampling. Our work provides a flexible learning framework for event sequence clustering, enabling automatic identification of the potential number of clusters and accurate grouping of sequences with similar features. It is applicable to a wide range of parametric temporal point processes, including neural network-based models. Experimental results on both synthetic and real-world data suggest that our framework could produce moderately fewer yet more diverse mixture components, and achieve outstanding results across multiple evaluation metrics.
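
To make the diversity claim concrete, here is a minimal sketch of why a determinantal point process prior discourages near-duplicate mixture components. It is illustrative only and is not the paper's implementation. The Gaussian similarity kernel, the `length_scale` parameter, the two-dimensional "central parameters", and all function names are assumptions made for this example. Under an L-ensemble DPP, the unnormalized density of a set of points is the determinant of their kernel matrix, which shrinks toward zero as points become similar.

```python
import math


def gaussian_kernel_matrix(points, length_scale=1.0):
    """Assumed similarity kernel: L[i][j] = exp(-||x_i - x_j||^2 / (2 * length_scale^2))."""
    n = len(points)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            kernel[i][j] = math.exp(-sq_dist / (2.0 * length_scale ** 2))
    return kernel


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def dpp_unnormalized_density(points, length_scale=1.0):
    """Unnormalized L-ensemble DPP density of a point set: det of its kernel matrix."""
    return determinant(gaussian_kernel_matrix(points, length_scale))


# Hypothetical central parameters of three mixture components.
spread = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]   # well-separated components
crowded = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]  # nearly identical components

score_spread = dpp_unnormalized_density(spread)
score_crowded = dpp_unnormalized_density(crowded)
print(score_spread > score_crowded)
```

In this toy setting the well-separated configuration receives a far larger unnormalized density than the crowded one, which is the repulsion effect the abstract relies on to avoid redundant clusters.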

Paper Structure

This paper contains 28 sections, 29 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: The scheme of our TP$^2$DP$^2$. Some model parameters (i.e., $\bm{\Theta}_S$) are learned by Bayesian inference, while the remaining ones (i.e., $\bm{\Theta}_R$) are learned by maximum likelihood estimation. We split $\bm{\Theta}_S$ into central parameters, which determine clustering structures, and non-central parameters. The DPP prior on the central parameters encourages diverse clusters without predefining the number of clusters.
  • Figure 2: The t-SNE plots [van2008visualizing] of RMTPP's event sequence embeddings [du2016recurrent] for a synthetic dataset with three clusters. NTPP-MIX [zhang2022learning] (left) wrongly produces four clusters, while our TP$^2$DP$^2$ (right) yields clustering results that match the ground truth well.
  • Figure 3: The means and standard deviations of clustering purity obtained by DMHP and TP$^2$DP$^2$ with different $\delta$. The left panel shows the result for ground-truth cluster number $K_{GT} = 4$, and the right panel shows the result for $K_{GT} = 5$.
  • Figure 4: The t-SNE plot of the ground truth distribution for the synthetic mixture of Hawkes processes datasets with $\delta$ values of 0.2 (upper left), 0.4 (upper right), 0.6 (lower left), and 0.8 (lower right).
  • Figure 5: The base intensity $\bm\mu$ of the first two clusters learned by two methods across 5 random trials. The dotted line represents the ground truth $\bm\mu$ in the two clusters.
  • ...and 3 more figures