Spectral clustering for dependent community Hawkes process models of temporal networks

Lingfei Zhao, Hadeel Soliman, Kevin S. Xu, Subhadeep Paul

TL;DR

This work develops a general Dependent Community Hawkes (DCH) framework that integrates stochastic block models with mutually exciting Hawkes processes to capture both community structure and dyadic dependence in temporal networks. It provides non-asymptotic spectral clustering guarantees on the count matrix, revealing how misclustering error scales with $n$, $K$, $T$, and the dependence parameter $\gamma_{\max}$, and shows consistency as $T\to\infty$. To balance flexibility and scalability, the authors introduce the Self and Reciprocal (SR) model within DCH and derive a consistent Generalized Method of Moments (GMM) estimator for its parameters under a restricted SR variant, complemented by a local refinement step to improve community assignments. The approach is validated through extensive simulations and real-data experiments, demonstrating competitive predictive performance and substantial computational efficiency compared to more flexible models like MULCH. Overall, the paper advances spectral clustering theory for dependent, weighted temporal networks and provides practical, scalable tools for joint community detection and Hawkes-parameter estimation.
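The clustering step on the count matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the embedding concatenates the top-$K$ left and right singular vectors (one common choice for directed networks), k-means is a plain Lloyd's algorithm with farthest-point initialisation, and the demo draws independent Poisson counts as a stand-in for Hawkes process counts.

```python
import numpy as np

def count_matrix(events, n):
    """Aggregate (sender, receiver, time) events into an n x n count matrix."""
    N = np.zeros((n, n))
    for sender, receiver, _time in events:
        N[sender, receiver] += 1
    return N

def kmeans(X, K, n_iter=50):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, K):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def spectral_clustering(N, K):
    """Cluster nodes from the top-K singular vectors of the count matrix."""
    U, _s, Vt = np.linalg.svd(N)
    # Assumption: concatenate sending (left) and receiving (right) embeddings.
    Z = np.hstack([U[:, :K], Vt[:K].T])
    return kmeans(Z, K)

# Demo: block-structured Poisson counts (NOT a Hawkes simulation).
rng = np.random.default_rng(0)
K, block = 3, 30
z = np.repeat(np.arange(K), block)                      # true communities
rates = np.where(z[:, None] == z[None, :], 6.0, 1.0)    # assumed block rates
N = rng.poisson(rates).astype(float)
np.fill_diagonal(N, 0.0)                                # no self-loops
labels = spectral_clustering(N, K)
```

The estimated labels are only defined up to a permutation of community indices, so they should be compared with the truth using a permutation-invariant score such as the adjusted Rand index.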

Abstract

Temporal networks observed continuously over time through timestamped relational events data are commonly encountered in application settings including online social media communications, financial transactions, and international relations. Temporal networks often exhibit community structure and strong dependence patterns among node pairs. This dependence can be modeled through mutual excitations, where an interaction event from a sender to a receiver node increases the possibility of future events among other node pairs. We provide statistical results for a class of models that we call dependent community Hawkes (DCH) models, which combine the stochastic block model with mutually exciting Hawkes processes for modeling both community structure and dependence among node pairs, respectively. We derive a non-asymptotic upper bound on the misclustering error of spectral clustering on the event count matrix as a function of the number of nodes and communities, time duration, and the amount of dependence in the model. Our result leverages recent results on bounding an appropriate distance between a multivariate Hawkes process count vector and a Gaussian vector, along with results from random matrix theory. We also propose a DCH model that incorporates only self and reciprocal excitation along with highly scalable parameter estimation using a Generalized Method of Moments (GMM) estimator that we demonstrate to be consistent for growing network size and time duration.
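To make self and reciprocal excitation concrete, the sketch below evaluates a conditional intensity for one directed node pair $(i, j)$. It is an illustration under assumptions rather than the paper's exact specification: the exponential kernel $\alpha \beta e^{-\beta \Delta t}$, the function name `sr_intensity`, and the reading of $\alpha^n, \beta^n$ as self-excitation and $\alpha^r, \beta^r$ as reciprocal-excitation parameters are all assumed here.

```python
import math

def sr_intensity(t, events_ij, events_ji, mu, alpha_n, beta_n, alpha_r, beta_r):
    """Conditional intensity of the directed pair (i, j) at time t.

    events_ij: past event times i -> j (assumed to drive self excitation)
    events_ji: past event times j -> i (assumed to drive reciprocal excitation)
    The kernel alpha * beta * exp(-beta * dt) is an assumed parameterisation.
    """
    self_part = sum(
        alpha_n * beta_n * math.exp(-beta_n * (t - s)) for s in events_ij if s < t
    )
    recip_part = sum(
        alpha_r * beta_r * math.exp(-beta_r * (t - s)) for s in events_ji if s < t
    )
    return mu + self_part + recip_part

# With no history the intensity equals the baseline rate mu.
baseline = sr_intensity(1.0, [], [], 0.5, 0.3, 2.0, 0.2, 1.0)
# An earlier event j -> i raises the intensity of i -> j (reciprocal excitation).
after_reply = sr_intensity(1.0, [], [0.4], 0.5, 0.3, 2.0, 0.2, 1.0)
```

Under this parameterisation each past event adds a jump of size $\alpha \beta$ that decays at rate $\beta$, so $\alpha$ is the expected number of events directly triggered by one event.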

Paper Structure

This paper contains 48 sections, 14 theorems, 96 equations, 6 figures, 8 tables, 2 algorithms.

Key Result

Proposition 1

Define the distance $d_2$ between two random vectors $X$ and $Y$ as $d_2(X, Y) = \sup_{g \in \mathcal{H}} \left| \mathbb{E}[g(X)] - \mathbb{E}[g(Y)] \right|$, where $\mathcal{H} = \{g \in \mathcal{C}^2(\mathbb{R}^{n^2}): \|g\|_{Lip} \leq 1, M_2(g) \leq 1\}$. Let $n$ be a fixed quantity that does not change with $T$ and assume $\rho(\boldsymbol \Gamma) <1$. Define Let $G \sim N_{n^2}(0, \boldsymbol R \mathop{\mathrm{diag}}\nolimits (\boldsymbol R \operatorname{vec}(\boldsymbol \mu)) \boldsymbol

Figures (6)

  • Figure 1: An example of dependence in temporal networks: an event from A1 to B1 (solid arrow) triggers multiple possible future events (red and blue dashed arrows).
  • Figure 2: Heat map of adjusted Rand index of spectral clustering with varying $n$, $T$, and $K$, averaged over 15 simulated networks.
  • Figure 3: The spectral norm of error and the spectral clustering accuracy with different $\gamma_{\max}$ ($\pm$ standard error over 100 simulated networks). As $\gamma_{\max}$ increases, the spectral norm of the error increases superlinearly while the clustering accuracy decreases.
  • Figure 4: Averaged mean squared errors (MSEs) of the GMM estimator for $\boldsymbol \mu$, $\boldsymbol \alpha^n$, $\boldsymbol \alpha^r$, and averaged MSEs of the maximum likelihood estimator for $\boldsymbol \beta^n$, $\boldsymbol \beta^r$ ($\pm$ standard error over 10 runs). One set of panels fixes $n=90$ while varying the duration $T$; the other fixes $T=600$ while varying the number of nodes $n$. The MSEs for all parameters decrease as $n$ or $T$ increases.
  • Figure 5: Adjusted Rand index of spectral clustering and of the refinement algorithm with varying $T$ and $n$, respectively ($\pm$ standard error over 10 simulated networks), together with the computation time of spectral clustering plus estimation, with and without refinement, while varying $T$ and $n$, respectively.
  • ...and 1 more figure
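Figures 2 and 5 report clustering quality as the adjusted Rand index (ARI). As a reminder of what that score measures, here is a small self-contained sketch of the standard ARI formula; the function name is hypothetical and this is not taken from the paper's code.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(a)
    if n < 2:
        return 1.0  # convention for degenerate input
    # Pairs of items placed together in both labelings, in a, and in b.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)   # value expected under random labeling
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # both labelings are trivial partitions
    return (sum_cells - expected) / (max_index - expected)

same_up_to_relabeling = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The score is invariant to renaming the communities, equals 1 for identical partitions, and is close to 0 (possibly negative) when the agreement is no better than chance.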

Theorems & Definitions (14)

  • Proposition 1
  • Proposition 2
  • Theorem 3
  • Lemma 4
  • Theorem 5
  • Corollary 6
  • Lemma 7
  • Lemma 8
  • Theorem 9
  • Proposition 10
  • ...and 4 more