Cardinality-Regularized Hawkes-Granger Model

Tsuyoshi Idé; Georgios Kollias; Dzung T. Phan; Naoki Abe

TL;DR

This work addresses the problem of learning sparse Granger causality in Hawkes processes, where conventional likelihood-based approaches suffer from a pathological singularity that prevents true sparsity. It introduces a cardinality-regularized minorization-maximization (MM) framework, termed L0 Hawkes, that enforces sparsity via an $\ell_0$ penalty on the triggering matrix $\mathsf{A}$ and yields semi-analytic instance- and type-level causal diagnoses through instance triggering probabilities $q_{n,i}$. By combining MM with Jensen bounds and an $\epsilon$-sparsity strategy, the method produces robust, interpretable sparse graphs while keeping the optimization tractable, with complexity $\mathcal{O}(N^2+D^2)$. The authors validate the approach on synthetic benchmarks and on real-world data from power grids and cloud data centers, showing that L0 Hawkes achieves higher break-even accuracy and sparser, more meaningful causal structures than neural Granger or conventional sparse Hawkes models. Overall, the paper provides a principled, scalable framework for instance-wise causal analysis of event data, with practical use in prioritizing and diagnosing failures in complex systems.
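To make the instance triggering probabilities $q_{n,i}$ concrete, the following is a minimal sketch, not the authors' implementation. It assumes an exponential decay kernel with a single rate `beta`, a baseline vector `mu`, and the convention that `A[d, d']` is the influence of type `d'` on type `d`; none of these choices is stated in this summary, and the function name `triggering_probabilities` is hypothetical. Under these assumptions, $q_{n,n}$ is the probability that event $n$ arose from the baseline, and $q_{n,i}$ for $i<n$ is the probability that event $i$ triggered event $n$.

```python
import numpy as np

def triggering_probabilities(times, types, mu, A, beta):
    """Return an N x N lower-triangular matrix Q with Q[n, i] = q_{n,i}.

    ASSUMPTIONS (not taken from the summary): exponential kernel
    beta * exp(-beta * dt), and A[d, d'] = influence of type d' on type d.
    """
    N = len(times)
    Q = np.zeros((N, N))
    for n in range(N):
        d = types[n]
        # Diagonal entry: contribution of the baseline intensity of type d.
        Q[n, n] = mu[d]
        # Off-diagonal entries: contribution of each earlier event i < n.
        for i in range(n):
            dt = times[n] - times[i]
            Q[n, i] = A[d, types[i]] * beta * np.exp(-beta * dt)
        # The row sum is the intensity at t_n; dividing by it normalizes the row.
        Q[n] /= Q[n].sum()
    return Q

# Toy data: two event types, type 1 influences type 0 and nothing else.
times = np.array([0.0, 0.5, 1.2])
types = np.array([0, 1, 0])
mu = np.array([0.1, 0.1])
A = np.array([[0.0, 0.8],
              [0.0, 0.0]])
Q = triggering_probabilities(times, types, mu, A, beta=1.0)
print(np.round(Q, 3))
```

In this toy example a zero entry in `A` gives an exactly zero triggering probability, which is why well-defined sparsity matters for instance-wise diagnosis. The double loop over event pairs is also where a quadratic cost in the number of events $N$ would come from in such a sketch.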

Abstract

We propose a new sparse Granger-causal learning framework for temporal event data. We focus on a specific class of point processes called the Hawkes process. We begin by pointing out that most of the existing sparse causal learning algorithms for the Hawkes process suffer from a singularity in maximum likelihood estimation. As a result, their sparse solutions can appear only as numerical artifacts. In this paper, we propose a mathematically well-defined sparse causal learning framework based on a cardinality-regularized Hawkes process, which remedies the pathological issues of existing approaches. We leverage the proposed algorithm for the task of instance-wise causal event analysis, where sparsity plays a critical role. We validate the proposed framework with two real use-cases, one from the power grid and the other from the cloud data center management domain.

Paper Structure (21 sections, 1 theorem, 38 equations, 9 figures, 1 table, 1 algorithm)

Introduction
Related work
Preliminaries
Problem setting
Likelihood and intensity function
Cardinality-Regularized Hawkes-Granger Model
Intensity function and Granger causality
Cardinality-regularized minorization-maximization framework
Sparse Causal Learning via Cardinality Regularization
Algorithm summary
Experiments
Concluding remarks
Acknowledgement
Solutions for Baseline Intensity and Decay Parameters
Experimental Details
...and 6 more sections

Key Result

Theorem 1

For $p \geq 1$, the problem $\max_{\bm{x}}\left\{ \sum_{m}\Psi_m(x_m)- \tau \| \bm{x}\|_p \right\}$ is convex and has a unique solution. Let $\bm{x}^{**}$ be that solution. The solution cannot be sparse, i.e., $x^{**}_m \neq 0$ for all $m$, if $g_m >0$.
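The following sketch illustrates the claim of Theorem 1 numerically for $p = 1$. The summary does not define $\Psi_m$ or $g_m$, so the code assumes a hypothetical form $\Psi_m(x) = g_m \ln x - h_m x$ with $g_m, h_m > 0$, restricted to $x > 0$; the names `h`, `objective`, and `l1_penalized_maximizer` are likewise illustrative only. Under that assumption the penalized problem separates over $m$, and setting the derivative $g_m/x - h_m - \tau$ to zero gives $x_m = g_m/(h_m + \tau)$, which shrinks as $\tau$ grows but never reaches zero.

```python
import numpy as np

# ASSUMPTION: Psi_m(x) = g_m * ln(x) - h_m * x with g_m, h_m > 0.
# This form is hypothetical; the summary does not define Psi_m.

def objective(x, g, h, tau):
    """Per-coordinate objective Psi_m(x_m) - tau * |x_m| for x_m > 0."""
    return g * np.log(x) - h * x - tau * x

def l1_penalized_maximizer(g, h, tau):
    """Stationary point of the concave per-coordinate objective over x > 0."""
    return g / (h + tau)

g = np.array([2.0, 0.5, 1e-3])
h = np.array([1.0, 1.0, 1.0])

for tau in [0.0, 1.0, 100.0, 1e6]:
    x = l1_penalized_maximizer(g, h, tau)
    # Every coordinate stays strictly positive, however large tau is.
    print(f"tau={tau:g}  x={x}")
```

The logarithmic term drives the objective to $-\infty$ as $x_m \to 0$, so under this assumed form no finite $\tau$ can set a coordinate to exactly zero; any zeros seen in practice would come from numerical thresholding rather than from the optimum itself.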

Figures (9)

Figure 1: The Hawkes-Granger model allows two different levels of causal analysis: (a) instance-wise and (b) type-wise, in which well-defined sparsity is essential for causal diagnosis. (c) Example of five-variate point process data, where each '$|$' represents an event instance.
Figure 2: Illustration of the Hawkes model in Eq. \ref{eq:Hawkes-indensity-func-general}, showing $\lambda_d(t\mid \mathcal{H}_4)$ as an example. The $d_1$- and $d_3$-types are not causally related to $d$.
Figure 3: Three cases in Eq. \ref{eq:Psi-l0objective}: (a) $x_m^* > \epsilon$ and (b) two possibilities when $x_m^* \leq \epsilon$.
Figure 4: TN (red) and TP (blue) accuracies as a function of log regularization strength.
Figure 5: Comparison of $\bm{x}^*$ (flattened $\mathsf{A}$ in each row) computed with 100 different $\tau$ values.
...and 4 more figures

Theorems & Definitions (2)

Definition 1: Hawkes process and Granger non-causality \cite{eichler2017graphical}
Theorem 1
