A Representer Theorem for Hawkes Processes via Penalized Least Squares Minimization
Hideaki Kim, Tomoharu Iwata
TL;DR
This paper addresses nonparametric learning of triggering kernels in linear multivariate Hawkes processes by casting the problem into a reproducing kernel Hilbert space (RKHS) with a penalized least squares objective. The authors prove a novel representer theorem: each optimal triggering kernel $g_{ij}$ admits a linear expansion in transformed kernels $h_j$ evaluated at the data points, with all dual coefficients fixed to unity, where the $h_j$ are defined via a system of Fredholm integral equations. Using a degenerate kernel approximation based on random Fourier features, the authors derive closed-form expressions for the transformed kernels and the resulting estimators. This removes the need for discretization and reduces the computational cost to a single dominant matrix inversion whose size scales with $MU$, where $M$ is the feature map dimension and $U$ is the number of process dimensions. The method achieves competitive predictive accuracy on synthetic data while delivering substantial speedups over state-of-the-art kernel-based estimators, making it well-suited for large-scale event data. Two limitations are noted: the linear Hawkes setting does not guarantee non-negativity of the intensity, and the cubic scaling in $U$ may motivate iterative solvers for very high-dimensional problems.
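To make the degenerate kernel approximation concrete, the following is a minimal sketch of random Fourier features for a one-dimensional Gaussian kernel. It illustrates only the general technique, not the authors' implementation: the function name `rff_features`, the Gaussian kernel choice, and the values of `M` and `lengthscale` are assumptions introduced here.

```python
import numpy as np

def rff_features(x, omegas, phases):
    """Map 1-D inputs x (shape [n]) to M random Fourier features (shape [n, M])."""
    m = omegas.shape[0]
    return np.sqrt(2.0 / m) * np.cos(np.outer(x, omegas) + phases)

rng = np.random.default_rng(0)
M = 4000            # feature map dimension (hypothetical choice)
lengthscale = 0.5   # Gaussian kernel width (hypothetical choice)

# For k(s, t) = exp(-(s - t)^2 / (2 * lengthscale^2)), the spectral density
# is a zero-mean Gaussian with standard deviation 1 / lengthscale.
omegas = rng.normal(0.0, 1.0 / lengthscale, size=M)
phases = rng.uniform(0.0, 2.0 * np.pi, size=M)

s = np.linspace(0.0, 2.0, 50)
Phi = rff_features(s, omegas, phases)

# Degenerate (finite-rank) approximation: k(s, t) ~ phi(s)^T phi(t).
K_approx = Phi @ Phi.T
K_exact = np.exp(-(s[:, None] - s[None, :]) ** 2 / (2.0 * lengthscale ** 2))
max_err = np.max(np.abs(K_approx - K_exact))
```

Because the approximate kernel has finite rank $M$, integral equations involving it reduce to finite linear systems, which is what makes closed-form solutions of the kind described above plausible without discretizing time.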
Abstract
The representer theorem is a cornerstone of kernel methods, which aim to estimate latent functions in reproducing kernel Hilbert spaces (RKHSs) in a nonparametric manner. Its significance lies in converting inherently infinite-dimensional optimization problems into finite-dimensional ones over dual coefficients, thereby enabling practical and computationally tractable algorithms. In this paper, we address the problem of estimating the latent triggering kernels--functions that encode the interaction structure between events--for linear multivariate Hawkes processes based on observed event sequences within an RKHS framework. We show that, under the principle of penalized least squares minimization, a novel form of representer theorem emerges: a family of transformed kernels can be defined via a system of simultaneous integral equations, and the optimal estimator of each triggering kernel is expressed as a linear combination of these transformed kernels evaluated at the data points. Remarkably, the dual coefficients are all analytically fixed to unity, obviating the need to solve a costly optimization problem to obtain the dual coefficients. This leads to a highly efficient estimator capable of handling large-scale data more effectively than conventional nonparametric approaches. Empirical evaluations on synthetic datasets reveal that the proposed method attains competitive predictive accuracy while substantially improving computational efficiency over existing state-of-the-art kernel method-based estimators.
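For orientation, the penalized least squares principle named in the abstract is commonly written as follows for a $U$-dimensional linear Hawkes process observed on $[0, T]$. The notation (baseline intensities $\mu_i$, event times $t_n^{j}$ of dimension $j$, regularization constant $\gamma$, RKHS $\mathcal{H}$) and the exact form of the regularizer are assumptions made here for illustration and may differ from the paper's.

$$\lambda_i(t) = \mu_i + \sum_{j=1}^{U} \sum_{n:\, t_n^{j} < t} g_{ij}\bigl(t - t_n^{j}\bigr)$$

$$\min_{\{\mu_i\},\, \{g_{ij}\}} \; \sum_{i=1}^{U} \left[ \int_0^T \lambda_i(t)^2 \, dt \; - \; 2 \sum_{n} \lambda_i\bigl(t_n^{i}\bigr) \right] + \frac{1}{\gamma} \sum_{i=1}^{U} \sum_{j=1}^{U} \lVert g_{ij} \rVert_{\mathcal{H}}^2$$

Since each $\lambda_i$ is linear in the $g_{ij}$, this objective is quadratic in the triggering kernels, which is consistent with the abstract's claim that the estimator is available analytically rather than through iterative optimization of dual coefficients.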
