Sublinear Time Low-Rank Approximation of Hankel Matrices
Michael Kapralov, Cameron Musco, Kshiteej Sheth
TL;DR
This work proves that PSD Hankel matrices admit sublinear-time, structure-preserving low-rank approximations in the Frobenius norm with rank $O(\log n\log(1/\epsilon))$, matching the Beckermann–Townsend Frobenius guarantee up to constant factors. The key ideas combine a finite-dimensional analog of AAK theory with a log-scale bucket sparsification over Vandermonde nodes and universal ridge leverage score bounds for Vandermonde matrices, enabling a two-part decomposition $H=H_1+H_2$ and sublinear regression-based recovery. The authors also establish a near-tight lower bound on the approximate rank and show robustness to additive noise $E$, yielding $\|H-\widehat{H}\|_F \le C\|E\|_F + \epsilon\|H\|_F$. They develop a sublinear-time algorithm that recovers the approximation in a compressed Hankel form $VD'V^T+\widehat{H}'$, with $V$ Vandermonde and $D'$ diagonal, and demonstrate applications to polynomial basis transforms and Hankel covariance estimation. Together, the results give fast, structure-preserving low-rank approximation of PSD Hankel matrices without reading the full input.
Abstract
Hankel matrices are an important class of highly-structured matrices, arising across computational mathematics, engineering, and theoretical computer science. It is well known that positive semidefinite (PSD) Hankel matrices are always approximately low-rank. In particular, a celebrated result of Beckermann and Townsend shows that, for any PSD Hankel matrix $H \in \mathbb{R}^{n \times n}$ and any $\epsilon > 0$, letting $H_k$ be the best rank-$k$ approximation of $H$, $\|H-H_k\|_F \leq \epsilon\|H\|_F$ for $k = O(\log n \log(1/\epsilon))$. As such, PSD Hankel matrices are natural targets for low-rank approximation algorithms. We give the first such algorithm that runs in \emph{sublinear time}. In particular, we show how to compute, in $\operatorname{polylog}(n, 1/\epsilon)$ time, a factored representation of a rank-$O(\log n \log(1/\epsilon))$ Hankel matrix $\widehat{H}$ matching the error guarantee of Beckermann and Townsend up to constant factors. We further show that our algorithm is \emph{robust}: given input $H+E$ where $E \in \mathbb{R}^{n \times n}$ is an arbitrary non-Hankel noise matrix, we obtain error $\|H - \widehat{H}\|_F \leq O(\|E\|_F) + \epsilon\|H\|_F$. Towards this algorithmic result, our first contribution is a \emph{structure-preserving} existence result: we show that there exists a rank-$k$ \emph{Hankel} approximation to $H$ matching the error bound of Beckermann and Townsend. Our result can be interpreted as a finite-dimensional analog of the widely applicable AAK theorem, which shows that the optimal low-rank approximation of an infinite Hankel operator is itself Hankel. Armed with our existence result, and leveraging the well-known Vandermonde structure of Hankel matrices, we achieve our sublinear-time algorithm using a sampling-based approach that relies on universal ridge leverage score bounds for Vandermonde matrices.
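The approximate low-rank property quoted from Beckermann and Townsend can be checked numerically on a small example. The sketch below is an illustration only and is not the sublinear-time algorithm of the paper: it forms a dense PSD Hankel matrix (the Hilbert matrix, with entries $1/(i+j+1)$, chosen here as an assumed test case), computes a full SVD, and reports the smallest $k$ with $\|H-H_k\|_F \leq \epsilon\|H\|_F$. The function names `hilbert_hankel` and `approx_rank` are hypothetical helpers introduced for this example.

```python
import numpy as np


def hilbert_hankel(n):
    """Return the n x n Hilbert matrix, a standard PSD Hankel matrix.

    Entry (i, j) is 1 / (i + j + 1), which depends only on i + j (Hankel).
    """
    idx = np.arange(n)
    return 1.0 / (idx[:, None] + idx[None, :] + 1.0)


def approx_rank(H, eps):
    """Smallest k such that the best rank-k approximation H_k satisfies
    ||H - H_k||_F <= eps * ||H||_F, computed from a dense SVD."""
    s = np.linalg.svd(H, compute_uv=False)
    # tail[k] = sqrt(sum_{i >= k} s_i^2) is the Frobenius error of H_k.
    tail = np.sqrt(np.cumsum((s ** 2)[::-1])[::-1])
    tail = np.append(tail, 0.0)  # rank-n approximation has zero error
    # tail[0] equals ||H||_F; argmax returns the first index meeting the bound.
    return int(np.argmax(tail <= eps * tail[0]))


if __name__ == "__main__":
    n, eps = 200, 1e-6
    k = approx_rank(hilbert_hankel(n), eps)
    print(f"n = {n}, eps = {eps}: approximate rank k = {k}")
```

The dense SVD here costs time cubic in $n$ and reads every entry of $H$, which is exactly the cost the paper's sampling-based approach is designed to avoid; the example only makes the rank bound concrete.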
