Sublinear Time Low-Rank Approximation of Hankel Matrices

Michael Kapralov, Cameron Musco, Kshiteej Sheth

TL;DR

This work proves that PSD Hankel matrices admit sublinear-time, structure-preserving low-rank approximations in the Frobenius norm with rank $O(\log n\log(1/\epsilon))$, matching the Beckermann–Townsend Frobenius guarantee. The key ideas combine a finite-dimensional analog of AAK theory with a log-scale bucket sparsification over Vandermonde nodes and universal ridge-leverage bounds for Vandermonde matrices, enabling a two-part decomposition $H=H_1+H_2$ and sublinear regression-based recovery. The authors also establish a near-tight lower bound on the approximate rank and show robustness to additive noise $E$, yielding $\|H-\widehat{H}\|_F \le C\|E\|_F + \epsilon\|H\|_F$. They develop a sublinear-time algorithm that recovers $H$ in a compressed Hankel form $VD'V^T+\widehat{H}'$, with $V$ Vandermonde and $D'$ diagonal, and demonstrate applications to polynomial basis transforms and Hankel covariance estimation. Overall, the results offer a path to fast, accurate, structure-preserving low-rank approximations for Hankel matrices with broad computational implications.

Abstract

Hankel matrices are an important class of highly structured matrices, arising across computational mathematics, engineering, and theoretical computer science. It is well known that positive semidefinite (PSD) Hankel matrices are always approximately low-rank. In particular, a celebrated result of Beckermann and Townsend shows that, for any PSD Hankel matrix $H \in \mathbb{R}^{n \times n}$ and any $\epsilon > 0$, letting $H_k$ be the best rank-$k$ approximation of $H$, $\|H-H_k\|_F \leq \epsilon\|H\|_F$ for $k = O(\log n \log(1/\epsilon))$. As such, PSD Hankel matrices are natural targets for low-rank approximation algorithms. We give the first such algorithm that runs in sublinear time. In particular, we show how to compute, in $\mathrm{polylog}(n, 1/\epsilon)$ time, a factored representation of a rank-$O(\log n \log(1/\epsilon))$ Hankel matrix $\widehat{H}$ matching the error guarantee of Beckermann and Townsend up to constant factors. We further show that our algorithm is robust: given input $H+E$ where $E \in \mathbb{R}^{n \times n}$ is an arbitrary non-Hankel noise matrix, we obtain error $\|H - \widehat{H}\|_F \leq O(\|E\|_F) + \epsilon\|H\|_F$. Towards this algorithmic result, our first contribution is a structure-preserving existence result: we show that there exists a rank-$k$ Hankel approximation to $H$ matching the error bound of Beckermann and Townsend. Our result can be interpreted as a finite-dimensional analog of the widely applicable AAK theorem, which shows that the optimal low-rank approximation of an infinite Hankel operator is itself Hankel. Armed with our existence result, and leveraging the well-known Vandermonde structure of Hankel matrices, we achieve our sublinear time algorithm using a sampling-based approach that relies on universal ridge leverage score bounds for Vandermonde matrices.
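
The approximate low-rank property quoted above can be checked numerically on small instances. The sketch below is an illustration only, not the paper's algorithm: it builds a dense PSD Hankel matrix as $V D V^T$, with $V$ a real Vandermonde matrix and $D$ a positive diagonal, and then uses a full SVD (cubic time, not sublinear) to find the smallest $k$ with $\|H-H_k\|_F \le \epsilon\|H\|_F$. The function names, the choice of nodes in $[-1,1]$, and the random weights are assumptions made for this example.

```python
import numpy as np

def psd_hankel_from_nodes(nodes, weights, n):
    """Return the n x n matrix H = V diag(weights) V^T with V[i, k] = nodes[k]**i.

    Entry (i, j) equals sum_k weights[k] * nodes[k]**(i + j), so H is Hankel,
    and it is PSD whenever all weights are nonnegative.
    """
    nodes = np.asarray(nodes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # np.vander(..., increasing=True) has rows (1, x, x^2, ...); transpose to n x m.
    V = np.vander(nodes, N=n, increasing=True).T
    return (V * weights) @ V.T

def rank_for_relative_error(H, eps):
    """Smallest k such that ||H - H_k||_F <= eps * ||H||_F (dense SVD check)."""
    s = np.linalg.svd(H, compute_uv=False)
    # tail[k] = sqrt(sum_{i >= k} s_i^2) = ||H - H_k||_F for the best rank-k H_k.
    tail = np.sqrt(np.cumsum((s ** 2)[::-1])[::-1])
    hits = np.nonzero(tail <= eps * tail[0])[0]
    return int(hits[0]) if hits.size else len(s)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    nodes = rng.uniform(-1.0, 1.0, size=n)    # hypothetical Vandermonde nodes
    weights = rng.uniform(0.1, 1.0, size=n)   # positive weights keep H PSD
    H = psd_hankel_from_nodes(nodes, weights, n)
    for eps in (1e-2, 1e-4, 1e-6):
        print(eps, rank_for_relative_error(H, eps))
```

On instances like this, the reported $k$ should stay far below $n$, in line with the $O(\log n \log(1/\epsilon))$ bound; the constants hidden in that bound are not estimated here.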

Paper Structure

This paper contains 25 sections, 26 theorems, 160 equations, 2 figures, 1 algorithm.

Key Result

Theorem 1

Let $H \in \mathbb{R}^{n\times n}$ be a PSD Hankel matrix, $E\in \mathbb{R}^{n\times n}$ be an arbitrary noise matrix, and $\epsilon > 0$ be an error parameter. Given entrywise access to $H+E$, Algorithm 1 (alg:noisy_hankel_recovery) runs in $\mathrm{polylog}(n,1/\epsilon)$ time and retu

Figures (2)

  • Figure 1: Sparsification of buckets $B_1,B_2$ to exponentiated Chebyshev nodes $T_1,T_2$ in respective intervals.
  • Figure 2: Entrywise upper bounds on $E_{0,1}+E_{1,1}+E_{2,1}+\ldots$; the shaded and unshaded regions in each term are entrywise $1$ and $0$, respectively.

Theorems & Definitions (58)

  • Theorem 1: Sublinear Time Hankel Low-Rank Approximation
  • Theorem 2: Existence of Accurate Hankel Low-Rank Approximations
  • Theorem 3: Lower Bound on the Approximate Rank of PSD Hankel Matrices
  • Definition 2.1: Moment vector
  • Definition 2.2: Real Vandermonde matrix
  • Claim 2.3
  • Lemma 2.4: Fiedler Factorization
  • Lemma 2.5: Simplified, full version in \ref{sec:bucket_sparsification}
  • Lemma 2.6: Simplified, full version in \ref{sec:prelim_hankel}
  • Lemma 2.7: Simplified, full version in \ref{sec:hankel_spectral_lower_bd}
  • ...and 48 more