Sublinear Time Low-Rank Approximation of Toeplitz Matrices
Cameron Musco, Kshiteej Sheth
TL;DR
The paper tackles the problem of computing a near-optimal low-rank approximation of PSD Toeplitz matrices from noisy, sublinear-access to entries. It introduces a robust sublinear-time algorithm that represents the output as a compressed Fourier-based form $\widetilde{T}=F_S D F_S^*$, enabling sublinear runtime in $d$ while achieving an error bound $\|T-\widetilde{T}\|_F \le C \max\{\|E\|_F,\|T-T_k\|_F\} + \delta\|T\|_F$. Core to the approach is a polynomial-time discrete-time off-grid sparse Fourier transform that recovers the dominant Fourier structure from time-domain samples, together with a heavy-light decomposition to control error from unrecovered frequencies. The method yields sublinear-time algorithms for Toeplitz low-rank approximation and covariance estimation, with favorable Frobenius-error guarantees and compression, improving over prior sublinear-query work that lacked sublinear-time practicality. This work advances theory and offers practical tools for covariance estimation and structured matrix approximations in signal processing and related domains.
Abstract
We present a sublinear time algorithm for computing a near optimal low-rank approximation to any positive semidefinite (PSD) Toeplitz matrix $T\in \mathbb{R}^{d\times d}$, given noisy access to its entries. In particular, given entrywise query access to $T+E$ for an arbitrary noise matrix $E\in \mathbb{R}^{d\times d}$, integer rank $k\leq d$, and error parameter $δ>0$, our algorithm runs in time $\text{poly}(k,\log(d/δ))$ and outputs (in factored form) a Toeplitz matrix $\widetilde{T} \in \mathbb{R}^{d \times d}$ with rank $\text{poly}(k,\log(d/δ))$ satisfying, for some fixed constant $C$, \begin{equation*} \|T-\widetilde{T}\|_F \leq C \cdot \max\{\|E\|_F,\|T-T_k\|_F\} + δ\cdot \|T\|_F. \end{equation*} Here $\|\cdot \|_F$ is the Frobenius norm and $T_k$ is the best (not necessarily Toeplitz) rank-$k$ approximation to $T$ in the Frobenius norm, given by projecting $T$ onto its top $k$ eigenvectors. Our result has the following applications. When $E = 0$, we obtain the first sublinear time near-relative-error low-rank approximation algorithm for PSD Toeplitz matrices, resolving the main open problem of Kapralov et al. SODA `23, whose algorithm had sublinear query complexity but exponential runtime. Our algorithm can also be applied to approximate the unknown Toeplitz covariance matrix of a multivariate Gaussian distribution, given sample access to this distribution, resolving an open question of Eldar et al. SODA `20. Our algorithm applies sparse Fourier transform techniques to recover a low-rank Toeplitz matrix using its Fourier structure. Our key technical contribution is the first polynomial time algorithm for \emph{discrete time off-grid} sparse Fourier recovery, which may be of independent interest.
