Sublinear Time Low-Rank Approximation of Toeplitz Matrices

Cameron Musco, Kshiteej Sheth

TL;DR

The paper addresses the problem of computing a near-optimal low-rank approximation of a PSD Toeplitz matrix $T\in\mathbb{R}^{d\times d}$ given noisy query access to its entries, reading only a sublinear number of them. It introduces a robust algorithm that outputs the approximation in a compressed Fourier-based factored form $\widetilde{T}=F_S D F_S^*$, which allows the runtime to be sublinear in $d$ while achieving the error bound $\|T-\widetilde{T}\|_F \le C \max\{\|E\|_F,\|T-T_k\|_F\} + \delta\|T\|_F$. Core to the approach is a polynomial-time discrete-time off-grid sparse Fourier transform that recovers the dominant Fourier structure from time-domain samples, together with a heavy-light decomposition that controls the error from unrecovered frequencies. The method yields sublinear-time algorithms for Toeplitz low-rank approximation and for Toeplitz covariance estimation, improving on prior work that achieved sublinear query complexity but required exponential runtime. The results are relevant to covariance estimation and structured matrix approximation in signal processing and related domains.
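
To make the factored form $\widetilde{T}=F_S D F_S^*$ concrete, the following is a minimal NumPy sketch, not the paper's algorithm. It assumes $F_S$ has entries $e^{2\pi i f t}$ for $t=0,\dots,d-1$ and each frequency $f\in S$, and that $D$ is a nonnegative diagonal matrix. The helper names, frequencies, and weights are hypothetical values chosen for illustration. Entry $(i,j)$ of the product is $\sum_{f\in S} D_f e^{2\pi i f (i-j)}$, which depends only on $i-j$, so the matrix is Toeplitz and any single entry can be read from the factors in $O(|S|)$ time.

```python
import numpy as np

def fourier_matrix(freqs, d):
    """Return the d x |S| matrix with entries exp(2*pi*i*f*t), t = 0..d-1."""
    t = np.arange(d)[:, None]                      # time indices as a column
    f = np.asarray(freqs, dtype=float)[None, :]    # frequencies as a row
    return np.exp(2j * np.pi * t * f)

def dense_from_factors(freqs, weights, d):
    """Form F_S D F_S^* explicitly (for checking only; this is not sublinear)."""
    F = fourier_matrix(freqs, d)
    return (F * np.asarray(weights)) @ F.conj().T

def entry_from_factors(freqs, weights, i, j):
    """Read entry (i, j) from the factors in O(|S|) time, without forming the matrix."""
    f = np.asarray(freqs, dtype=float)
    return np.sum(np.asarray(weights) * np.exp(2j * np.pi * f * (i - j)))

# Illustrative (assumed) values: off-grid frequencies in +/- pairs with equal
# weights, so that the resulting matrix is real.
d = 64
freqs = np.array([0.1037, -0.1037, 0.2519, -0.2519, 0.0])
weights = np.array([2.0, 2.0, 0.5, 0.5, 1.0])
T = dense_from_factors(freqs, weights, d)

is_real = np.allclose(T.imag, 0.0, atol=1e-9)
is_toeplitz = np.allclose(T[1:, 1:], T[:-1, :-1])   # constant along diagonals
is_psd = np.linalg.eigvalsh(T.real).min() >= -1e-8
```

Storing only `freqs` and `weights` takes $O(|S|)$ space, which is why an output of this form can be written down in time sublinear in $d$.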

Abstract

We present a sublinear time algorithm for computing a near-optimal low-rank approximation to any positive semidefinite (PSD) Toeplitz matrix $T\in \mathbb{R}^{d\times d}$, given noisy access to its entries. In particular, given entrywise query access to $T+E$ for an arbitrary noise matrix $E\in \mathbb{R}^{d\times d}$, integer rank $k\leq d$, and error parameter $\delta>0$, our algorithm runs in time $\text{poly}(k,\log(d/\delta))$ and outputs (in factored form) a Toeplitz matrix $\widetilde{T} \in \mathbb{R}^{d \times d}$ with rank $\text{poly}(k,\log(d/\delta))$ satisfying, for some fixed constant $C$, \begin{equation*} \|T-\widetilde{T}\|_F \leq C \cdot \max\{\|E\|_F,\|T-T_k\|_F\} + \delta\cdot \|T\|_F. \end{equation*} Here $\|\cdot \|_F$ is the Frobenius norm and $T_k$ is the best (not necessarily Toeplitz) rank-$k$ approximation to $T$ in the Frobenius norm, given by projecting $T$ onto its top $k$ eigenvectors. Our result has the following applications. When $E = 0$, we obtain the first sublinear time near-relative-error low-rank approximation algorithm for PSD Toeplitz matrices, resolving the main open problem of Kapralov et al. SODA '23, whose algorithm had sublinear query complexity but exponential runtime. Our algorithm can also be applied to approximate the unknown Toeplitz covariance matrix of a multivariate Gaussian distribution, given sample access to this distribution, resolving an open question of Eldar et al. SODA '20. Our algorithm applies sparse Fourier transform techniques to recover a low-rank Toeplitz matrix using its Fourier structure. Our key technical contribution is the first polynomial time algorithm for discrete time off-grid sparse Fourier recovery, which may be of independent interest.
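
The quantities in the error bound can be computed directly for a small example. The sketch below is a dense $O(d^3)$ baseline for illustration, not the sublinear-time algorithm of the paper: it builds $T_k$ by projecting $T$ onto its top $k$ eigenvectors, as the abstract describes, and evaluates the right-hand side $C \cdot \max\{\|E\|_F,\|T-T_k\|_F\} + \delta \cdot \|T\|_F$. The helper names, the test matrix, and the values of `C` and `delta` are assumptions; the abstract only states that $C$ is some fixed constant.

```python
import numpy as np

def toeplitz_from_first_column(c):
    """Symmetric Toeplitz matrix with T[i, j] = c[|i - j|]."""
    c = np.asarray(c, dtype=float)
    idx = np.arange(len(c))
    return c[np.abs(idx[:, None] - idx[None, :])]

def best_rank_k(T, k):
    """Project a symmetric PSD matrix onto its top-k eigenvectors (requires k >= 1)."""
    vals, vecs = np.linalg.eigh(T)      # eigenvalues in ascending order
    top = vecs[:, -k:]                  # eigenvectors of the k largest eigenvalues
    return top @ (top.T @ T)

def guarantee_rhs(T, E, k, C, delta):
    """Right-hand side of the bound; C is a placeholder, not the paper's constant."""
    tail = np.linalg.norm(T - best_rank_k(T, k), "fro")
    return C * max(np.linalg.norm(E, "fro"), tail) + delta * np.linalg.norm(T, "fro")

# Assumed example: two cosines (rank 4 in total) plus a small multiple of the
# identity, which keeps the matrix PSD and Toeplitz but makes it full rank.
d, k = 50, 4
s = np.arange(d)
c = 3.0 * np.cos(2 * np.pi * 0.11 * s) + 1.0 * np.cos(2 * np.pi * 0.31 * s)
c[0] += 0.01
T = toeplitz_from_first_column(c)

tail = np.linalg.norm(T - best_rank_k(T, k), "fro")
rhs = guarantee_rhs(T, np.zeros((d, d)), k, C=10.0, delta=1e-3)
```

In this example the bottom $d-k$ eigenvalues all equal $0.01$, so the tail $\|T-T_k\|_F$ is $0.01\sqrt{d-k}$, small relative to $\|T\|_F$.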

Paper Structure

This paper contains 26 sections, 27 theorems, 119 equations, 5 algorithms.

Key Result

Theorem 1

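Let $T\in \mathbb{R}^{d\times d}$ be a PSD Toeplitz matrix, $E\in \mathbb{R}^{d\times d}$ be an arbitrary noise matrix, $\delta>0$ be an error parameter, and $k$ be an integer rank parameter. There exists an algorithm that, given query access to the entries of $T+E$, runs in $\text{poly}(k,\log(d/\delta))$ time and outputs (in factored form) a Toeplitz matrix $\widetilde{T}$ with rank $\text{poly}(k,\log(d/\delta))$ satisfying, for some fixed constant $C$, $\|T-\widetilde{T}\|_F \leq C \cdot \max\{\|E\|_F,\|T-T_k\|_F\} + \delta\cdot \|T\|_F$, where $T_k = \mathop{\mathrm{arg\,min}}\limits_{B: \mathop{\mathrm{rank}}\nolimits(B)\leq k}\|T-B\|_F$ is the best rank-$k$ approximation to $T$ in the Frobenius norm.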
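
The access model in Theorem 1 is entrywise queries to $T+E$. A minimal sketch of such an oracle is given below, assuming $T$ is symmetric and specified by its first column. The class name and interface are hypothetical, and the i.i.d. Gaussian noise is only one illustrative choice, since the theorem allows $E$ to be arbitrary. The oracle counts distinct queried entries, which is the resource a sublinear algorithm must keep far below $d^2$.

```python
import numpy as np

class NoisyToeplitzOracle:
    """Hypothetical entrywise query oracle for T + E that counts distinct queries."""

    def __init__(self, first_column, noise_std=0.0, seed=0):
        self.c = np.asarray(first_column, dtype=float)
        self.d = len(self.c)
        self.noise_std = noise_std
        self.seed = seed
        self._cache = {}    # queried entries, so repeated queries agree

    def query(self, i, j):
        """Return (T + E)[i, j]; the noise for each entry is fixed once drawn."""
        if (i, j) not in self._cache:
            # Seeding by (seed, i, j) makes E a fixed matrix that is never stored.
            rng = np.random.default_rng([self.seed, i, j])
            noise = self.noise_std * rng.standard_normal()
            self._cache[(i, j)] = self.c[abs(i - j)] + noise
        return self._cache[(i, j)]

    @property
    def num_queries(self):
        return len(self._cache)

# Assumed example: read a handful of entries of a 10^6 x 10^6 matrix.
d = 10**6
first_column = np.cos(2 * np.pi * 0.123 * np.arange(d))
oracle = NoisyToeplitzOracle(first_column, noise_std=0.01, seed=1)
samples = [oracle.query(i, 0) for i in (0, 17, 5000, 999_999)]
```

Only the four queried entries are ever materialized here, although the matrix itself has $10^{12}$ entries.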

Theorems & Definitions (73)

  • Theorem 1: Robust Sublinear Time Toeplitz Low-Rank Approximation
  • Theorem 2: Sublinear Time Toeplitz Low-Rank Approximation
  • Theorem 3: Sublinear Time Toeplitz Covariance Matrix Estimation
  • Definition 2.1: Fourier matrix
  • Theorem 4: Theorem 2 of Kapralov et al. (SODA '23)
  • Lemma 2.2: Approximate Frequency Recovery -- Informal
  • Definition 3.1: Discrete-time Fourier transform (DTFT)
  • Lemma 3.2: Parseval's identity
  • Definition 3.3: Wrap around distance
  • Lemma 3.4: Discrete version of Lemma 6.6 of Chen et al. (2016)
  • ...and 63 more