A Numerical Analysis of Sketched Linear Squares Problems and Stopping Criteria for Iterative Solvers
Zhongxiao Jia, Xinyuan Wan
TL;DR
The paper analyzes sketched least squares (sLS) problems, which use subspace embeddings to accelerate the solution of least squares (LS) problems. It derives sharp bounds on the residual difference $\|r_{ls}-r_s\|$ in terms of the embedding distortion $\epsilon$ and establishes a backward-error interpretation that links the sketched and original problems through perturbations with $\|E\| \le \epsilon \|A\|$. It then develops two general-purpose stopping criteria for iterative solvers (LSQR/LSMR) that terminate at the earliest iteration when further work cannot improve the original LS solution within the randomized framework, supported by rigorous bounds on normal-equation residuals. Theoretical results are complemented by numerical experiments using SRHT, Gaussian, and sparse embeddings, showing reliable early termination and substantial computational savings without sacrificing attainable accuracy. The work advances the integration of randomization and perturbation analysis in classical iterative solvers and provides practical guidelines for robust, efficient computation in large-scale LS problems.
Abstract
Randomized subspace embedding methods have had a great impact on the solution of a linear least squares (LS) problem by reducing its row dimension, leading to a randomized or sketched LS (sLS) problem, and using the solution of the sLS problem as an approximate solution of the LS problem. This work presents a numerical analysis of the sLS problem, establishes a number of its theoretical properties, and shows their crucial roles in the most effective and efficient use of iterative solvers. We first establish a compact bound on the norm of the residual difference between the solutions of the LS and sLS problems, which is the first key result towards understanding the rationale of the sLS problem. Then, from the perspective of backward errors, we prove that the solution of the sLS problem is the solution of a certain perturbed LS problem with minimal backward error, and quantify how the embedding quality affects the residuals, solution errors, and the relative residual norms of the normal equations of the LS and sLS problems. These theoretical results enable us to propose novel and reliable general-purpose stopping criteria for iterative solvers for the sLS problem, which dynamically monitor stabilization patterns of iterative solvers for the LS problem itself and terminate them at the earliest iteration at which further iterations can no longer improve the solution of the LS problem. Numerical experiments justify the theoretical bounds and demonstrate that the new stopping criteria work reliably and result in a tremendous reduction in computational cost without sacrificing attainable accuracy.
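To make the quantities in the abstract concrete, the following is a minimal sketch of a sketch-and-solve comparison. It is not the authors' code or experimental setup: the dense Gaussian embedding, the problem sizes, the noise level, and the helper name `sketched_ls` are all assumptions chosen for illustration. It computes the LS and sLS solutions, the residual difference $\|r_{ls}-r_s\|$, and the normal-equation residual norms for both.

```python
import numpy as np

def sketched_ls(A, b, s, rng):
    """Solve min_x ||S(Ax - b)|| for a Gaussian sketch S with s rows.

    Hypothetical helper: a dense Gaussian embedding is only one possible
    choice, used here because it needs nothing beyond NumPy.
    """
    m = A.shape[0]
    S = rng.standard_normal((s, m)) / np.sqrt(s)  # scaled so E[S^T S] = I
    x_s, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x_s

# Illustrative sizes (assumed, not from the paper): m rows, n columns, s sketch rows.
rng = np.random.default_rng(0)
m, n, s = 2000, 20, 400
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n) + 0.1 * rng.standard_normal(m)

# Reference LS solution and sketched solution.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
x_s = sketched_ls(A, b, s, rng)

# Residuals of the ORIGINAL problem evaluated at both solutions.
r_ls = b - A @ x_ls
r_s = b - A @ x_s

res_diff = np.linalg.norm(r_ls - r_s)   # ||r_ls - r_s||
ne_ls = np.linalg.norm(A.T @ r_ls)      # normal-equation residual at x_ls (near zero)
ne_s = np.linalg.norm(A.T @ r_s)        # normal-equation residual at x_s

print(f"||r_ls|| = {np.linalg.norm(r_ls):.4e}")
print(f"||r_s||  = {np.linalg.norm(r_s):.4e}")
print(f"||r_ls - r_s|| = {res_diff:.4e}")
print(f"||A^T r_ls|| = {ne_ls:.4e}, ||A^T r_s|| = {ne_s:.4e}")
```

Because $r_{ls}$ is orthogonal to the range of $A$ while $r_s - r_{ls}$ lies in it, the printed values satisfy $\|r_s\|^2 = \|r_{ls}\|^2 + \|r_{ls}-r_s\|^2$ up to rounding, so the residual difference directly measures how much accuracy the sketch gives up on the original problem.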
