Subspace-constrained randomized coordinate descent for linear systems with good low-rank matrix approximations
Jackie Lok, Elizaveta Rebrova
TL;DR
This work introduces SC-RCD (subspace-constrained randomized coordinate descent), a memory-efficient solver for large positive semidefinite (PSD) linear systems. It uses a cheap low-rank Nyström approximation, computed via RPCholesky, to constrain the RCD updates to an affine subspace. The subspace constraint acts as an implicit preconditioner: convergence depends on the spectrum of the residual matrix $\mathbf{A}^{\circ}=\mathbf{A}-\mathbf{A}\langle\mathcal{S}\rangle$ rather than on the full spectrum of $\mathbf{A}$, so the method remains robust when the original system has large spectral outliers. The authors develop a general subspace-constrained sketch-and-project framework and prove linear convergence under suitable conditions, with concrete complexity bounds that scale favorably when the spectrum decays rapidly. Numerical experiments on synthetic PSD systems and kernel ridge regression problems demonstrate that SC-RCD can outperform standard RCD and competing solvers while using modest memory, illustrating its practical potential for large-scale, dense linear systems.
Abstract
The randomized coordinate descent (RCD) method is a classical algorithm with simple, lightweight iterations that is widely used for various optimization problems, including the solution of positive semidefinite linear systems. As a linear solver, RCD is particularly effective when the matrix is well-conditioned; however, its convergence rate deteriorates rapidly in the presence of large spectral outliers. In this paper, we introduce the subspace-constrained randomized coordinate descent (SC-RCD) method, in which the dynamics of RCD are restricted to an affine subspace corresponding to a column Nyström approximation, efficiently computed using the recently analyzed RPCholesky algorithm. We prove that SC-RCD converges at a rate that is unaffected by large spectral outliers, making it an effective and memory-efficient solver for large-scale, dense linear systems with rapidly decaying spectra, such as those encountered in kernel ridge regression. Experimental validation and comparisons with related solvers based on coordinate descent and the conjugate gradient method demonstrate the efficiency of SC-RCD. Our theoretical results are derived by developing a more general subspace-constrained framework for the sketch-and-project method. This framework, which may be of independent interest, generalizes popular algorithms such as randomized Kaczmarz and coordinate descent, and provides a flexible, implicit preconditioning strategy for a variety of iterative solvers.
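The idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the SC-RCD step is the projection, in the $\mathbf{A}$-norm, onto the solutions of one sampled equation $j$ while the equations indexed by the pivot set $\mathcal{S}$ stay satisfied, and that $j$ is sampled with probability proportional to the diagonal of $\mathbf{A}^{\circ}$. The function names `rpcholesky` and `sc_rcd`, the helper matrix `W`, and the synthetic test matrix with three large eigenvalues are all hypothetical choices made for this illustration.

```python
import numpy as np


def rpcholesky(A, k, rng):
    """Randomly pivoted partial Cholesky (sketch).

    Returns F (n x k) with F @ F.T equal to the column Nystrom
    approximation A<S>, the pivot set S, and d = diag(A - F @ F.T).
    """
    n = A.shape[0]
    F = np.zeros((n, k))
    d = np.diag(A).astype(float)  # diagonal of the current residual (a copy)
    S = []
    for i in range(k):
        s = rng.choice(n, p=d / d.sum())       # pivot sampled from the residual diagonal
        g = A[:, s] - F[:, :i] @ F[s, :i]      # column s of the current residual
        F[:, i] = g / np.sqrt(g[s])
        d = np.maximum(d - F[:, i] ** 2, 0.0)  # clip round-off negatives
        d[s] = 0.0                             # a pivot is never picked twice
        S.append(s)
    return F, np.array(S), d


def sc_rcd(A, b, k, iters, seed=0):
    """Coordinate descent restricted to {x : A[S, :] x = b[S]} (sketch)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    F, S, d = rpcholesky(A, k, rng)
    # W[:, j] = A[S, S]^{-1} A[S, j], using A[S, S] = F[S] F[S]^T
    # and A[S, j] = F[S] F[j]^T (the Nystrom approximation is exact on pivots).
    W = np.linalg.solve(F[S].T, F.T)
    # Initial iterate inside the affine subspace: only pivot entries are nonzero.
    x = np.zeros(n)
    x[S] = np.linalg.solve(A[np.ix_(S, S)], b[S])
    p = d / d.sum()  # pivots have probability zero
    for j in rng.choice(n, size=iters, p=p):
        alpha = (b[j] - A[j] @ x) / d[j]  # residual of equation j over (A°)_jj
        x[j] += alpha                     # plain RCD move on coordinate j ...
        x[S] -= alpha * W[:, j]           # ... corrected so that A[S, :] x = b[S] still holds
    return x, S


# Hypothetical test problem: PSD matrix with three large spectral outliers.
rng = np.random.default_rng(1)
n = 60
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate(([1e3, 5e2, 2e2], rng.uniform(1.0, 2.0, n - 3)))
A = (Q * eigs) @ Q.T
A = (A + A.T) / 2  # symmetrize against round-off
b = A @ rng.standard_normal(n)

x, S = sc_rcd(A, b, k=10, iters=20000)
rel_res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
print("relative residual:", rel_res)
```

Each iteration reads one row of $\mathbf{A}$ and touches only $1+|\mathcal{S}|$ entries of the iterate, so the extra storage is the $n\times|\mathcal{S}|$ factor. The sketch precomputes `W` for brevity; solving a small triangular system per step would avoid storing it.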
