Superfast iterative refinement of low rank approximation of a matrix based on ALS method and random sampling
Qi Luan, Victor Y. Pan
TL;DR
This work addresses the problem of computing low-rank approximations (LRAs) of massive matrices at sublinear cost by combining ALS-based iterative refinement with random sampling. It develops both deterministic and randomized (superfast) iterative refinement frameworks that progressively improve an initial CUR LRA toward a near-optimal solution, with convergence guarantees expressed in terms of principal-angle distances and holding under specific spectral-gap assumptions. A key contribution is the integration of leverage-score-based sampling and compression of linear least squares problems (LLSPs) to produce a near-optimal CUR LRA with high probability, while preserving sparsity and structure. In numerical experiments, two or three iterations consistently and significantly improve crude initial LRAs of real-world inputs.
Abstract
A matrix algorithm runs superfast (that is, at sublinear cost) if it involves far fewer flops and memory cells than the input matrix has entries. Big Data are frequently represented by matrices of immense size that cannot be handled directly but can be approximated by low rank matrices, with which one can operate superfast. Superfast computation of a Low Rank Approximation (LRA) of a matrix, however, can be a challenge. Any superfast LRA algorithm fails miserably on worst case matrices. Fortunately, such matrices rarely appear in computational practice. As an important example, superfast Adaptive Cross-Approximation iterations have consistently output accurate LRAs of a large and important class of matrices during the decades of their worldwide application. Furthermore, they output CUR LRA, a particularly attractive form of LRA that is widely applied in data analysis. Adequate formal support for that empirical behavior, however, is still a challenge. Encouraged by the success of these iterations, we present a superfast randomized iterative refinement of a crude CUR LRA that combines random sampling with the Alternating Least Squares (ALS) method and thereby reduces the refinement to the recursive solution of generalized Linear Least Squares Problems (LLSPs). We prove that, with a high probability, our iterations converge monotonically to a near-optimal LRA of an input matrix under some specified assumptions on the matrix and on a crude approximate solution to an initial generalized LLSP. In our numerical tests, two or three iterations of our algorithm have consistently and significantly improved crude initial LRAs of real world inputs.
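The idea of refining a crude LRA by alternating sampled least squares solves can be illustrated with a minimal sketch. It is not the algorithm of the paper: it works with a plain factorization M ≈ XY rather than a CUR LRA, it samples rows by leverage scores with replacement, and the names `leverage_sample`, `sampled_als_refine`, and `rel_err`, the sample size `k`, and the synthetic test matrix are all assumptions made for illustration. Each half-step reads only k rows or k columns of M, so one iteration touches about k(m + n) entries instead of mn.

```python
import numpy as np

def leverage_sample(A, k, rng):
    """Draw k row indices of a tall matrix A (with replacement) with
    probabilities proportional to the row leverage scores, and return
    the indices together with the usual 1/sqrt(k*p) rescaling weights."""
    Q, _ = np.linalg.qr(A)                 # thin QR: Q has orthonormal columns
    p = np.sum(Q * Q, axis=1)              # leverage scores of the rows of A
    p = p / p.sum()
    idx = rng.choice(A.shape[0], size=k, replace=True, p=p)
    w = 1.0 / np.sqrt(k * p[idx])
    return idx, w

def sampled_als_refine(M, X, k, iters=3, seed=0):
    """Refine a crude left factor X (m x r) of M by alternating sampled
    least squares solves. Illustrative sketch only, not the paper's method."""
    rng = np.random.default_rng(seed)
    Y = None
    for _ in range(iters):
        # Fix X, solve for Y from k sampled (and rescaled) rows of M.
        I, w = leverage_sample(X, k, rng)
        Y = np.linalg.lstsq(w[:, None] * X[I, :],
                            w[:, None] * M[I, :], rcond=None)[0]
        # Fix Y, solve for X from k sampled (and rescaled) columns of M.
        J, w = leverage_sample(Y.T, k, rng)
        Xt = np.linalg.lstsq(w[:, None] * Y[:, J].T,
                             w[:, None] * M[:, J].T, rcond=None)[0]
        X = Xt.T
    return X, Y

def rel_err(M, X, Y):
    """Relative Frobenius error; reads all of M, so it is for checking only."""
    return np.linalg.norm(M - X @ Y) / np.linalg.norm(M)

# Synthetic input (an assumption): a rank-r matrix plus small noise.
rng = np.random.default_rng(1)
m, n, r, k = 600, 500, 5, 30
A = rng.standard_normal((m, r))
B = rng.standard_normal((r, n))
M = A @ B + 1e-3 * rng.standard_normal((m, n))

# Crude initial left factor: the true factor with a large perturbation.
X0 = A + 0.5 * rng.standard_normal((m, r))
Y0 = np.linalg.lstsq(X0, M, rcond=None)[0]   # full solve, for the baseline only
err0 = rel_err(M, X0, Y0)

X1, Y1 = sampled_als_refine(M, X0, k, iters=3)
err1 = rel_err(M, X1, Y1)
print(f"initial error {err0:.3e}, refined error {err1:.3e}")
```

Leverage-score sampling is used here because the TL;DR names it; uniform sampling would also run, but may behave worse when the rows of X have very uneven importance. The sketch carries none of the convergence guarantees stated in the abstract.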
