Scalable Binary CUR Low-Rank Approximation Algorithm
Bowen Su
TL;DR
The paper tackles scalable low-rank approximation for very large matrices by developing a Scalable Binary CUR algorithm that deterministically selects representative rows and columns in parallel. It combines a blockwise Adaptive Cross Approximation framework with a binary parallel selection mechanism to form CUR factors $(C,U,R)$ efficiently, achieving a per-iteration cost of $\mathcal{O}(r\,nm/b)$ and practical speedups on multi-core hardware. Empirical results on Hilbert and synthetic low-rank matrices show near-optimal reconstruction as the target rank $r$ grows, while scalability experiments on $16384\times16384$ matrices demonstrate substantial, though sublinear, speedups with increasing process counts due to parallel overhead. The approach offers a practical, deterministic route to accurate CUR-based low-rank approximations for large-scale data in applications requiring scalable matrix factorization.
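To make the CUR construction concrete, the following is a minimal sequential sketch of cross approximation with full pivoting, assuming NumPy. It is not the paper's blockwise or binary parallel selection scheme; the function names `hilbert`, `cross_cur`, and `rel_error` are hypothetical, and choosing $U$ as the pseudoinverse of the intersection block is one common option rather than necessarily the authors' choice.

```python
import numpy as np

def hilbert(n):
    """n x n Hilbert matrix with entries 1 / (i + j - 1), 1-based indices."""
    i = np.arange(1, n + 1)
    return 1.0 / (i[:, None] + i[None, :] - 1.0)

def cross_cur(A, r):
    """Pick up to r rows/columns by full pivoting on the residual (illustrative only)."""
    E = A.astype(float).copy()          # residual, updated after every pivot
    rows, cols = [], []
    for _ in range(r):
        # Largest residual entry in absolute value gives the next row and column.
        i, j = np.unravel_index(np.argmax(np.abs(E)), E.shape)
        pivot = E[i, j]
        if pivot == 0.0:                # residual vanished: A is reproduced exactly
            break
        rows.append(i)
        cols.append(j)
        # Rank-one update that zeroes row i and column j of the residual.
        E -= np.outer(E[:, j], E[i, :]) / pivot
    C = A[:, cols]                      # selected columns
    R = A[rows, :]                      # selected rows
    U = np.linalg.pinv(A[np.ix_(rows, cols)])  # assumption: U = pinv of intersection
    return C, U, R

def rel_error(A, C, U, R):
    """Relative Frobenius-norm reconstruction error of C @ U @ R."""
    return np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)

# Synthetic matrix of exact rank 4: a rank-4 CUR should reproduce it up to rounding.
rng = np.random.default_rng(0)
A_synth = rng.standard_normal((50, 4)) @ rng.standard_normal((4, 40))
C, U, R = cross_cur(A_synth, 4)
err_synth = rel_error(A_synth, C, U, R)
print("synthetic rank-4 error:", err_synth)

# Hilbert matrix: print the error for a few small target ranks.
H = hilbert(64)
for r in (2, 4, 6):
    print("Hilbert, r =", r, "error:", rel_error(H, *cross_cur(H, r)))
```

This sketch forms and updates the full residual, which costs $\mathcal{O}(nm)$ work per pivot and is affordable only for small matrices; a scalable variant would avoid touching the whole matrix on a single process.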
Abstract
This paper proposes a scalable binary CUR low-rank approximation algorithm that selects representative rows and columns in parallel within a deterministic framework. By employing a blockwise adaptive cross approximation strategy, the algorithm efficiently identifies dominant components in large-scale matrices, thereby reducing computational cost. Numerical experiments on $16{,}384 \times 16{,}384$ matrices demonstrate a roughly twelvefold speed-up, with execution time decreasing from $12.37$ seconds using $2$ processes to $1.02$ seconds using $64$ processes. Tests on Hilbert matrices and synthetic low-rank matrices of various sizes demonstrate near-optimal reconstruction accuracy.
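As a quick check on the reported timings, the relative speed-up and parallel efficiency can be worked out directly. The 2-process run is taken as the baseline here, which is an assumption made because no single-process time is quoted; $T_p$ denotes the execution time on $p$ processes.

$$
S = \frac{T_2}{T_{64}} = \frac{12.37}{1.02} \approx 12.1,
\qquad
E = \frac{S}{64/2} \approx \frac{12.1}{32} \approx 0.38 .
$$

A 32-fold increase in process count therefore yields about a 12-fold reduction in run time, which is substantial but sublinear scaling.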
