Collect, Commit, Expand: Efficient CPQR-Based Column Selection for Extremely Wide Matrices

Robin Armstrong; Anil Damle

TL;DR

This paper introduces CCEQR, a deterministic CPQR-based method for selecting a small subset of columns from extremely wide matrices. By organizing the pivoting into collect-commit-expand cycles, it concentrates computational work on small sets of candidate and tracked columns and shifts most reflections to level-3 BLAS, while provably recovering the same column permutation as the Golub-Businger algorithm and achieving GB$(k)$ form. The approach significantly accelerates the column subset selection problem (CSSP) in applications where column-norm distributions are highly nonuniform, as demonstrated on spectral clustering and density functional theory problems, where it shows notable speedups over GEQP3 on appropriately structured inputs. The work also provides a formal equivalence proof, practical details for updating compact WY representations, and public code to support reproducibility and adoption for large-scale matrices with very many columns.

Abstract

Column-pivoted QR (CPQR) factorization is a computational primitive used in numerous applications that require selecting a small set of "representative" columns from a much larger matrix. These include applications in spectral clustering, model-order reduction, low-rank approximation, and computational quantum chemistry, where the matrix being factorized has a moderate number of rows but an extremely large number of columns. We describe a modification of the Golub-Businger algorithm which, for many matrices of this type, can perform CPQR-based column selection much more efficiently. This algorithm, which we call CCEQR, is based on a three-step "collect, commit, expand" strategy that limits the number of columns being manipulated, while also transferring more computational effort from level-2 BLAS to level-3 BLAS. Unlike most CPQR algorithms that exploit level-3 BLAS, CCEQR is deterministic, and provably recovers a column permutation equivalent to the one computed by the Golub-Businger algorithm. Tests on spectral clustering and Wannier basis localization problems demonstrate that on appropriately structured problems, CCEQR can significantly outperform GEQP3.
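
For readers unfamiliar with CPQR-based column selection, the following is a minimal pure-Python sketch of the baseline pivoting rule that the abstract refers to: at each step, pick the remaining column with the largest residual norm, then remove its direction from all other columns. This is not CCEQR and not the authors' code. The function name `greedy_column_select` and the list-of-rows input format are assumptions made for illustration, and the sketch uses Gram-Schmidt-style projection instead of Householder reflections, which selects the same pivots in exact arithmetic but is less robust numerically.

```python
import math


def greedy_column_select(A, k):
    """Pick up to k column indices by largest-residual-norm pivoting.

    A is a list of rows (an m x n matrix). Illustrative only: every
    column is touched at every step, which is the cost that methods
    for very wide matrices try to avoid.
    """
    m, n = len(A), len(A[0])
    # Work on copies of the columns so the input is left unmodified.
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    selected = []
    for _ in range(min(k, m, n)):
        # Squared residual norm of every column; selected ones are excluded.
        norms = [sum(x * x for x in c) for c in cols]
        for j in selected:
            norms[j] = -1.0
        pivot = max(range(n), key=lambda j: norms[j])
        if norms[pivot] <= 0.0:
            break  # remaining columns lie in the span of the selected ones
        selected.append(pivot)
        scale = math.sqrt(norms[pivot])
        q = [x / scale for x in cols[pivot]]
        # Project the pivot direction out of every unselected column.
        for j in range(n):
            if j in selected:
                continue
            dot = sum(qi * x for qi, x in zip(q, cols[j]))
            cols[j] = [x - dot * qi for x, qi in zip(cols[j], q)]
    return selected


if __name__ == "__main__":
    A = [[3.0, 0.0, 1.0, 0.1],
         [0.0, 2.0, 1.0, 0.1]]
    print(greedy_column_select(A, 2))
```

In the small example, the first and second columns are chosen because they have the largest norms before and after the first projection, respectively. The per-step pass over all n columns is the part that becomes expensive when n is extremely large.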
