Block subsampled randomized Hadamard transform for low-rank approximation on distributed architectures
Oleg Balabanov, Matthias Beaupere, Laura Grigori, Victor Lederer
TL;DR
This work introduces a block subsampled randomized Hadamard transform (block SRHT), an oblivious subspace embedding (OSE) designed for low-rank approximation on distributed architectures. The sketching matrix $\mathbf{\Omega}$ is assembled blockwise from SRHTs, and the authors prove that it satisfies the $(\varepsilon,\delta,d)$-OSE property with a number of rows comparable to that of the standard SRHT, while allowing an efficient, communication-lean application on distributed architectures. They analyze randomized low-rank methods (RSVD, Nyström, and single-view approximation) within a unified projection-based framework that relies solely on the OSE property, so the analysis carries over to the block SRHT. Numerical experiments on large symmetric positive definite (SPD) matrices and tall-and-skinny problems show that the block SRHT matches Gaussian embeddings in accuracy while offering speedups of up to about $2.5\times$ in practical distributed settings, with strong and weak scalability up to thousands of processors. The results indicate that the block SRHT delivers practical performance gains without sacrificing theoretical guarantees, making it well suited to large-scale, distributed numerical linear algebra.
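For reference, the $(\varepsilon,\delta,d)$-OSE property mentioned above can be stated as follows. This is one common formulation from the randomized linear algebra literature. The symbols $l$ (number of rows of $\mathbf{\Omega}$), $n$ (ambient dimension), and $V$ (the subspace) are notation assumed here, and the paper's exact statement may differ in details, for example by bounding inner products rather than norms.

```latex
A random matrix $\mathbf{\Omega} \in \mathbb{R}^{l \times n}$ is an
$(\varepsilon,\delta,d)$-OSE if, for every fixed $d$-dimensional subspace
$V \subseteq \mathbb{R}^{n}$,
\[
  \mathbb{P}\Big( \big|\, \|\mathbf{x}\|_2^2 - \|\mathbf{\Omega}\mathbf{x}\|_2^2 \,\big|
  \le \varepsilon \,\|\mathbf{x}\|_2^2
  \ \text{ for all } \mathbf{x} \in V \Big) \;\ge\; 1 - \delta .
\]
```

"Oblivious" means that the distribution of $\mathbf{\Omega}$ is fixed without any knowledge of $V$.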
Abstract
This article introduces a novel structured random matrix composed blockwise from subsampled randomized Hadamard transforms (SRHTs). The block SRHT is expected to outperform well-known dimension reduction maps, including the SRHT and Gaussian matrices, on distributed architectures where the number of cores is not too large relative to the dimension. We prove that a block SRHT with enough rows is an oblivious subspace embedding, i.e., with high probability it is an approximate isometry on an arbitrary low-dimensional subspace. Our estimate of the required number of rows is similar to that of the standard SRHT, which suggests that the two transforms should provide the same approximation accuracy when used in randomized algorithms. The block SRHT can be readily incorporated into randomized methods, for instance to compute a low-rank approximation of a large-scale matrix. For completeness, we revisit some common randomized approaches to this problem, such as the randomized singular value decomposition (RSVD) and the Nyström approximation, and discuss their accuracy and implementation on distributed architectures.
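To make the blockwise construction concrete, the following is a minimal serial sketch in Python with NumPy. It rests on assumptions that the abstract does not spell out: the $n$ rows of the input are split into $p$ equal blocks whose size $r = n/p$ is a power of two, every block uses the same random row sample, and each block gets its own random sign flips before and after the Hadamard transform, with scaling $\sqrt{r/l}$. The function names `fwht` and `block_srht_sketch` are hypothetical, and the paper's exact definition and scaling may differ.

```python
import numpy as np


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along axis 0.

    The length of axis 0 must be a power of two.
    """
    x = np.array(x, dtype=float, copy=True)
    shape = x.shape
    n = shape[0]
    h = 1
    while h < n:
        # View the rows as consecutive pairs of half-blocks of size h
        # and apply the butterfly (a + b, a - b) to each pair.
        y = x.reshape(n // (2 * h), 2, h, -1)
        a = y[:, 0] + y[:, 1]
        b = y[:, 0] - y[:, 1]
        x = np.stack([a, b], axis=1).reshape(shape)
        h *= 2
    return x


def block_srht_sketch(V, l, p, rng):
    """Apply an assumed block SRHT with l rows to V (n x d), using p row blocks."""
    n, d = V.shape
    if n % p != 0:
        raise ValueError("n must be divisible by p")
    r = n // p  # rows per block
    if r & (r - 1) != 0:
        raise ValueError("block size n / p must be a power of two")
    if l > r:
        raise ValueError("l must not exceed the block size")

    # Assumption: one row sample shared by all blocks.
    rows = rng.choice(r, size=l, replace=False)

    sketch = np.zeros((l, d))
    for i in range(p):
        # In a distributed run, block i would live on processor i.
        Vi = V[i * r:(i + 1) * r]
        signs_right = rng.choice([-1.0, 1.0], size=r)  # signs applied before H
        signs_left = rng.choice([-1.0, 1.0], size=l)   # signs applied after sampling
        HV = fwht(signs_right[:, None] * Vi) / np.sqrt(r)  # normalized Hadamard
        local = np.sqrt(r / l) * signs_left[:, None] * HV[rows]
        # The sum over blocks stands in for a reduction across processors.
        sketch += local
    return sketch
```

As a quick check, one can draw an orthonormal `Q` with `np.linalg.qr`, call `block_srht_sketch(Q, l, p, np.random.default_rng(0))`, and inspect the singular values of the result: values close to 1 indicate that the sketch acts as an approximate isometry on the range of `Q`. In this layout each block's contribution is computed from local data only, so the only communication a distributed version would need is the final sum of the $l \times d$ local sketches.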
