Construction of Hierarchically Semi-Separable matrix Representation using Adaptive Johnson-Lindenstrauss Sketching

Yotam Yaniv, Pieter Ghysels, Osman Asif Malik, Henry A. Boateng, Xiaoye S. Li

TL;DR

This work extends adaptive HSS compression from Gaussian sketches to the broader class of Johnson-Lindenstrauss (JL) sketching operators, enabling faster, partially matrix-free HSS construction for large dense matrices. It establishes Frobenius-norm concentration and range-finder bounds for general JL sketches, including the sparse Johnson-Lindenstrauss transform (SJLT) and the subsampled randomized Hadamard transform (SRHT), and implements these schemes in STRUMPACK, with attention to applying SJLT and SRHT efficiently. Empirical results show speedups over Gaussian sketching of up to 2.5× in serial and up to 35× in distributed settings, with modest compromises in accuracy. The generalized framework allows users to choose sketching operators with provable guarantees, yielding practical, scalable HSS compression for engineers and scientists.

Abstract

We extend the adaptive, partially matrix-free Hierarchically Semi-Separable (HSS) matrix construction algorithm of Gorman et al. [SIAM J. Sci. Comput. 41(5), 2019], which uses Gaussian sketching operators, to the broader class of Johnson-Lindenstrauss (JL) sketching operators. We develop theory that justifies this extension. In particular, we extend the earlier concentration bounds to all JL sketching operators and examine the resulting bound for specific classes of such operators, including the original Gaussian sketching operators, the subsampled randomized Hadamard transform (SRHT), and the sparse Johnson-Lindenstrauss transform (SJLT). We discuss implementation details for applying SJLT and SRHT efficiently. We then demonstrate experimentally that using SJLT or SRHT instead of Gaussian sketching operators yields speedups of up to 2.5x for the serial HSS construction in the STRUMPACK C++ library. Additionally, we discuss the implementation of a parallel distributed HSS construction that uses Gaussian or SJLT sketching operators, and observe a performance improvement of up to 35x with SJLT over Gaussian sketching operators. The generalized algorithm allows users to select their own JL sketching operators, with theoretical lower bounds on the size of the operators, which may lead to faster run times with similar HSS construction accuracy.
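To make the cost argument for sparse sketches concrete, the following is a minimal sketch of one common SJLT construction, assuming each column of the $d \times n$ operator holds exactly $s$ nonzeros equal to $\pm 1/\sqrt{s}$ in distinct rows. The function names `make_sjlt` and `apply_sjlt` and the sizes used are illustrative assumptions; this is not the STRUMPACK implementation and may differ from the variant analyzed in the paper.

```python
import math
import random


def make_sjlt(n, d, s, rng):
    """Return a sparse d-by-n sketch stored column by column (hypothetical helper).

    Column j is a list of (row, value) pairs with exactly s nonzeros,
    each equal to +1/sqrt(s) or -1/sqrt(s), placed in distinct rows.
    """
    scale = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(n):
        rows = rng.sample(range(d), s)  # s distinct target rows
        columns.append([(r, scale * rng.choice((-1.0, 1.0))) for r in rows])
    return columns


def apply_sjlt(columns, d, x):
    """Compute y = S x using s * n multiply-adds instead of d * n."""
    y = [0.0] * d
    for xj, col in zip(x, columns):
        for r, v in col:
            y[r] += v * xj
    return y


rng = random.Random(0)
n, d, s = 1000, 64, 4  # illustrative sizes, not taken from the paper
S = make_sjlt(n, d, s, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = apply_sjlt(S, d, x)
ratio = sum(v * v for v in y) / sum(v * v for v in x)
print(f"||Sx||^2 / ||x||^2 = {ratio:.3f}")
```

Every column has unit Euclidean norm by construction, so the squared norm of a sketched vector equals the original squared norm in expectation. A dense Gaussian operator of the same shape needs $d \cdot n$ multiply-adds per vector, against $s \cdot n$ here.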

Paper Structure (38 sections, 15 theorems, 91 equations, 15 figures, 6 tables)

This paper contains 38 sections, 15 theorems, 91 equations, 15 figures, 6 tables.

Key Result

Lemma 1

Given $\varepsilon \in (0,1)$, let $m$ and $d$ be positive integers such that $d \geq 4 (\varepsilon^2/2 - \varepsilon^3/3)^{-1} \log m$. For any set $P$ of $m$ points in $\mathbb{R}^{n}$ there exists $f: \mathbb{R}^{n} \rightarrow \mathbb{R}^{d}$ such that for all $u, v \in P$

$$(1-\varepsilon)\,\|u-v\|^2 \leq \|f(u)-f(v)\|^2 \leq (1+\varepsilon)\,\|u-v\|^2.$$
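As a worked example of the dimension bound, the sketch below evaluates $d$ for $\varepsilon = 0.5$ and $m = 8$ and then searches for a map that meets the usual conclusion of the Johnson-Lindenstrauss lemma, namely that squared pairwise distances are preserved to within a factor $1 \pm \varepsilon$. It assumes $\log$ is the natural logarithm and uses a scaled Gaussian matrix as the candidate map. The helper names, the ambient dimension $n = 300$, and the retry loop are illustrative choices, not part of the paper.

```python
import itertools
import math
import random


def jl_min_dim(eps, m):
    """Smallest integer d with d >= 4 * (eps^2/2 - eps^3/3)^(-1) * log(m)."""
    return math.ceil(4.0 * math.log(m) / (eps ** 2 / 2.0 - eps ** 3 / 3.0))


def gaussian_map(n, d, rng):
    """Dense d-by-n matrix with i.i.d. N(0, 1/d) entries (one candidate for f)."""
    sigma = 1.0 / math.sqrt(d)
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(d)]


def apply_map(mat, x):
    return [sum(a * b for a, b in zip(row, x)) for row in mat]


def sqdist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def worst_distortion(mat, pts):
    """Largest relative error in squared distance over all pairs of points."""
    images = [apply_map(mat, p) for p in pts]
    return max(
        abs(sqdist(images[i], images[j]) / sqdist(pts[i], pts[j]) - 1.0)
        for i, j in itertools.combinations(range(len(pts)), 2)
    )


rng = random.Random(1)
eps, m, n = 0.5, 8, 300  # illustrative values
d = jl_min_dim(eps, m)   # 4 * log(8) / (0.125 - 0.125/3) is about 99.8, so d = 100
points = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

# The lemma asserts existence only; a single random draw can fail,
# so draw again until the distortion bound holds.
for attempt in range(1, 51):
    f = gaussian_map(n, d, rng)
    worst = worst_distortion(f, points)
    if worst <= eps:
        break

print(f"d = {d}, attempts = {attempt}, worst relative distortion = {worst:.3f}")
```

The bound depends on the number of points $m$ and on $\varepsilon$ but not on the ambient dimension $n$, which is why the same $d$ would apply for any $n$ in this example.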

Figures (15)

  • Figure 1: (a) Illustration of a symmetric HSS matrix using $3$ levels. Diagonal blocks are partitioned recursively. Gray blocks denote the basis matrices. (b) Tree for the HSS matrix from (a), using topological ordering. All nodes except the root store $U_i$ (and $V_i$ for the non-symmetric case). Leaves store $D_i$, non-leaves $B_{ij}$ (and $B_{ji}$ for the non-symmetric case).
  • Figure 2: Serial HSS construction time and sketching time. Overall speedup compared to Gaussian sketching is shown at the top of each bar.
  • Figure 3: Oversampling ratios (the final $d$ divided by the HSS rank) for the largest test cases: covariance, impedance matrix (scattering wave), and frontal matrix. The quantum chemistry Toeplitz problem is omitted because its ranks are so small that no adaptation is required.
  • Figure 4: Covariance matrix HSS construction relative error and maximum off-diagonal ranks.
  • Figure 5: Quantum Chemistry Toeplitz matrix HSS construction relative error and maximum off-diagonal ranks.
  • ...and 10 more figures

Theorems & Definitions (34)

  • Remark 1
  • Remark 2
  • Lemma 1: Johnson-Lindenstrauss Lemma [johnson1984extensions]
  • Definition 1: JL Sketching Operator
  • Remark 3
  • Remark 4
  • Theorem 1
  • proof
  • Theorem 2: Theorem 5.2 in [avron2011RandomizedAlgorithms]
  • Theorem 3: Matrix version of result in [KaneNelson14]
  • ...and 24 more