Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes

Neil Lindquist, Piotr Luszczek, Jack Dongarra

TL;DR

This work extends Parker’s theoretical analysis to the generalized RBT, showing that, in exact arithmetic, Gaussian elimination with no pivoting succeeds with probability 1 after transforming a matrix with full-depth RBTs.

Abstract

Parker and Lê introduced random butterfly transforms (RBTs) as a preprocessing technique to replace pivoting in dense LU factorization. Unfortunately, their FFT-like recursive structure restricts the dimensions of the matrix. Furthermore, on multi-node systems, efficient management of the communication overheads restricts the matrix's distribution even more. To remove these limitations, we have generalized the RBT to arbitrary matrix sizes by truncating the dimensions of each layer in the transform. We expanded Parker's theoretical analysis to generalized RBT, specifically that in exact arithmetic, Gaussian elimination with no pivoting will succeed with probability 1 after transforming a matrix with full-depth RBTs. Furthermore, we experimentally show that these generalized transforms improve performance over Parker's formulation by up to 62% while retaining the ability to replace pivoting. This generalized RBT is available in the SLATE numerical software library.
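
To make the idea of replacing pivoting with a random transform concrete, here is a minimal NumPy sketch. It uses the classic power-of-two recursive butterfly, not the truncated generalization this paper introduces, and it is not SLATE's implementation. The function names (`butterfly`, `rbt`, `lu_nopiv`), the dense-matrix construction, and the `exp(r/10)` distribution for the random diagonals are assumptions chosen for illustration.

```python
import numpy as np

def butterfly(n, rng):
    """n x n butterfly (n even): (1/sqrt(2)) * [[R0, R1], [R0, -R1]]."""
    h = n // 2
    # Assumed distribution for the random diagonals: exp(r/10), r ~ U[-1/2, 1/2].
    r0 = np.diag(np.exp(rng.uniform(-0.5, 0.5, h) / 10))
    r1 = np.diag(np.exp(rng.uniform(-0.5, 0.5, h) / 10))
    return np.block([[r0, r1], [r0, -r1]]) / np.sqrt(2)

def rbt(n, depth, rng):
    """Depth-`depth` recursive butterfly; n must be divisible by 2**depth."""
    w = np.eye(n)
    for level in range(depth):
        size = n >> level  # block size halves at each layer
        layer = np.zeros((n, n))
        for start in range(0, n, size):
            layer[start:start + size, start:start + size] = butterfly(size, rng)
        w = layer @ w  # finer layers are applied on the left
    return w

def lu_nopiv(a):
    """In-place style LU without pivoting; returns L (unit, strict lower) and U packed."""
    a = np.array(a, dtype=float)  # work on a copy
    n = a.shape[0]
    for k in range(n - 1):
        if a[k, k] == 0.0:
            raise ZeroDivisionError(f"zero pivot at step {k}")
        a[k + 1:, k] /= a[k, k]
        a[k + 1:, k + 1:] -= np.outer(a[k + 1:, k], a[k, k + 1:])
    return a

rng = np.random.default_rng(0)
n = 8
a = np.fliplr(np.eye(n))       # exchange matrix: a[0, 0] == 0
b = np.arange(1.0, n + 1)

u = rbt(n, 3, rng)             # full depth: log2(8) = 3
v = rbt(n, 3, rng)
f = lu_nopiv(u.T @ a @ v)      # factor the transformed matrix U^T A V
lower = np.tril(f, -1) + np.eye(n)
upper = np.triu(f)
# Solve (U^T A V) y = U^T b, then recover x = V y.
y = np.linalg.solve(upper, np.linalg.solve(lower, u.T @ b))
x = v @ y
residual = np.linalg.norm(a @ x - b)
print(residual)
```

Calling `lu_nopiv(a)` directly on the exchange matrix raises at the first pivot, while the two-sided transform `u.T @ a @ v` factors without pivoting in this small example. A practical implementation would apply the butterflies layer by layer rather than forming dense matrices; the dense form is used here only to keep the sketch short.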

Paper Structure (12 sections, 3 theorems, 19 equations, 7 figures, 1 table)

This paper contains 12 sections, 3 theorems, 19 equations, 7 figures, 1 table.

Key Result

Lemma 1

For any $1 \leq k \leq n$, let ${\alpha, \gamma \in S^{n}_{k}}$ and let $c(n, \alpha, \gamma)$ be a constant, with $O_{\mu,\nu}^{\langle n\rangle}$ defined by eq:O-matrix-definition. Then, $c(n, \alpha, \gamma) = 0$ if and only if $\alpha\not\equiv\gamma\pmod{\mu+\nu}$. Otherwise, $2^{-|\alpha|/2} \leq |c(n, \alpha, \gamma)| \leq 1$.

Figures (7)

  • Figure 1: Data dependencies for multiplying the vector $[x_1^T, x_2^T, x_3^T]^T$ by an orthogonal butterfly matrix to produce $[y_1^T, y_2^T, y_3^T]^T$.
  • Figure 2: The structure of a depth-3, semi-Parker RBT.
  • Figure 3: Accuracy of the RBT solver without iterative refinement for various sizes of the circul matrix.
  • Figure 4: Accuracy of the RBT solver without iterative refinement for various sizes of the chebspec matrix.
  • Figure 5: Accuracy of the RBT solver without iterative refinement for various sizes of the fiedler matrix.
  • ...and 2 more figures

Theorems & Definitions (6)

  • Lemma 1: Cf. Parker's 2 parkerRandomButterflyTransformations1995
  • proof
  • Lemma 2: Cf. Parker's 3 parkerRandomButterflyTransformations1995
  • proof
  • Theorem 3: Cf. Parker's 4 parkerRandomButterflyTransformations1995
  • proof