Mixed precision thin SVD algorithms based on the Gram matrix

Erin Carson, Yuxin Ma, Meiyue Shao

Abstract

In this work, we present a mixed precision algorithm that leverages the Gram matrix and Jacobi methods to compute the singular value decomposition (SVD) of tall-and-skinny matrices. By constructing the Gram matrix in higher precision and coupling it with a Jacobi algorithm, our theoretical analysis and numerical experiments both indicate that the singular values computed by this mixed precision thin SVD algorithm attain high relative accuracy. In practice, our mixed precision thin SVD algorithm yields speedups of over 10x on a single CPU and about 2x on distributed memory systems when compared with traditional thin SVD methods.
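The idea summarized above can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm, only one plausible reading of the abstract under stated assumptions: float32 serves as the working precision and float64 as the higher precision, the Gram matrix $A^{T}A$ is formed in float64, a plain cyclic two-sided Jacobi eigensolver is applied to it, and the left singular vectors are recovered as $U = AV\Sigma^{-1}$ in working precision. The function names `jacobi_eigh` and `mp_thin_svd`, the tolerance, and the sweep limit are all hypothetical, and $A$ is assumed to have full column rank.

```python
import numpy as np


def jacobi_eigh(G, tol=1e-14, max_sweeps=30):
    """Cyclic two-sided Jacobi eigensolver for a symmetric matrix (illustrative)."""
    G = np.array(G, dtype=np.float64)  # work on a float64 copy
    n = G.shape[0]
    V = np.eye(n)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Skip entries that are already negligible relative to the diagonal.
                if abs(G[p, q]) <= tol * np.sqrt(abs(G[p, p] * G[q, q])):
                    continue
                rotated = True
                # Rotation (c, s) chosen so that the (p, q) entry becomes zero.
                zeta = (G[q, q] - G[p, p]) / (2.0 * G[p, q])
                if zeta == 0.0:
                    t = 1.0
                else:
                    t = np.sign(zeta) / (abs(zeta) + np.hypot(1.0, zeta))
                c = 1.0 / np.hypot(1.0, t)
                s = c * t
                J = np.array([[c, s], [-s, c]])
                G[:, [p, q]] = G[:, [p, q]] @ J      # G <- G J
                G[[p, q], :] = J.T @ G[[p, q], :]    # G <- J^T G
                V[:, [p, q]] = V[:, [p, q]] @ J      # accumulate eigenvectors
        if not rotated:
            break
    return np.diag(G).copy(), V


def mp_thin_svd(A):
    """Thin SVD of a tall-and-skinny matrix via a higher-precision Gram matrix.

    Assumptions (not taken from the paper): float32 working precision,
    float64 higher precision, and A of full column rank.
    """
    A = np.asarray(A, dtype=np.float32)
    A_hi = A.astype(np.float64)
    G = A_hi.T @ A_hi                        # Gram matrix in higher precision
    lam, V = jacobi_eigh(G)                  # G = V diag(lam) V^T
    order = np.argsort(lam)[::-1]            # sort in descending order
    lam, V = lam[order], V[:, order]
    sigma = np.sqrt(np.maximum(lam, 0.0)).astype(np.float32)
    V = V.astype(np.float32)
    U = (A @ V) / sigma                      # U = A V Sigma^{-1}, working precision
    return U, sigma, V
```

Calling `mp_thin_svd(A)` on an $m \times n$ array with $m \gg n$ returns `U` ($m \times n$), the singular values in descending order, and `V` ($n \times n$), so that `(U * sigma) @ V.T` approximates `A`. The Jacobi solver only ever touches the small $n \times n$ Gram matrix, which is why this kind of approach is attractive for tall-and-skinny inputs.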

Paper Structure (18 sections, 4 theorems, 71 equations, 4 figures, 3 tables, 2 algorithms)

Key Result

Theorem 1

Assume that $\hat{\sigma}_i$, $i\leq n$, are the singular values of $A = BD$ computed by Algorithm alg:mpthinsvd, where the columns of $B$ have unit norm and $D$ is a diagonal matrix containing the column norms of $A$. Also, assume that eq:ATA--eq:AV-0 are satisfied with given $\varepsi If it further holds that then and

Figures (4)

  • Figure 1: Accuracy comparison among different algorithms for computing SVD of tall-and-skinny matrices. In this figure, the results for $80$ test matrices are presented, labeled as No.1 through No.80 along the $x$-axis.
  • Figure 2: Performance comparison on CPU among different algorithms for computing SVD of tall-and-skinny random matrices. For each ratio $m/n$, the five bars from left to right represent the result of mpthinSVD-Jacobi, mpthinSVD-Algo 2, Jacobi SVD (sGEJSV), QR SVD (sGESVD), and D&C SVD (sGESDD), respectively.
  • Figure 3: Performance comparison on a distributed memory system among different algorithms for computing the SVD of tall-and-skinny random matrices with a fixed number of rows, i.e., $m$.
  • Figure 4: Performance comparison on a distributed memory system among different algorithms for computing the SVD of tall-and-skinny random matrices with a fixed number of rows per node, i.e., $m/p$, where $p$ denotes the number of nodes.

Theorems & Definitions (10)

  • Theorem 1
  • proof
  • Remark 1
  • Theorem 2
  • proof
  • Lemma 1
  • proof
  • Remark 2
  • Theorem 3
  • proof