Mixed precision thin SVD algorithms based on the Gram matrix

Erin Carson; Yuxin Ma; Meiyue Shao

Abstract

In this work, we present a mixed precision algorithm that leverages the Gram matrix and Jacobi methods to compute the singular value decomposition (SVD) of tall-and-skinny matrices. By constructing the Gram matrix in higher precision and coupling it with a Jacobi algorithm, our theoretical analysis and numerical experiments both indicate that the singular values computed by this mixed precision thin SVD algorithm attain high relative accuracy. In practice, our mixed precision thin SVD algorithm yields speedups of over 10x on a single CPU and about 2x on distributed memory systems when compared with traditional thin SVD methods.
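
The idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the working precision is taken to be float32 and the higher precision float64, the function name `mp_thin_svd_sketch` is hypothetical, and `numpy.linalg.eigh` is swapped in for the Jacobi eigensolver. Because the abstract ties the high relative accuracy to the Jacobi coupling, this stand-in should not be expected to reproduce that property on badly scaled inputs.

```python
import numpy as np

def mp_thin_svd_sketch(A):
    """Thin SVD of a tall-and-skinny matrix via a higher-precision Gram matrix.

    Assumptions (not taken from the paper): float32 working precision,
    float64 higher precision, eigh as a stand-in for a Jacobi eigensolver,
    and A of full column rank.
    """
    A = np.asarray(A, dtype=np.float32)       # input in working precision
    A_hi = A.astype(np.float64)               # promote before forming the Gram matrix
    G = A_hi.T @ A_hi                         # n-by-n Gram matrix A^T A in float64

    # Stand-in eigensolver (assumption): symmetric eigendecomposition G = V diag(w) V^T.
    w, V = np.linalg.eigh(G)
    order = np.argsort(w)[::-1]               # sort eigenvalues in descending order
    w, V = w[order], V[:, order]

    sigma = np.sqrt(np.maximum(w, 0.0))       # singular values are sqrt of eigenvalues
    U = (A_hi @ V) / sigma                    # left singular vectors: U = A V Sigma^{-1}

    # Round the factors back to working precision.
    return U.astype(np.float32), sigma.astype(np.float32), V.astype(np.float32)

# Usage on a well-conditioned random tall-and-skinny matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 20)).astype(np.float32)
U, s, V = mp_thin_svd_sketch(A)

# Reference singular values computed directly in float64.
s_ref = np.linalg.svd(A.astype(np.float64), compute_uv=False)
```

The expensive step touches the tall matrix only through matrix products, while the eigenproblem is of size n-by-n, which is consistent with the performance motivation for Gram-based methods on tall-and-skinny inputs.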

Paper Structure (18 sections, 4 theorems, 71 equations, 4 figures, 3 tables, 2 algorithms)

Introduction
Mixed precision thin SVD algorithm
Eigensolver used in Line \ref{line:eigen} of Algorithm \ref{alg:mpthinsvd}
Accuracy
Accuracy of Algorithm \ref{alg:mpthinsvd}
Proof of rowwise backward stability
Proof of high relative accuracy of computed singular values
Proof of orthonormality of $\hat{U}$
Accuracy of the eigensolver in Algorithm \ref{alg:eigen}
Improved bound for the Cholesky QR algorithm
Numerical Experiments
Tests on CPU
Tests for accuracy
Tests for performance on one CPU
Tests on the distributed memory system
...and 3 more sections

Key Result

Theorem 1

Assume that $\hat{\sigma}_i$ with $i\leq n$ are the singular values of $A = BD$ computed by Algorithm \ref{alg:mpthinsvd}, where the columns of $B$ have unit norm and $D$ is a diagonal matrix containing the column norms of $A$. Also, assume that \eqref{eq:ATA}--\eqref{eq:AV-0} are satisfied with given $\varepsilon$… If it further holds that … then … and …
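
The column scaling $A = BD$ used in Theorem 1 can be illustrated numerically. The sketch below is only an illustration of the factorization; the matrix sizes and the column scales are hypothetical choices, and the bounds of the theorem are not reproduced. It builds a badly column-scaled $A$, extracts $D$ from the column norms, and compares the condition numbers of $A$ and $B$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 500, 6

# Hypothetical test matrix: Gaussian columns scaled by 1, 10, ..., 1e5.
scales = 10.0 ** np.arange(n)
A = rng.standard_normal((m, n)) * scales

d = np.linalg.norm(A, axis=0)   # column norms of A, i.e. the diagonal of D
B = A / d                       # each column of B now has unit 2-norm
D = np.diag(d)

cond_A = np.linalg.cond(A)      # large, driven by the column scaling
cond_B = np.linalg.cond(B)      # much smaller once the scaling is removed

print(f"cond(A) = {cond_A:.3e}, cond(B) = {cond_B:.3e}")
```

For this example the ill-conditioning of $A$ comes almost entirely from $D$, which is the situation in which separating the scaling from $B$ is informative.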

Figures (4)

Figure 1: Accuracy comparison among different algorithms for computing SVD of tall-and-skinny matrices. In this figure, the results for $80$ test matrices are presented, labeled as No.1 through No.80 along the $x$-axis.
Figure 2: Performance comparison on CPU among different algorithms for computing SVD of tall-and-skinny random matrices. For each ratio $m/n$, the five bars from left to right represent the result of mpthinSVD-Jacobi, mpthinSVD-Algo 2, Jacobi SVD (sGEJSV), QR SVD (sGESVD), and D&C SVD (sGESDD), respectively.
Figure 3: Performance comparison on a distributed memory system between different algorithms for computing the SVD of tall-and-skinny random matrices with a fixed number of rows, i.e., $m$.
Figure 4: Performance comparison on a distributed memory system between different algorithms for computing the SVD of tall-and-skinny random matrices with a fixed number of rows per node, i.e., $m/p$, where $p$ denotes the number of nodes.

Theorems & Definitions (10)

Theorem 1
proof
Remark 1
Theorem 2
proof
Lemma 1
proof
Remark 2
Theorem 3
proof
