
A DEIM-CUR factorization with iterative SVDs

Perfect Y. Gidisu, Michiel E. Hochstenbach

TL;DR

This study compares one-round sampling with iterative subselection techniques for CUR factorizations and introduces new iterative subselection strategies based on iterative SVDs. The aim is to improve the approximation quality of the DEIM scheme by invoking it iteratively over several rounds.

Abstract

A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of one-round sampling and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by iteratively invoking it in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. Thus, we modify $A$ after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments, providing a comparison of one-round sampling and iterative subselection techniques, and demonstrating the improved approximation quality associated with using the latter.
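To make the difference between one-round sampling and iterative subselection concrete, here is a minimal NumPy sketch. It rests on several assumptions that are not taken from the paper: the function names (`deim`, `cur`, `deim_cur`, `iterative_deim_cur`) are hypothetical, the middle factor is taken as $U = C^+ A R^+$, the residual after each round is formed as $A - CUR$, the $k$ indices are split evenly across rounds, already selected indices are simply masked out, and a dense SVD stands in for the iterative SVD solvers the abstract mentions.

```python
import numpy as np

def deim(V, exclude=()):
    """DEIM index selection on the columns of V (n x k); returns k distinct indices.

    `exclude` masks indices picked in earlier rounds (an assumption of this sketch).
    """
    n, k = V.shape
    allowed = np.ones(n, dtype=bool)
    allowed[np.asarray(exclude, dtype=int)] = False
    idx = []
    for j in range(k):
        r = V[:, j].copy()
        if j > 0:
            # Interpolate column j at the chosen indices, then form the residual.
            c = np.linalg.solve(V[idx, :j], V[idx, j])
            r -= V[:, :j] @ c
        # Pick the largest residual entry among the indices still allowed.
        p = int(np.argmax(np.where(allowed, np.abs(r), -1.0)))
        idx.append(p)
        allowed[p] = False
    return idx

def cur(A, rows, cols):
    """CUR factors for given row and column indices, with U = pinv(C) A pinv(R)."""
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

def deim_cur(A, k):
    """One-round sampling: apply DEIM once to the k leading singular vectors of A."""
    W, _, Zt = np.linalg.svd(A, full_matrices=False)
    return deim(W[:, :k]), deim(Zt[:k].T)

def iterative_deim_cur(A, k, rounds):
    """Iterative subselection: a few indices per round, chosen from the residual."""
    if not 1 <= rounds <= k:
        raise ValueError("need 1 <= rounds <= k")
    rows, cols = [], []
    E = np.array(A, dtype=float)
    per_round = k // rounds
    for t in range(rounds):
        # The last round picks up whatever is left so that k indices are returned.
        b = per_round if t < rounds - 1 else k - len(rows)
        W, _, Zt = np.linalg.svd(E, full_matrices=False)
        rows += deim(W[:, :b], exclude=rows)
        cols += deim(Zt[:b].T, exclude=cols)
        # Remove what the current selection already captures (assumed update).
        C, U, R = cur(A, rows, cols)
        E = A - C @ U @ R
    return rows, cols
```

Calling `deim_cur(A, k)` and `iterative_deim_cur(A, k, rounds)` on the same matrix and comparing `np.linalg.norm(A - C @ U @ R)` for the two index sets gives a rough picture of the comparison described above; the sketch makes no claim about which variant wins on a given matrix.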

Paper Structure (9 sections, 2 theorems, 14 equations, 3 figures, 6 tables, 7 algorithms)


Key Result

Theorem 4.1

(Wedin's Theorem) Given $E\in\mathbb R^{m\times n}$, consider the SVD of $E$ (where the singular values are not necessarily nonincreasing), partitioned so that the singular subspaces of interest are the column spaces of $U_1$ and $V_1$. Let the inexact/approximate singular subspaces be the column spaces of $\widehat{U}_1$ and $\widehat{V}_1$ in a correspondingly partitioned approximate decomposition. Now let $\Phi$ be the matrix of canonical angles ...
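The canonical angles that the theorem refers to can be computed from orthonormal bases of the two subspaces: their cosines are the singular values of the product of the bases. The sketch below illustrates that standard fact only; the function name `canonical_angles` is hypothetical, and the bound from the theorem is not reproduced here because the statement above is truncated.

```python
import numpy as np

def canonical_angles(X, Y):
    """Canonical (principal) angles between range(X) and range(Y), in radians.

    X and Y are assumed to have full column rank; angles are returned in
    ascending order.
    """
    Qx, _ = np.linalg.qr(X)  # orthonormal basis of range(X)
    Qy, _ = np.linalg.qr(Y)  # orthonormal basis of range(Y)
    cosines = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))
```

In the setting of the theorem, `X` would hold an exact singular subspace basis such as $U_1$ and `Y` its approximation $\widehat{U}_1$; small angles indicate that the approximate subspace is close to the exact one.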

Figures (3)

  • Figure 1: Relative approximation errors for the various iterative subselection DEIM CUR approximation algorithms for $k = 30$. The right figure represents selecting a constant number of columns and rows per iteration; the left is the delta strategy. In all cases, increasing the number of rounds or delta does not lead to a monotonic decrease in the approximation errors.
  • Figure 2: Relative approximation errors as a function of $k$ for the various iterative subselection DEIM CUR approximation algorithms compared with some standard CUR approximation algorithms using real data sets.
  • Figure 3: Relative approximation errors for the various iterative subselection DEIM CUR approximation algorithms compared with one-round sampling schemes for a fixed rank $k$, with the number of rounds $t=(2, 3, 5, 6, 10)$ varied for the CADP-CX and CADP-CUR algorithms and $\delta=(0.4,0.5,0.6,0.7,0.8)$ varied for the DADP-CX and DADP-CUR methods.

Theorems & Definitions (4)

  • Remark 3.1
  • Remark 3.2
  • Theorem 4.1
  • Theorem 5.1