Two Variations on the XTrace Algorithm

Eric Hallman

TL;DR

Problem: estimate $\operatorname{tr}(\mathbf{A})$ when only matvec access is available. Approach: two variants of XTrace are studied—(i) variance reduction by averaging over random orthogonal rotations of test vectors, and (ii) a full Krylov-space, low-rank deflation via LeaveOneOutFull to achieve unbiased estimation. Findings: rotation averaging yields modest variance reductions; full Krylov-space deflation yields substantial improvements for spectra with certain eigenvalue structures, with gains depending on the spectrum; resampling offers limited additional benefit in many cases. Impact: provides guidance on when Krylov-based trace estimation outperforms XTrace, along with an efficient implementation path and avenues for extending to $f(\mathbf{A})$-traces in graphs or related settings.

Abstract

This paper studies two potential modifications of XTrace (Epperly et al., SIMAX 45(1):1-23, 2024), a randomized algorithm for estimating the trace of a matrix. The first is a variance reduction step that averages the output of XTrace over right-multiplications of the test vectors by random orthogonal matrices. The second is to form a low-rank approximation to the matrix using the whole Krylov space produced by the test vectors, rather than the output of a single power iteration as is used by XTrace. Experiments on synthetic data show that the first modification offers only slight benefits in practice, while the second can lead to significant improvements depending on the spectrum of the matrix.
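The leave-one-out idea behind this kind of estimator can be illustrated with a short NumPy sketch. It is a naive version written for clarity under stated assumptions, not the paper's algorithm: the function name `naive_leave_one_out_trace` and the `matvec` callback are hypothetical, Gaussian test vectors are assumed, and no attempt is made at the efficient implementation, so it uses many more matvecs than XTrace would. Each test vector is held out in turn, a basis is built from the remaining sketch columns, and the held-out vector supplies a stochastic estimate of the residual trace.

```python
import numpy as np

def naive_leave_one_out_trace(matvec, n, m, rng):
    """Naive leave-one-out trace estimate (illustrative sketch only).

    matvec(X) is assumed to return A @ X for an (n, k) array X.
    """
    omega = rng.standard_normal((n, m))   # Gaussian test vectors
    y = matvec(omega)                     # sketch Y = A @ Omega
    estimates = np.empty(m)
    for i in range(m):
        keep = np.delete(np.arange(m), i)
        # Orthonormal basis built without test vector i, so that
        # omega[:, i] is independent of q.
        q, _ = np.linalg.qr(y[:, keep])
        low_rank_part = np.trace(q.T @ matvec(q))       # tr(Q^T A Q)
        w = omega[:, i] - q @ (q.T @ omega[:, i])       # (I - Q Q^T) omega_i
        residual_part = w @ matvec(w[:, None])[:, 0]    # w^T A w
        estimates[i] = low_rank_part + residual_part
    return estimates.mean()
```

Each term in the average is unbiased because $\operatorname{tr}(\mathbf{A}) = \operatorname{tr}(\mathbf{Q}^T\mathbf{A}\mathbf{Q}) + \operatorname{tr}((\mathbf{I}-\mathbf{Q}\mathbf{Q}^T)\mathbf{A}(\mathbf{I}-\mathbf{Q}\mathbf{Q}^T))$ and the held-out Gaussian vector is independent of $\mathbf{Q}$. When the rank of $\mathbf{A}$ is at most $m-1$, the residual part vanishes and the estimate is exact up to rounding.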

Paper Structure

This paper contains 12 sections, 5 theorems, 27 equations, 11 figures, 5 algorithms.

Key Result

Lemma 1

If $\mathbf{\Omega}\in \mathbb{R}^{N\times m}$ has i.i.d. $\mathcal{N}(0,1)$ entries, then its distribution equals that of a product of three independent terms $\mathbf{Q}\mathbf{U}\mathbf{R}$. Furthermore, conditioning on $\mathcal{R}(\mathbf{\Omega})$ is equivalent to conditioning on $\mathbf{Q}$.
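A small numerical check may help make the rotation $\mathbf{\Omega}\mapsto \mathbf{\Omega}\mathbf{U}$ concrete. The sketch below assumes that $\mathcal{R}(\mathbf{\Omega})$ denotes the column space of $\mathbf{\Omega}$ and that $\mathbf{U}$ is drawn from the Haar distribution on $m\times m$ orthogonal matrices. The helper names `haar_orthogonal` and `range_projector` are hypothetical. It samples $\mathbf{U}$ by the standard QR-with-sign-correction recipe and confirms that right-multiplication leaves the column space unchanged.

```python
import numpy as np

def haar_orthogonal(m, rng):
    """Draw an m x m orthogonal matrix from the Haar distribution."""
    g = rng.standard_normal((m, m))
    q, r = np.linalg.qr(g)
    # Rescale each column by the sign of the matching diagonal entry of R;
    # without this correction the QR output is not Haar distributed.
    return q * np.sign(np.diag(r))

def range_projector(x):
    """Orthogonal projector onto the column space of x (full column rank)."""
    q, _ = np.linalg.qr(x)
    return q @ q.T

rng = np.random.default_rng(1)
n, m = 50, 6
omega = rng.standard_normal((n, m))
u = haar_orthogonal(m, rng)
rotated = omega @ u

# The two projectors agree up to rounding: the column space is unchanged.
gap = np.linalg.norm(range_projector(omega) - range_projector(rotated))
```

Because the column space is unchanged, any quantity that depends on the test vectors only through their range is unaffected by the rotation. Averaging over $\mathbf{U}$ can therefore only alter the parts of an estimator that depend on the individual columns.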

Figures (11)

  • Figure 1: The output of XTrace is not invariant under arbitrary rotations $\mathbf{\Omega}\mapsto \mathbf{\Omega}\mathbf{U}$.
  • Four further figures (numbers lost in extraction), with panel (a) labeled flat, poly, flat, and poly in turn
  • ...and 6 more figures

Theorems & Definitions (9)

  • Lemma 1
  • Theorem 2
  • Proof 1
  • Lemma 3
  • Proof 2
  • Lemma 4
  • Proof 3
  • Theorem 5
  • Proof 4