Matrix-free stochastic calculation of operator norms without using adjoints

Jonas Bresch, Dirk A. Lorenz, Felix Schneppe, Maximilian Winkler

TL;DR

The paper introduces a matrix-free, adjoint-free stochastic algorithm to compute the operator norm ||A|| of a finite-dimensional linear map using only evaluations of A and minimal storage. By performing a Rayleigh-quotient ascent in random tangent directions with an exact line search, it proves almost-sure convergence to the global maximum and sublinear convergence for the associated eigenvector/eigenvalue equation. The method is demonstrated on synthetic and Radon-transform-inspired operators, including CT-related settings, and shown to remain effective when adjoint information is unavailable or unreliable. The work highlights practical benefits for adjoint-mismatch scenarios and opens avenues for extensions to complex spaces and multiple leading singular vectors.

Abstract

This paper considers the problem of computing the operator norm of a linear map between finite-dimensional Hilbert spaces when only evaluations of the linear map are available and under restrictive storage assumptions. We propose a stochastic method of random search type to maximize the Rayleigh quotient and employ an exact line search in the random search directions. Moreover, we show that the proposed algorithm converges to the global maximum (the operator norm) almost surely, and we establish sublinear convergence for the corresponding eigenvector and eigenvalue equation. Finally, we illustrate the performance of the method with numerical experiments.
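The idea in the abstract (random search directions in the tangent space plus an exact line search on the Rayleigh quotient) can be illustrated with a small sketch. This is not the paper's Algorithm; it is a minimal version under stated assumptions: the search curve is parametrized by an angle, $w(\theta) = \cos\theta\, v + \sin\theta\, x$ with $x \perp v$ and $\|x\| = 1$, so that $\|Aw(\theta)\|^2 = \tfrac{a+b}{2} + \tfrac{a-b}{2}\cos 2\theta + c \sin 2\theta$ with $a = \|Av\|^2$, $b = \|Ax\|^2$, $c = \langle Av, Ax\rangle$, which is maximized at $\theta = \tfrac12 \operatorname{atan2}(2c, a-b)$. The names `estimate_operator_norm`, `apply_A`, `iters` and `seed` are hypothetical. Only evaluations of $A$ are used, never the adjoint.

```python
import math
import random


def dot(u, w):
    """Euclidean inner product of two equally long sequences."""
    return sum(ui * wi for ui, wi in zip(u, w))


def estimate_operator_norm(apply_A, d, iters=2000, seed=0):
    """Adjoint-free random-search sketch (assumed angle parametrization).

    apply_A: callable mapping a list of length d to a list (evaluates A).
    Returns (estimate of ||A||, unit vector v, history of ||A v^k||).
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    n = math.sqrt(dot(v, v))
    v = [vi / n for vi in v]
    Av = apply_A(v)
    history = [math.sqrt(dot(Av, Av))]
    for _ in range(iters):
        # Random direction, projected onto the tangent space {v}^perp.
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        s = dot(x, v)
        x = [xi - s * vi for xi, vi in zip(x, v)]
        n = math.sqrt(dot(x, x))
        if n == 0.0:
            continue
        x = [xi / n for xi in x]
        Ax = apply_A(x)  # the only evaluation of A in this iteration
        a, b, c = dot(Av, Av), dot(Ax, Ax), dot(Av, Ax)
        # Exact maximizer of ||A(cos(t) v + sin(t) x)||^2 over the angle t.
        theta = 0.5 * math.atan2(2.0 * c, a - b)
        ct, st = math.cos(theta), math.sin(theta)
        v = [ct * vi + st * xi for vi, xi in zip(v, x)]
        n = math.sqrt(dot(v, v))  # equals 1 up to rounding
        v = [vi / n for vi in v]
        # By linearity, A v_new needs no further evaluation of A.
        Av = [(ct * p + st * q) / n for p, q in zip(Av, Ax)]
        history.append(math.sqrt(dot(Av, Av)))
    return history[-1], v, history


# Usage on a diagonal map with singular values 3, 2, 1, 0.5 (so ||A|| = 3).
sigma = [3.0, 2.0, 1.0, 0.5]
estimate, v, history = estimate_operator_norm(
    lambda u: [s * ui for s, ui in zip(sigma, u)], d=4
)
print(estimate)
```

In this sketch the sequence $\|Av^k\|$ is non-decreasing by construction, since $\theta = 0$ (staying at $v$) is always a candidate in the line search. Besides the current iterate, only the vectors $x$, $Av$ and $Ax$ are stored.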

Paper Structure

This paper contains 20 sections, 14 theorems, 66 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Proposition 2.4

For $h_k$ defined by eq:h and $a_{k}$ and $b_{k}$ defined by eq:def-ak-bk it holds that:

Figures (7)

  • Figure 1: Shape of $h_k$ from \ref{eq:h} for different cases of the signs of $a_k$ and $b_k$.
  • Figure 2: Visualization in 3D of the situation in the proof of Lemma \ref{lem:inf_probability}.
  • Figure 3: Illustration of Algorithm \ref{alg:mafno-orth-exact} in two dimensions. Almost every initialization (here, two of them are visualized, namely $\{v_1^0, v_2^0\}$, with their corresponding tangent spaces $\{v_1^0\}^\perp$ and $\{v_2^0\}^\perp$) shares a half circle with one of the two global maxima $v^{*}, -v^{*}$. Since the line search is exact, the method will arrive at $v^{*}$ after one step.
  • Figure 4: Visualization of the great circle in the $x_1$-$x_2$-plane of $\mathbb{S}^2$: First two components of the iterates $v^{k}$ (for $k=1,\dots,5000$) of Algorithm \ref{alg:mafno-orth-exact} for $A = \operatorname{diag}(1,1,0)$, only plotted when $|v^{k}_3| < \varepsilon$.
  • Figure 5: Results for 50 runs of Algorithm \ref{alg:mafno-orth-exact} for matrices $A = [1,0,\dots,0] \in \mathbb{R}^{1\times d}$. All plots are linear in the $x$-axis and logarithmic in the $y$-axis, indicating linear convergence of the quantities in this case.
  • ...and 2 more figures
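
The two-dimensional situation described in the caption of Figure 3 can be checked numerically. The sketch below is an illustration under assumptions, not the paper's code: it parametrizes the search by an angle over the full circle, $w(\theta) = \cos\theta\, v + \sin\theta\, x$, so one exact line search reaches the largest singular value from any starting angle. The function names `one_step_norm_2d` and `sigma_max_2x2` are hypothetical. The reference value uses the closed form $\sigma_{\max}^2 = \tfrac12\bigl(T + \sqrt{T^2 - 4D^2}\bigr)$ with $T = \|A\|_F^2$ and $D = \det A$.

```python
import math


def one_step_norm_2d(A, phi):
    """One exact line search in d = 2, started at v = (cos phi, sin phi)."""
    (p, q), (r, s) = A
    v = (math.cos(phi), math.sin(phi))
    x = (-v[1], v[0])  # the tangent direction at v (unique up to sign)
    Av = (p * v[0] + q * v[1], r * v[0] + s * v[1])
    Ax = (p * x[0] + q * x[1], r * x[0] + s * x[1])
    a = Av[0] ** 2 + Av[1] ** 2
    b = Ax[0] ** 2 + Ax[1] ** 2
    c = Av[0] * Ax[0] + Av[1] * Ax[1]
    # Exact maximizer of ||A(cos(t) v + sin(t) x)||^2 over the angle t.
    theta = 0.5 * math.atan2(2.0 * c, a - b)
    w0 = math.cos(theta) * Av[0] + math.sin(theta) * Ax[0]
    w1 = math.cos(theta) * Av[1] + math.sin(theta) * Ax[1]
    return math.hypot(w0, w1)


def sigma_max_2x2(A):
    """Largest singular value of a real 2x2 matrix via its closed form."""
    (p, q), (r, s) = A
    T = p * p + q * q + r * r + s * s  # squared Frobenius norm
    D = p * s - q * r                  # determinant
    return math.sqrt(0.5 * (T + math.sqrt(max(T * T - 4.0 * D * D, 0.0))))


A = [[2.0, 1.0], [0.0, 1.0]]
for phi in (0.0, 0.7, 2.5):
    print(one_step_norm_2d(A, phi), sigma_max_2x2(A))
```

The two printed values in each row should agree up to rounding, independently of the starting angle `phi`.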

Theorems & Definitions (41)

  • Remark 2.1: Normalization of $v^{k}$ and $x^{k}$
  • Remark 2.2: Critical points
  • Remark 2.3: Distribution of the $x^{k}$
  • Proposition 2.4
  • proof
  • Remark 2.5: Convergence of the sequence $\|Av^{k}\|$
  • Remark 2.6: Algorithm \ref{alg:mafno-orth-exact} is a stochastic projected gradient method
  • Remark 2.7: Markov chain provided by $(v^k)_{k \in \mathbb{N}}$
  • Lemma 2.8
  • proof
  • ...and 31 more