Stochastic Zeroth-Order Method for Computing Generalized Rayleigh Quotients

Jonas Bresch, Oleh Melnyk, Martin Schoen, Gabriele Steidl

TL;DR

This work addresses computing the maximal generalized Rayleigh quotient without adjoint or inverse operations by introducing a stochastic zeroth-order Riemannian ascent on the $B$-weighted sphere $\mathbb{S}_B^{d-1}$. The algorithm uses a closed-form step size for the one-sample case and extends to multi-sample gradient estimators, enabling efficient, adjoint-free progress toward global maximizers. The authors provide almost-sure convergence guarantees and sublinear rates, with strong empirical results demonstrating robustness to ill-conditioned $B$ and favorable comparisons to existing zeroth-order and Gen-Oja methods. The framework unifies zeroth-order Riemannian optimization with practical eigenproblem solvers and extends to complex settings and Karhunen–Loève-type problems, highlighting practical impact for large-scale generalized eigenvalue computations without matrix inverses or transposes.

Abstract

The maximization of the (generalized) Rayleigh quotient is a central problem in numerical linear algebra. Conventional algorithms for its computation typically rely on matrix-adjoint products, making them sensitive to errors arising from adjoint mismatches. To address this issue, we introduce a stochastic zeroth-order Riemannian algorithm that maximizes the generalized Rayleigh quotient without requiring adjoint or matrix inverse computations. We provide theoretical convergence guarantees showing that the iterates converge to the set of global maximizers of the (generalized) Rayleigh quotient at a sublinear rate with probability one. Our theoretical results are supported by numerical experiments, which demonstrate the excellent performance of the proposed method compared to state-of-the-art algorithms.
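To make the idea of an adjoint-free, inverse-free ascent concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the quotient has the form $\langle v, Av\rangle / \langle v, Bv\rangle$ with symmetric $A$ and symmetric positive definite $B$, and that the $B$-weighted sphere is $\{v : \langle v, Bv\rangle = 1\}$. The function names (`matvec`, `rayleigh`, `retract`, `zeroth_order_ascent`), the fixed step size `tau`, the finite-difference width `h`, and the accept-only-if-not-worse guard are all illustrative choices; the paper's closed-form step size is not reproduced here. The $2 \times 2$ matrices are the ones read from the Figure 1 caption.

```python
import math
import random


def matvec(M, v):
    """Forward product M @ v for a list-of-rows matrix (no transpose, no inverse)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rayleigh(A, B, v):
    """Assumed quotient <v, Av> / <v, Bv>; invariant under rescaling of v."""
    return dot(v, matvec(A, v)) / dot(v, matvec(B, v))


def retract(B, v):
    """Rescale v so that <v, Bv> = 1 (assumed definition of the B-weighted sphere)."""
    scale = math.sqrt(dot(v, matvec(B, v)))
    return [x / scale for x in v]


def zeroth_order_ascent(A, B, v0, steps=5000, h=1e-4, tau=0.5, seed=0):
    """Hypothetical one-sample zeroth-order ascent using function values only."""
    rng = random.Random(seed)
    v = retract(B, v0)
    f_v = rayleigh(A, B, v)
    for _ in range(steps):
        # Draw one random Gaussian direction.
        x = [rng.gauss(0.0, 1.0) for _ in v]
        # Central finite difference of the quotient along x.
        f_plus = rayleigh(A, B, [vi + h * xi for vi, xi in zip(v, x)])
        f_minus = rayleigh(A, B, [vi - h * xi for vi, xi in zip(v, x)])
        slope = (f_plus - f_minus) / (2.0 * h)
        # Step along x proportionally to the estimated slope, then rescale
        # back onto the B-weighted sphere.
        cand = retract(B, [vi + tau * slope * xi for vi, xi in zip(v, x)])
        f_cand = rayleigh(A, B, cand)
        # Illustrative safeguard: keep the candidate only if it is not worse.
        if f_cand >= f_v:
            v, f_v = cand, f_cand
    return v, f_v


# Matrices as read from the Figure 1 caption (assumed 2x2 layout).
A = [[3.0, 1.0], [1.0, 2.0]]
B = [[2.0, 0.5], [0.5, 1.0]]
v_star, f_star = zeroth_order_ascent(A, B, v0=[1.0, 0.0])
```

For these two matrices, $\det(A - \lambda B) = 1.75\lambda^2 - 6\lambda + 5$ has roots $2$ and $10/7$, so the returned value `f_star` should approach $2$. Because the assumed quotient is scale-invariant, the random direction does not need to be projected onto the tangent space: its radial component contributes no slope.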

Paper Structure

This paper contains 18 sections, 21 theorems, 151 equations, 8 figures, 2 algorithms.

Key Result

Theorem 2.1

Let $f$ be defined by the Riemannian problem (problem_riemann) and let $L \ge 2 \|A^{\mathrm{H}}\|(1 + \kappa(B))$, where $\kappa(B) := \|B\|\,\|B^{-1}\|$ is the condition number of $B$. Then the sequence $(v^k)_{k=0}^{\infty}$ generated by Riemannian gradient ascent with step size $\tau_k = 1/L$ fulfills $\operatorname{grad}$ ...

Figures (8)

  • Figure 1: Visualization of the generalized Rayleigh quotient $r(A,B,v)$ for $v = (v_1,v_2)^\mathrm{T} \in \mathbb{R}^2\setminus\{0\}$ and $A = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}$. Its values on $\mathbb{S}_B^1$ are highlighted by the solid line. Left: $B = I_2$. Right: $B = \begin{pmatrix} 2 & 0.5 \\ 0.5 & 1 \end{pmatrix}$.
  • Figure 2: Convergence of our algorithm for different sizes $d\in \{10,50,100,500\}$ and $m \in \{1,10,100\}$.
  • Figure 3: Error estimation towards Riemannian gradient for $d=100$ and $m \in \{1,10,100\}$. Left: Comparison of $\vert b_k\vert^2$ (solid lines) with $\tfrac{1}{d-1}\|\operatorname{grad} f(v^k)\|^2$ (dashed lines). Right: Error $\|(d-1) x^{k} - \operatorname{grad} f(v^k)\|$.
  • Figure 4: Convergence of our algorithm for ill-conditioned $B$ with $\kappa(B) \approx 10^q$ for $q = 1,2,3$ (left to right) and $m \in\{1,10,100\}$ for a fixed number of $(2q-1)\cdot 1000$ iterations.
  • Figure 5: Runtime of our algorithm for $m \in \{1,10,100\}$ for the matrices from Figure \ref{fig:exp_2_time} for varying dimension $d$. The methods are stopped whenever $\textrm{RQE} < 10^{-2}$ or after $100\cdot d$ iterations.
  • ...and 3 more figures

Theorems & Definitions (45)

  • Remark 2.1: Numerical abscissa
  • Theorem 2.1
  • proof
  • Theorem 2.2
  • Theorem 3.1
  • proof
  • Remark 4.1
  • Lemma 4.1
  • Lemma 4.2
  • Remark 4.2
  • ...and 35 more