Dynamically accelerating the power iteration with momentum

Christian Austin, Sara Pollock, Yunrong Zhu

TL;DR

The paper addresses accelerating power and inverse power iterations without prior spectral information by introducing a dynamic momentum method that updates the momentum parameter $\beta_k$ at every iteration based on the Rayleigh quotient and two residuals, with no extra matrix-vector multiplies. It analyzes the convergence and stability of this approach via an augmented-matrix framework and proves asymptotic convergence to the dominant eigenpair, including acceleration in the symmetric case; it also reveals that the static optimal choice $\beta = \lambda_2^2/4$ yields a defective augmented matrix. Numerical experiments across multiple benchmark suites demonstrate that the dynamic method often outperforms both standard power iteration and static momentum, and it extends effectively to shifted inverse iterations. The work thereby provides a practical, low-cost accelerator for large-scale eigenvalue problems with broad applicability in computational science and machine learning.

Abstract

In this paper, we propose, analyze and demonstrate a dynamic momentum method to accelerate power and inverse power iterations with minimal computational overhead. The method can be applied to real diagonalizable matrices, is provably convergent with acceleration in the symmetric case, and does not require a priori spectral knowledge. We review and extend background results on previously developed static momentum accelerations for the power iteration through the connection between the momentum accelerated iteration and the standard power iteration applied to an augmented matrix. We show that the augmented matrix is defective for the optimal parameter choice. We then present our dynamic method which updates the momentum parameter at each iteration based on the Rayleigh quotient and two previous residuals. We present convergence and stability theory for the method by considering a power-like method consisting of multiplying an initial vector by a sequence of augmented matrices. We demonstrate the developed method on a number of benchmark problems, and see that it outperforms both the power iteration and often the static momentum acceleration with optimal parameter choice. Finally, we present and demonstrate an explicit extension of the algorithm to inverse power iterations.
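As a rough illustration of the static momentum acceleration that the abstract takes as its starting point, the sketch below runs the usual heavy-ball recurrence $x_{k+1} = A x_k - \beta x_{k-1}$ with normalization and compares it with the plain power iteration ($\beta = 0$). The function name `momentum_power`, the way the previous iterate is rescaled, the stopping tolerance, the starting vector and the test matrix are all assumptions made for this example; it is not one of the paper's algorithm listings, and the dynamic update of $\beta_k$ is not reproduced here.

```python
import numpy as np

def momentum_power(A, beta, x0, tol=1e-8, max_iter=5000):
    """Power iteration with a fixed momentum parameter (beta = 0 is plain power)."""
    x = x0 / np.linalg.norm(x0)
    x_prev = np.zeros_like(x)            # no momentum contribution on the first step
    h = 0.0
    for k in range(1, max_iter + 1):
        v = A @ x                        # the only matrix-vector product per iteration
        h = x @ v                        # Rayleigh quotient (x has unit norm)
        res = np.linalg.norm(v - h * x)  # eigen-residual of the current iterate
        if res < tol:
            return h, x, k
        u = v - beta * x_prev            # momentum step
        nrm = np.linalg.norm(u)
        x_prev = x / nrm                 # rescale both iterates by the same factor
        x = u / nrm
    return h, x, max_iter

# Hypothetical test problem: eigenvalues 100, 99, ..., 1, so lambda_2 = 99.
A = np.diag(np.arange(100.0, 0.0, -1.0))
x0 = np.ones(100)

lam_pow, _, it_pow = momentum_power(A, 0.0, x0)
lam_mom, _, it_mom = momentum_power(A, 99.0**2 / 4, x0)   # beta = lambda_2^2 / 4
print("power iterations:   ", it_pow)
print("momentum iterations:", it_mom)
```

The static run needs $\lambda_2$ in advance to set $\beta$, which is the a priori spectral knowledge the dynamic method is designed to avoid.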

Paper Structure

This paper contains 16 sections, 7 theorems, 50 equations, 6 figures, 4 tables, 5 algorithms.

Key Result

Proposition 2.1

Suppose $A$ satisfies assumption \ref{assume:l1diag}. Then the $2n$ (counting multiplicity) eigenvalues of $A_\beta$ are given by

$$\mu_{\lambda_\pm} = \frac{1}{2}\left(\lambda \pm \sqrt{\lambda^2 - 4\beta}\right),$$

for each eigenvalue $\lambda$ of $A$. In the case that $\lambda^2 - 4 \beta \ne 0$, the eigenvectors of $A_\beta$ corresponding to each eigenvalue $\mu = \mu_{\lambda_{\pm}}$ are given by

$$\psi = \begin{pmatrix} \mu \phi \\ \phi \end{pmatrix},$$

where $\phi$ is the eigenvector of $A$ corresponding to eigenvalue $\lambda$. In the case that $\beta = \lambda^2/4 > 0$, the ...
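The eigenvalue relation in the proposition can be checked numerically on a small example. The sketch below assumes the augmented matrix has the block form $A_\beta = \begin{pmatrix} A & -\beta I \\ I & 0 \end{pmatrix}$, which is the standard companion form for the recurrence $x_{k+1} = A x_k - \beta x_{k-1}$ but is not shown in this excerpt. Under that assumption, each eigenvalue $\lambda$ of $A$ contributes the two roots of $\mu^2 - \lambda\mu + \beta = 0$, with eigenvectors of the stacked form $(\mu\phi, \phi)$. The $2 \times 2$ matrix and the value of $\beta$ are arbitrary choices for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])      # symmetric, eigenvalues 1 and 3
beta = 0.5                                   # chosen so lambda^2 - 4*beta != 0
n = A.shape[0]
I = np.eye(n)

# Assumed block form of the augmented matrix (not given in the excerpt).
A_beta = np.block([[A, -beta * I], [I, np.zeros((n, n))]])

lam, Phi = np.linalg.eigh(A)                 # eigenpairs of A
disc = np.sqrt(lam.astype(complex) ** 2 - 4 * beta)
mu_formula = np.concatenate([(lam + disc) / 2, (lam - disc) / 2])

mu_numeric = np.linalg.eigvals(A_beta)

# Largest distance from a formula value to the nearest computed eigenvalue.
gap = max(np.min(np.abs(mu_numeric - m)) for m in mu_formula)

# Eigenvector check: psi = (mu * phi, phi) should satisfy A_beta psi = mu psi.
resid = 0.0
for j in range(n):
    for mu in ((lam[j] + disc[j]) / 2, (lam[j] - disc[j]) / 2):
        psi = np.concatenate([mu * Phi[:, j], Phi[:, j]])
        resid = max(resid, np.linalg.norm(A_beta @ psi - mu * psi))

print("eigenvalue mismatch:", gap)
print("eigenvector residual:", resid)
```

With $\lambda = 1$ the discriminant is negative, so the example also exercises the complex-conjugate pair of augmented eigenvalues.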

Figures (6)

  • Figure 1: A comparison of $\rho(r)$ vs. $r^p$ for $\rho(r) = r/(1+\sqrt{1-r^2})$, the rate given in \ref{eqn:cor-momrate}. Left: $\rho(r)$ compared with $r^p$, for $p = 1,2,3,4,6,10$. Right: a detail plot of $\rho(r)$ compared with $r^p$, for $p = 6,10,14,20$. The crossings between $\rho(r)$ and $r^p$ are marked in each plot.
  • Figure 2: Left: The ratio of eigenvalues $\mu_{\lambda_+}/\mu_{\lambda_1}$ and $\mu_{\lambda_-}/\mu_{\lambda_1}$ of the augmented matrix $A_\beta$ for $A = \mathop{\mathrm{diag}}\nolimits(10:-1:-9)$ with $\beta = 9^2/4$ (inner circle) and $\beta = 9.9^2/4$ (outer circle). Right: convergence of the eigenmodes $\psi_1, \psi_2, \psi_8$ and $\psi_{64}$ of the augmented matrix $A_\beta$ for $A = \mathop{\mathrm{diag}}\nolimits(100:-1:1)$ and $\beta = 99^2/4$. All the subdominant modes converge at the same rate, but with increasing oscillation.
  • Figure 3: Convergence of the residual by iteration count for the three matrices in test suite 1, using algorithm \ref{alg:pow}, DMPOW with 20, 100 and 500 preliminary iterations, algorithm \ref{alg:hbpow}, and algorithm \ref{alg:dymo}. Left: Matrix 1, $\mathop{\mathrm{diag}}\nolimits(1000:-1:1)$; center: Matrix 2, Kuu; right: Matrix 3, Muu.
  • Figure 4: Convergence of the residual by iteration count for the three matrices in test suite 4, using algorithm \ref{alg:pow}, algorithm \ref{alg:hbpow} with $\beta = \beta_{opt} = \lambda_2^2/4$, with $\beta = \min\{1.01\times\beta_{opt}, (3\lambda_1^2 + \lambda_2^2)/16\}$, and with $\beta = 0.99\times\beta_{opt}$, and algorithm \ref{alg:dymo}. Left: Matrix 8, Si5H12; center: Matrix 9, ss1; right: Matrix 10, thermomech_TC.
  • Figure 5: Behavior of $\beta_k$ with respect to $\beta_{opt}$ for representative examples from table \ref{tab:inv}, illustrating that $\beta_k$ stabilizes closer to $\beta_{opt}$ as $r \rightarrow 1$, in agreement with lemma \ref{lem:rstab}. Left: $\sigma = 1001$, for which $r = 0.5$. Center: $\sigma = 1016$, for which $r \approx 0.94$. Right: $\sigma = 1064$, for which $r \approx 0.98$.
  • ...and 1 more figures
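
The comparison in Figure 1 can be reproduced numerically. The sketch below evaluates $\rho(r) = r/(1+\sqrt{1-r^2})$ and the exponent $p$ for which $r^p = \rho(r)$, that is, the crossing point for a given $r$. It assumes that $r$ denotes the eigenvalue ratio governing the plain power iteration, so that $p$ can be read as the number of power steps matched by one momentum step; the helper names `rho` and `equivalent_power` are made up for this example, and the sample values of $r$ are the ones quoted in the Figure 5 caption plus $0.99$.

```python
import math

def rho(r):
    """Momentum rate rho(r) = r / (1 + sqrt(1 - r^2)) for 0 < r < 1."""
    return r / (1.0 + math.sqrt(1.0 - r * r))

def equivalent_power(r):
    """Exponent p with r**p == rho(r), i.e. where rho(r) crosses r^p."""
    return math.log(rho(r)) / math.log(r)

for r in (0.5, 0.94, 0.98, 0.99):
    print(f"r = {r:.2f}  rho = {rho(r):.4f}  p = {equivalent_power(r):.2f}")
```

The exponent grows as $r$ approaches 1, which is consistent with the larger values of $p$ shown in the detail panel of Figure 1.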

Theorems & Definitions (17)

  • Proposition 2.1
  • Corollary 2.2
  • proof
  • Corollary 2.3
  • Remark 2.4
  • proof
  • Remark 3.1
  • Lemma 3.2
  • proof
  • Lemma 3.3
  • ...and 7 more