Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers

Vladislav Trifonov, Alexander Rudikov, Oleg Iliev, Yuri M. Laevsky, Ivan Oseledets, Ekaterina Muravleva

TL;DR

This work tackles the challenge of solving large SPD linear systems from parametric PDE discretizations by designing a neural preconditioner, PreCorrector, that learns corrections to classical ILU/IC preconditioners via a graph neural network. By targeting spectral improvements, particularly through a loss that emphasizes low-frequency components and is evaluated with a Hutchinson estimator, the approach aims to substantially reduce the condition number of $P^{-1}A$ and accelerate Conjugate Gradient convergence. The authors introduce an in-place ILU update mechanism, a graph-based correction framework $L(\theta)=L+\alpha\cdot\mathrm{GNN}(\theta,L)$, and a carefully constructed diffusion-coefficient dataset with a complexity metric to measure problem difficulty. Empirical results show that PreCorrector can outperform traditional preconditioners with the same sparsity and offer advantages over prior neural approaches, with competitive total time-to-solution and improved spectral behavior, while demonstrating transferability across grids and datasets. The work advances the practical use of neural methods to accelerate linear solvers for parametric PDEs and highlights a path forward for spectral-aware preconditioner design.
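
The Hutchinson estimator mentioned above rests on the identity $\|M\|_F^2 = \mathbb{E}_z\|Mz\|^2$ for random vectors $z$ with $\mathbb{E}[zz^\top]=I$. The following is a minimal sketch that checks this identity numerically. It assumes, for illustration only, a residual matrix of the form $M = LL^\top A^{-1} - I$, a small tridiagonal matrix as a stand-in for $A$, and a diagonal factor as a stand-in for $L$; the function and variable names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def hutchinson_frobenius_sq(matvec, n, num_samples, rng):
    """Monte Carlo estimate of ||M||_F^2 = E ||M z||^2 for Rademacher z."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=n)  # entries are +1 or -1, so E[z z^T] = I
        total += float(np.sum(matvec(z) ** 2))
    return total / num_samples

n = 50
# Toy SPD matrix (assumed stand-in for a discretized PDE operator).
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
# Toy lower-triangular factor (assumption): diagonal scaling, so L L^T = diag(A).
L = np.diag(np.sqrt(np.diag(A)))

def residual_matvec(z):
    # Applies M = L L^T A^{-1} - I to z with one linear solve, never forming M.
    return L @ (L.T @ np.linalg.solve(A, z)) - z

rng = np.random.default_rng(0)
estimate = hutchinson_frobenius_sq(residual_matvec, n, 1000, rng)

# Dense reference value, affordable only because the example is tiny.
M = L @ L.T @ np.linalg.inv(A) - np.eye(n)
exact = float(np.sum(M ** 2))
print(f"estimate={estimate:.3f}  exact={exact:.3f}")
```

The estimator only needs products of $M$ with vectors, which is why a loss of this kind can be evaluated without ever forming a dense inverse. With $A^{-1}$ inside $M$, directions belonging to small eigenvalues of $A$ are amplified, which is one way a loss can weight the low-frequency part of the spectrum.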

Abstract

Large linear systems are ubiquitous in modern computational science and engineering. The main recipe for solving them is the use of Krylov subspace iterative methods with well-designed preconditioners. Recently, GNNs have been shown to be a promising tool for designing preconditioners to reduce the overall computational cost of iterative methods by constructing them more efficiently than with classical linear algebra techniques. However, preconditioners designed with these approaches cannot outperform those designed with classical methods in terms of the number of iterations in CG. In our work, we recall well-established preconditioners from linear algebra and use them as a starting point for training the GNN to obtain preconditioners that reduce the condition number of the system more significantly than classical preconditioners. Numerical experiments show that our approach outperforms both classical and neural network-based methods for an important class of parametric partial differential equations. We also provide a heuristic justification for the loss function used and show that preconditioners obtained by learning with this loss function reduce the condition number in a more desirable way for CG.
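
To make the "number of iterations in CG" metric concrete, here is a minimal sketch of preconditioned CG that counts iterations with and without a preconditioner. It is built on assumptions chosen for brevity: a Jacobi (diagonal) preconditioner stands in for the IC/ILU factors discussed in the paper, and the test matrix is a diagonally scaled tridiagonal toy system, not the paper's diffusion dataset. All names (`pcg`, `apply_Minv`, and so on) are hypothetical.

```python
import numpy as np

def pcg(A, b, apply_Minv, rtol=1e-8, maxiter=5000):
    """Preconditioned CG; returns the approximate solution and iteration count."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_Minv(r)            # z = P^{-1} r
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, maxiter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= rtol * b_norm:
            return x, k
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

n = 200
# Well-conditioned SPD core with eigenvalues in (1, 5).
T = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
# Symmetric scaling by coefficients spanning four decades makes A ill-conditioned.
d = np.logspace(0, 4, n)
S = np.diag(np.sqrt(d))
A = S @ T @ S
b = A @ np.ones(n)               # exact solution is the all-ones vector

x_plain, it_plain = pcg(A, b, lambda r: r)            # no preconditioner
diag_A = np.diag(A).copy()
x_jac, it_jac = pcg(A, b, lambda r: r / diag_A)       # Jacobi: P = diag(A)
print(f"CG iterations: plain={it_plain}, Jacobi={it_jac}")
```

In this toy system the diagonal preconditioner undoes the scaling exactly, so the preconditioned operator has a condition number below 5 and the iteration count drops sharply. Real diffusion problems are not this forgiving, which is where stronger factorizations such as IC($0$), and learned corrections to them, become relevant.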

Paper Structure

This paper contains 32 sections, 10 equations, 12 figures, 12 tables, and 1 algorithm.

Figures (12)

  • Figure 1: In-place updating of the IC($0$) factor yields a better preconditioner.
  • Figure 2: Ablation study of the PreCorrector architecture. Top row: preconditioners constructed from IC($0$). Bottom row: preconditioners constructed from ICt($1$). PreCor w/ GNN: the PreCorrector architecture described in Section \ref{sec:learn_corrrection_ilu}. PreCor w/ MLP: the PreCorrector processor block is replaced by an MLP. PreCor w/ MLP, static diag: same as the MLP variant, but without updating the main diagonal. Lower is better.
  • Figure 3: Coefficient function $k(x) = \exp{(\phi(x))}$ for grid $128\times128$ with different variances.
  • Figure 4: Residuals vs CG iterations for grid $64\times64$. Lower is better.
  • Figure 5: Wall time vs CG iterations for grid $64\times64$. Lower is better.
  • ...and 7 more figures