Neural operators meet conjugate gradients: The FCG-NO method for efficient PDE solving
Alexander Rudikov, Vladimir Fanaskov, Ekaterina Muravleva, Yuri M. Laevsky, Ivan Oseledets
TL;DR
This work introduces FCG-NO, a hybrid approach that embeds a discretization-invariant neural operator as a nonlinear preconditioner in the flexible conjugate gradient method for solving elliptic PDEs. The method uses a Krylov-subspace-based training scheme and an energy-norm Notay loss to train a Fourier-based spectral neural operator, so that a preconditioner learned on low-resolution discretizations can be applied to high-resolution ones. Empirical results show that NO-based preconditioning outperforms classical preconditioners across grids, that training on Krylov residuals is essential for robust convergence, and that the Notay loss yields faster convergence than a conventional $L_2$ loss. The approach provides a principled way to combine the consistency of traditional solvers with the efficiency of neural surrogates, achieving discretization-invariant performance and cross-resolution generalization while retaining the convergence guarantees of FCG theory.
Abstract
Deep learning solvers for partial differential equations typically have limited accuracy. We propose to overcome this problem by using them as preconditioners. More specifically, we apply discretization-invariant neural operators to learn preconditioners for the flexible conjugate gradient method (FCG). The architecture, paired with a novel loss function and training scheme, allows for learning efficient preconditioners that can be used across different resolutions. On the theoretical side, FCG theory allows us to safely use nonlinear preconditioners that can be applied in $O(N)$ operations without constraining the form of the preconditioner matrix. To justify the components of the learning scheme (the loss function and the way training data is collected), we perform several ablation studies. Numerical results indicate that our approach compares favorably with classical preconditioners and allows preconditioners learned at a lower resolution to be reused on higher-resolution data.
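To make the role of the preconditioner concrete, the following is a minimal NumPy sketch of a truncated flexible conjugate gradient iteration in which the preconditioner is an arbitrary, possibly nonlinear, function of the residual. It is an illustration under stated assumptions, not the authors' implementation: the names `fcg`, `toy_precond`, and `m_max`, the small shifted 1D Laplacian test matrix, and the `tanh`-based stand-in for a trained neural operator are all hypothetical choices made for this example.

```python
import numpy as np

def fcg(A, b, precond, tol=1e-8, max_iter=1000, m_max=1):
    """Truncated flexible CG sketch for a symmetric positive definite A.

    precond: callable r -> u; it may be nonlinear (e.g. a neural operator).
    m_max:   number of previous search directions kept for A-orthogonalization.
    """
    x = np.zeros_like(b)
    r = b - A @ x
    b_norm = np.linalg.norm(b)
    dirs = []  # stored tuples (p_k, A p_k, p_k^T A p_k)
    for it in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        u = precond(r)  # preconditioned residual
        p = u.copy()
        # Explicitly A-orthogonalize u against the stored directions.
        for p_k, Ap_k, pAp_k in dirs:
            p -= (u @ Ap_k) / pAp_k * p_k
        Ap = A @ p
        pAp = p @ Ap
        alpha = (p @ r) / pAp  # exact line search in the energy norm
        x += alpha * p
        r -= alpha * Ap
        dirs.append((p, Ap, pAp))
        dirs = dirs[-m_max:]  # truncation: keep only the last m_max directions
    return x, max_iter

def toy_precond(r):
    # Hypothetical nonlinear stand-in for a learned preconditioner.
    return r + 0.1 * np.tanh(r)

# Toy SPD system: 1D Laplacian stencil plus a shift (an assumption for this demo).
n = 64
A = 2.5 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rng = np.random.default_rng(0)
b = rng.standard_normal(n)

x, iters = fcg(A, b, toy_precond)
print(iters, np.linalg.norm(b - A @ x) / np.linalg.norm(b))
```

Because each search direction is orthogonalized explicitly against the stored ones, the iteration does not rely on the preconditioner being a fixed linear operator, which is the property that lets a learned nonlinear map be used in its place.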
