Learning efficient and provably convergent splitting methods

L. M. Kreusser, H. E. Lockyer, E. H. Müller, P. Singh

TL;DR

This work proposes a framework for finding machine-learned splitting methods that are computationally efficient for large timesteps and have provable convergence and conservation guarantees in the small-timestep limit. It demonstrates numerically that the learned methods can be significantly more efficient than established methods for the Schrödinger equation if the computational budget is limited.

Abstract

Splitting methods are widely used for solving initial value problems (IVPs) due to their ability to simplify complicated evolutions into more manageable subproblems that can be solved efficiently and accurately. Traditionally, these methods are derived using analytic and algebraic techniques from numerical analysis, including truncated Taylor series and their Lie algebraic analogue, the Baker--Campbell--Hausdorff formula. These tools enable the development of high-order numerical methods that provide exceptional accuracy for small timesteps. Moreover, these methods often (nearly) conserve important physical invariants, such as mass, unitarity, and energy. However, in many practical applications the computational resources are limited. Thus, it is crucial to identify methods that achieve the best accuracy within a fixed computational budget, which might require taking relatively large timesteps. In this regime, high-order methods derived with traditional techniques often exhibit large errors, since they are only designed to be asymptotically optimal. Machine learning techniques offer a potential solution, since they can be trained to solve a given IVP efficiently with fewer computational resources. However, they are often purely data-driven, come with limited convergence guarantees in the small-timestep regime, and do not necessarily conserve physical invariants. In this work, we propose a framework for finding machine-learned splitting methods that are computationally efficient for large timesteps and have provable convergence and conservation guarantees in the small-timestep limit. We demonstrate numerically that the learned methods, which by construction converge quadratically in the timestep size, can be significantly more efficient than established methods for the Schrödinger equation if the computational budget is limited.
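To make the notions of splitting, unitarity and quadratic convergence concrete, the following is a minimal sketch of the classical Strang splitting, not of the learned methods. It assumes a one-dimensional Schrödinger equation of the form $i\,\partial_t u = -\partial_x^2 u + V(x)\,u$ on a periodic domain; the scaling of the equation, the potential $V(x)=\cos(2x)$, the initial condition and all function names are illustrative assumptions rather than the setup used in the paper.

```python
import numpy as np

# Assumed model problem (not taken from the paper):
#   i u_t = -u_xx + V(x) u  on the periodic domain [0, 2*pi).
N = 32
x = 2.0 * np.pi * np.arange(N) / N
k2 = np.fft.fftfreq(N, d=1.0 / N) ** 2   # squared integer wavenumbers
V = np.cos(2.0 * x)                      # hypothetical periodic two-well potential

u0 = np.exp(np.sin(x)).astype(complex)   # smooth, hypothetical initial condition
u0 /= np.linalg.norm(u0)


def strang_step(u, h):
    """One Strang step: half potential flow, full kinetic flow, half potential flow."""
    u = np.exp(-0.5j * h * V) * u                           # exact flow of V over h/2
    u = np.fft.ifft(np.exp(-1j * h * k2) * np.fft.fft(u))   # exact kinetic flow over h
    return np.exp(-0.5j * h * V) * u                        # exact flow of V over h/2


def solve(u_init, T, n_steps):
    """Integrate from time 0 to T with n_steps Strang steps of equal size."""
    h = T / n_steps
    u = u_init.copy()
    for _ in range(n_steps):
        u = strang_step(u, h)
    return u


T = 1.0
u_ref = solve(u0, T, 4096)               # fine-step reference solution
u_coarse = solve(u0, T, 64)
u_fine = solve(u0, T, 128)

err_coarse = np.linalg.norm(u_coarse - u_ref)
err_fine = np.linalg.norm(u_fine - u_ref)
order = np.log2(err_coarse / err_fine)   # observed convergence order

print("norm after integration:", np.linalg.norm(u_fine))
print("observed order:", order)
```

Each substep multiplies by a complex phase, either in physical or in Fourier space, so the discrete norm is preserved up to round-off for any timestep. For a second-order method, halving the timestep is expected to reduce the error by roughly a factor of four, so the observed order should be close to 2 for this smooth test case.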

Paper Structure

This paper contains 32 sections, 35 equations, 13 figures, 4 tables, 2 algorithms.

Figures (13)

  • Figure 1: Visualisation of the parameters $\alpha$ and $\beta$ for two symmetric and consistent splitting schemes with $K=5$ (left) and $K=4$ (right) stages. Note how symmetry constrains half the parameters, including the trailing $\beta_K$, which is symbolically zero. Consistency fixes two additional parameters to ensure that the $\alpha$ and the $\beta$ each sum to one, as in \ref{eq:cons}.
  • Figure 1: Plot of the loss function $\mathcal{L}(\gamma)$ in \ref{eq:transLossFnFinite} for the Schrödinger equation in \ref{eq:potKinSchroedingerPDE} with $T=10$ and $h=\frac{1}{7}$. A section of the one-dimensional manifold of fourth-order accurate methods can be seen in the lower right corner. To obtain this figure, the loss function was evaluated on the fixed validation set $\mathcal{B}=\mathcal{S}_\text{valid}$ with 200 members, on a uniform grid of values of $\gamma$ in the box $[-0.5,0.4]^3$. Lower loss values are plotted as darker and larger points.
  • Figure 1: Six randomly chosen initial conditions $u_0^{(j)}$ generated with Algorithm \ref{alg:training_data_generation}, with the modulus, real part and imaginary part of $u_0^{(j)}$ depicted in black, blue and red, respectively. In each of the plots, the double-well potential is also shown on a logarithmic scale in the background.
  • Figure 1: Training and validation loss (left) and evolution of the splitting parameters for the three learned methods Learn5A, Learn8A and Learn8B, where Adam was used for the stochastic optimisation.
  • Figure 2: Local environment of the validation loss visualised in Figure \ref{fig:lossLandscape} around the two minima $\gamma_{\mathrm{Strang}}$ and $\gamma_{\mathrm{valid}}$. The function is plotted in the three planes that are perpendicular to the eigenvectors belonging to the largest (left), middle (centre), and smallest (right) eigenvalues of the Hessian matrix at the local minima.
  • ...and 8 more figures
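
The parameter counting described in the first caption above can be sketched in code. The sketch assumes that each of the $K$ stages carries one $\alpha_k$ and one $\beta_k$, that symmetry means both sequences are palindromic once the trailing $\beta_K=0$ is dropped, and that consistency is enforced by solving for the middle entries. Which entries are treated as free, and the helper names `mirror_and_close` and `splitting_coefficients`, are hypothetical choices and need not match the parameterisation used in the paper.

```python
import numpy as np


def mirror_and_close(free, n):
    """Build a palindromic vector of length n that sums to one.

    `free` holds the leading independent entries; the middle entry (n odd)
    or the middle pair (n even) is fixed by the sum-to-one condition.
    This choice of free entries is an assumption made for illustration.
    """
    free = np.asarray(free, dtype=float)
    if free.size != (n + 1) // 2 - 1:
        raise ValueError("wrong number of free parameters")
    v = np.zeros(n)
    v[:free.size] = free                 # leading entries
    v[n - free.size:] = free[::-1]       # mirrored trailing entries
    remaining = 1.0 - 2.0 * free.sum()   # what consistency leaves for the middle
    if n % 2 == 1:
        v[n // 2] = remaining
    else:
        v[n // 2 - 1] = v[n // 2] = remaining / 2.0
    return v


def splitting_coefficients(free_alpha, free_beta, K):
    """Return (alpha, beta) for a symmetric, consistent K-stage scheme."""
    alpha = mirror_and_close(free_alpha, K)
    # beta_K is identically zero, so only the first K-1 entries are mirrored.
    beta = np.append(mirror_and_close(free_beta, K - 1), 0.0)
    return alpha, beta


# K = 2 has no free parameters and reproduces the Strang splitting.
alpha_strang, beta_strang = splitting_coefficients([], [], 2)

# K = 5 leaves three free parameters, K = 4 leaves two (values are arbitrary).
alpha5, beta5 = splitting_coefficients([0.1, 0.3], [0.2], 5)
alpha4, beta4 = splitting_coefficients([0.2], [0.3], 4)

print(alpha_strang, beta_strang)
print(alpha5, beta5)
print(alpha4, beta4)
```

Under these assumptions, a scheme with $K=5$ stages has three free parameters and one with $K=4$ has two, while $K=2$ has none and yields $\alpha=(\tfrac12,\tfrac12)$, $\beta=(1,0)$, the Strang splitting.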