
Improving Efficiency of Parallel Across the Method Spectral Deferred Corrections

Gayatri Čaklović, Thibaut Lunet, Sebastian Götschel, Daniel Ruprecht

TL;DR

This work addresses the efficiency of parallelism across the method for spectral deferred corrections (SDC) by introducing optimized diagonal preconditioners. It develops an analytic framework to derive three coefficient families (MIN-SR-NS for non-stiff, MIN-SR-S for stiff, and MIN-SR-FLEX as a nonstationary variant) that keep convergence order high while expanding stability regions. The paper demonstrates, through Dahlquist tests and benchmark problems (Lorenz, Prothero-Robinson, Allen-Cahn), that these parallel SDC variants can outperform traditional parallel SDC approaches and certain Runge-Kutta schemes in both accuracy-per-work and practical cost, with a cost model aligned to wall-clock behavior. The results suggest that optimized diagonal SDC can deliver efficient, scalable time integration on modern parallel hardware, motivating further theoretical and implementation work, including proofs of A-stability for certain configurations and broader problem classes.

Abstract

Parallel-across-the-method time integration can provide small-scale parallelism when solving initial value problems. Spectral deferred corrections (SDC) with a diagonal sweeper, which is closely related to iterated Runge-Kutta methods proposed by Van der Houwen and Sommeijer, can use a number of threads equal to the number of quadrature nodes in the underlying collocation method. However, convergence speed, efficiency and stability depend critically on the coefficients used. Previous approaches have used numerical optimization to find good parameters. Instead, we propose an ansatz that allows us to find optimal parameters analytically. We show that the resulting parallel SDC methods provide stability domains and convergence order very similar to those of well-established serial SDC variants. Using a model for computational cost that assumes 80% efficiency of an implementation of parallel SDC, we show that our variants are competitive with serial SDC, previously published parallel SDC coefficients, as well as Picard iteration, explicit RKM-4 and an implicit fourth-order diagonally implicit Runge-Kutta method.
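
To make the role of the diagonal sweeper concrete, the following is a minimal sketch of preconditioned SDC for the Dahlquist test equation $u' = \lambda u$, $u(0) = u_0$. It is an illustration under stated assumptions, not the paper's implementation: the function names `collocation_matrix` and `diagonal_sdc` are hypothetical, the nodes are the three Radau-Right nodes on $[0, 1]$, and the diagonal coefficients `d` are set to the nodes themselves as a simple placeholder rather than the MIN-SR-NS, MIN-SR-S or MIN-SR-FLEX coefficients derived in the paper. Because the preconditioner is diagonal, each sweep reduces to $M$ independent scalar solves, one per node, which is what a parallel implementation could assign to separate threads.

```python
import numpy as np

# Three Radau-Right nodes on [0, 1] (the nodes of the 3-stage Radau IIA method).
RADAU3 = np.array([(4 - np.sqrt(6)) / 10, (4 + np.sqrt(6)) / 10, 1.0])


def collocation_matrix(nodes):
    """Return Q with Q[i, j] = integral from 0 to nodes[i] of the j-th Lagrange polynomial."""
    nodes = np.asarray(nodes, dtype=float)
    M = len(nodes)
    Q = np.zeros((M, M))
    for j in range(M):
        # Lagrange basis polynomial: roots at all other nodes, value 1 at nodes[j].
        lj = np.poly1d(np.delete(nodes, j), r=True)
        lj = lj / lj(nodes[j])
        # integ() uses integration constant 0, so the antiderivative vanishes at 0.
        Q[:, j] = lj.integ()(nodes)
    return Q


def diagonal_sdc(lam, dt, u0, nodes, d, K):
    """Run K sweeps of SDC with diagonal preconditioner diag(d) on u' = lam * u.

    Each sweep solves (1 - lam*dt*d[m]) * u_new[m] = u0 + lam*dt*((Q @ u)[m] - d[m]*u[m]),
    so the M node updates do not depend on each other within a sweep.
    """
    nodes = np.asarray(nodes, dtype=float)
    d = np.asarray(d, dtype=float)  # placeholder coefficients, NOT the paper's
    Q = collocation_matrix(nodes)
    u = np.full(len(nodes), u0, dtype=complex)  # initial guess: copy u0 to all nodes
    for _ in range(K):
        rhs = u0 + lam * dt * (Q @ u - d * u)
        u = rhs / (1.0 - lam * dt * d)  # M independent scalar solves
    return u


if __name__ == "__main__":
    lam, dt = -1.0, 0.1
    for K in (1, 2, 3, 4):
        u = diagonal_sdc(lam, dt, 1.0, RADAU3, RADAU3, K)
        print(K, abs(u[-1] - np.exp(lam * dt)))
```

With Radau-Right nodes the last node coincides with the end of the time step, so `u[-1]` is the step solution. Swapping in a different vector `d` changes only the convergence and stability behaviour of the sweeps, not the structure of the loop.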

Paper Structure

This paper contains 20 sections, 6 theorems, 63 equations, 17 figures, 1 table.

Key Result

Proposition 2.3

For $1 \leq n \leq M-1$, let $\boldsymbol{\tau}^{n} \in \mathbb{R}^M$ be the vector $\left( \tau_1^n, \ldots, \tau_M^n\right)$ and $\tau^n \in P_{M-1}$ the monomial $t \mapsto t^n$. Then

Figures (17)

  • Figure 1: Bijective mapping between $\mathbb{R}^M$ and $P_{M-1}$.
  • Figure 1: Convergence of SDC for the Dahlquist test equation with MIN-SR-NS preconditioner using $K=1,\ldots,M$ sweeps per time step. Dashed lines with slopes one to $K$ are shown as a guide to the eye. Left: $M=4$ Radau-Right nodes, right: $M=5$ Lobatto nodes.
  • Figure 1: Error vs. time step size for SDC for the Lorenz problem using $M=4$ Radau-Right nodes, for $K=1, \ldots, 5$ sweeps per time step. Left: MIN-SR-NS preconditioner, right: PIC preconditioner. Dashed gray lines with slopes from 1 to 6 are shown as a guide to the eye.
  • Figure 2: Convergence of SDC for the Dahlquist test equation with MIN-SR-S preconditioner using $K=1,\ldots,M$ sweeps per time step. Dashed lines with slopes one to $K$ are shown as a guide to the eye. Left: $M=4$ Radau-Right nodes, right: $M=5$ Lobatto nodes.
  • Figure 2: Error vs. cost for SDC for the Lorenz problem using $M=4$ Radau-Right nodes and $K=4$ sweeps. Left: comparison with classical SDC preconditioners, right: comparison with efficient time integration methods from the literature and SDC with VDHS preconditioner.
  • ...and 12 more figures

Theorems & Definitions (19)

  • Remark 2.1
  • Remark 2.2
  • Proposition 2.3
  • Proof 1
  • Remark 2.4
  • Proposition 2.5
  • Proof 2
  • Proposition 2.6
  • Proof 3
  • Proposition 2.7
  • ...and 9 more