Scalability of the asynchronous discontinuous Galerkin method for compressible flow simulations

Shubham Kumar Goswami, Dapse Vidyesh, Konduri Aditya

Abstract

The scalability of time-dependent partial differential equation (PDE) solvers based on the discontinuous Galerkin (DG) method is increasingly limited by data communication and synchronization requirements across processing elements (PEs) at extreme scales. To address these challenges, asynchronous computing approaches that relax communication and synchronization at a mathematical level have been proposed. In particular, the asynchronous discontinuous Galerkin (ADG) method with asynchrony-tolerant (AT) fluxes has recently been shown to recover high-order accuracy under relaxed communication, supported by detailed analyses of its accuracy and stability. However, the scalability of this approach in modern large-scale parallel DG solvers has not yet been systematically investigated. In this paper, we address this gap by implementing the ADG method coupled with AT fluxes in the open-source finite element library deal.II. We employ a communication-avoiding algorithm (CAA) that reduces the frequency of inter-process communication while accommodating controlled delays in ghost value exchanges. We first demonstrate that applying standard numerical fluxes in this asynchronous setting degrades the solution to first-order accuracy, irrespective of the polynomial degree. By incorporating AT fluxes that utilize data from multiple previous time levels, we successfully recover the formal high-order accuracy of the DG discretization. The accuracy of the proposed method is rigorously verified using benchmark problems for the compressible Euler equations. Furthermore, we evaluate the performance of the method through extensive strong-scaling studies for both two- and three-dimensional test cases. Our results indicate that CAA substantially suppresses synchronization overheads, yielding speedups of up to 1.9x in two dimensions and 1.6x in three dimensions compared to a baseline synchronous DG solver.
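The two accuracy claims in the abstract (first-order degradation with delayed data, recovery when several previous time levels are combined) can be illustrated with a scalar toy problem. The sketch below is not the paper's AT flux and does not use deal.II. It assumes a smooth function of time, a fixed delay of `k` time steps, and a simple two-level linear extrapolation chosen purely for illustration. The function names `delayed_value`, `extrapolated_value`, and `observed_order` are hypothetical.

```python
import math


def delayed_value(f, t, dt, k):
    """Use the value from k steps ago in place of f(t).

    Taylor expansion gives an error of about k*dt*f'(t), i.e. first order in dt.
    """
    return f(t - k * dt)


def extrapolated_value(f, t, dt, k):
    """Linearly extrapolate to time t from the two delayed levels k and k+1.

    The first-order error terms cancel, leaving an error of about
    k*(k+1)/2 * dt**2 * f''(t), i.e. second order in dt.
    """
    return (k + 1) * f(t - k * dt) - k * f(t - (k + 1) * dt)


def observed_order(approx, f, t, k, dt=1e-2):
    """Estimate the convergence order by halving dt and comparing errors."""
    err_coarse = abs(approx(f, t, dt, k) - f(t))
    err_fine = abs(approx(f, t, dt / 2, k) - f(t))
    return math.log2(err_coarse / err_fine)


# Assumed test setup: f = sin, evaluated at t = 1 with a delay of k = 3 steps.
order_delayed = observed_order(delayed_value, math.sin, 1.0, 3)
order_extrapolated = observed_order(extrapolated_value, math.sin, 1.0, 3)

print(f"delayed value only:  observed order ~ {order_delayed:.2f}")
print(f"two-level extrapolation: observed order ~ {order_extrapolated:.2f}")
```

In this toy setting the delayed value converges at roughly first order and the two-level extrapolation at roughly second order. Adding further previous time levels would cancel further terms of the Taylor expansion, which is the general idea behind using data from multiple time levels; the actual AT flux construction is given in the paper itself.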

Paper Structure

This paper contains 22 sections, 20 equations, 15 figures, 4 tables, 3 algorithms.

Figures (15)

  • Figure 1: Illustration of numerical flux computation across a common edge of two quadrilateral elements in a two-dimensional domain. Circles denote the local nodal degrees of freedom of the elements.
  • Figure 2: (a) Domain decomposition of $144$ elements among four PEs: PE-0 (green), PE-1 (orange), PE-2 (blue), and PE-3 (white); (b) classification of interior and PE-boundary elements in the subdomain of PE-0.
  • Figure 3: Illustration of a discretized one-dimensional domain decomposed into two processing elements (PEs): (a) synchronous execution and (b) asynchronous execution based on the communication-avoiding algorithm with maximum allowable delay $L = 4$.
  • Figure 4: Initial density profile for the two-dimensional domain for the isentropic vortex test case.
  • Figure 5: Domain decompositions for (a) the two-dimensional isentropic vortex test case with $4096$ elements distributed across $256$ PEs and (b) the three-dimensional flow around a cylinder test case with $25{,}600$ elements distributed across $320$ PEs.
  • ...and 10 more figures