GPU-accelerated Linear Algebra for Coupled Solvers in Industrial CFD Applications with OpenFOAM

Stefano Oliani, Ettore Fadiga, Ivan Spisso, Luigi Capone, Federico Piscaglia

TL;DR

This work tackles accelerating industrial CFD with OpenFOAM by offloading the costly linear algebra of implicit coupled solvers to GPUs using AmgX, enabling both density-based and pressure-based coupled formulations. It develops AmgxCoupled4Foam to interface ICSFoam with GPU solvers, preserving a heterogeneous CPU/GPU workflow and exploiting block-structured matrices for efficient CSR representations. Two industrial test cases, NASA CRM in transonic flow and DrivAer external aerodynamics, demonstrate substantial gains: NASA CRM achieves over 4x overall speedup, while DrivAer shows improved stability and reduced compute time compared to segregated solvers, with robust convergence in the coupled framework. The results validate the approach as a practical path toward industrial-scale GPU-accelerated CFD, revealing that linear algebra offloading is especially advantageous for coupled systems and highlighting memory and kernel optimization as avenues for further gains. Overall, this work marks the first implementation of GPU-accelerated coupled implicit simulations in OpenFOAM and points to significant performance and robustness benefits for complex geometries on modern GPU-enabled HPC architectures.

Abstract

The present work describes the development of heterogeneous GPGPU implicit coupled CFD solvers, encompassing both density- and pressure-based approaches. In this setup, the assembled linear system is offloaded onto multiple GPUs, where specialized external libraries solve it efficiently. The coupled solvers are applied to two industrial test cases representing common scenarios: the NASA CRM in a transonic regime and the external aerodynamics of the DrivAer car. Both show significant performance gains over their CPU counterparts. Specifically, the NASA CRM case achieves an overall speedup of more than 4x, while the DrivAer test case demonstrates improved stability and reduced computational time compared to segregated solvers. All calculations were carried out on the GPU-based partition of the davinci-1 supercomputer at the Leonardo Labs, featuring 82 GPU-accelerated nodes.

Paper Structure

This paper contains 12 sections, 11 equations, 16 figures, 3 tables, and 1 algorithm.

Figures (16)

  • Figure 1: Sub-blocks and source term structure for a system of coupled equations with one scalar and one vector variable.
  • Figure 2: Schematic of the conversion of coupled matrix values into an array in block array-of-structures (AoS) format. The blocks must be permuted to switch from OpenFOAM's native LDU storage to the block CSR format required by AmgX.
  • Figure 3: Procedure for matrix partitioning and insertion of coupled processor patch coefficients into the distributed parallel matrix.
  • Figure 4: Convergence history of the aerodynamic coefficients for the first 2500 iterations, with CPU and GPU-offloaded linear algebra. Close-ups of the lift and drag coefficients are shown at the top right and bottom right, respectively.
  • Figure 5: Predicted total, pressure and skin friction drag coefficients. Results of other solvers from DPW6 are denoted by the smaller dots, while OpenFOAM results are represented by the larger squares on the right of the figure.
  • ...and 11 more figures
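
The LDU-to-block-CSR permutation described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the AmgxCoupled4Foam implementation: the function name `ldu_to_block_csr`, the plain-list data layout, and the three-cell example are hypothetical. The sketch assumes the usual LDU convention in which the upper coefficient of a face sits in the row of the lower-addressed cell and the lower coefficient in the row of the upper-addressed cell.

```python
def ldu_to_block_csr(diag, lower, upper, lower_addr, upper_addr):
    """Convert a block LDU matrix to block CSR (row_ptr, col_idx, values).

    diag: one b x b block per cell; lower/upper: one b x b block per face.
    lower_addr[f] < upper_addr[f] are the two cells sharing face f.
    Assumed convention: upper[f] is entry (lower_addr[f], upper_addr[f]),
    lower[f] is entry (upper_addr[f], lower_addr[f]).
    """
    n_cells = len(diag)
    # Gather every block as (row, column, block).
    entries = [(i, i, diag[i]) for i in range(n_cells)]
    for f, (l, u) in enumerate(zip(lower_addr, upper_addr)):
        entries.append((l, u, upper[f]))
        entries.append((u, l, lower[f]))
    # The permutation: order blocks by row, then by column within each row.
    entries.sort(key=lambda e: (e[0], e[1]))
    # Count blocks per row, then prefix-sum to get the row pointers.
    row_ptr = [0] * (n_cells + 1)
    for r, _, _ in entries:
        row_ptr[r + 1] += 1
    for i in range(n_cells):
        row_ptr[i + 1] += row_ptr[i]
    col_idx = [c for _, c, _ in entries]
    # Flatten each block row-major so its b*b values are contiguous (AoS).
    values = [x for _, _, blk in entries for row in blk for x in row]
    return row_ptr, col_idx, values


# Hypothetical 1D mesh: three cells in a chain, two faces, 2 x 2 blocks.
diag = [[[4, 1], [1, 4]], [[5, 0], [0, 5]], [[6, 2], [2, 6]]]
upper = [[[-1, 0], [0, -1]], [[-2, 0], [0, -2]]]
lower = [[[-3, 0], [0, -3]], [[-4, 0], [0, -4]]]
row_ptr, col_idx, values = ldu_to_block_csr(diag, lower, upper, [0, 1], [1, 2])
print(row_ptr)  # [0, 2, 5, 7]
print(col_idx)  # [0, 1, 0, 1, 2, 1, 2]
```

The sort is where the permutation happens: LDU groups coefficients by kind (diagonal, upper, lower), whereas block CSR needs them grouped by row. A production converter would compute this permutation once per mesh and reuse it at every iteration, since only the values change while the sparsity pattern stays fixed.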