Towards Exascale Computation for Turbomachinery Flows

Yuhang Fu, Weiqi Shen, Jiahuan Cui, Yao Zheng, Guangwen Yang, Zhao Liu, Jifa Zhang, Tingwei Ji, Fangfang Xie, Xiaojing Lv, Hanyue Liu, Xu Liu, Xiyang Liu, Xiaoyu Song, Guocheng Tao, Yan Yan, Paul Tucker, Steven A. E. Miller, Shirui Luo, Seid Koric, Weimin Zheng

TL;DR

This work advances exascale-ready large eddy simulation (LES) of turbomachinery by developing ZJU-FR (ZFR), a portable, high-order, unstructured CFD framework based on the Flux Reconstruction method. It demonstrates a three-level MPI–Athread–SIMD parallelization strategy and a point-wise kernel fusion approach on Sunway's SW26010Pro, achieving a sustained performance of 115.8 DP-PFLOPs on a high-pressure turbine case with 1.69 billion elements and 865 billion DOFs. The study validates high-fidelity heat-transfer predictions against experiments and shows scalable performance up to hundreds of thousands of MPI ranks, underscoring the exascale potential for complete-engine turbomachinery simulations. The results point to practical gains in engine efficiency and emission reduction, and provide a generalizable blueprint for deploying high-order LES on heterogeneous exascale architectures.

Abstract

A state-of-the-art large eddy simulation code has been developed to solve compressible flows in turbomachinery. The code has been engineered for a high degree of scalability, enabling it to make effective use of the many-core architecture of the new Sunway system. A consistent performance of 115.8 DP-PFLOPs has been achieved on a high-pressure turbine cascade consisting of over 1.69 billion mesh elements and 865 billion degrees of freedom (DOFs). By leveraging a high-order unstructured solver and its portability to large heterogeneous parallel systems, we have progressed towards solving the grand challenge problem outlined by NASA, which involves a time-dependent simulation of a complete engine, incorporating all the aerodynamic and heat transfer components.

Paper Structure (34 sections, 3 equations, 15 figures, 4 tables, 2 algorithms)

Figures (15)

  • Figure 1: Main components of a gas turbine and turbomachinery flow features: (a) schematic of the core components of a gas turbine; (b) illustration of a turbofan engine; (c) enthalpy field of hot streaks entering the high-pressure turbine from the combustor; (d) turbulent flow structures near the turbine endwall; (e) CAD model of a real turbine blade; (f) cleaned-up CAD model of a turbine blade with cooling holes for CFD calculation; (g) temperature field on the high-pressure turbine blade surface with cooling holes.
  • Figure 2: The architecture of SW26010Pro many-core processor on the new Sunway supercomputer.
  • Figure 3: Overview of the code structure.
  • Figure 4: The multi-level parallelization design.
  • Figure 5: DAG for data dependencies.
  • ...and 10 more figures