Distributed computing for physics-based data-driven reduced modeling at scale: Application to a rotating detonation rocket engine

Ionut-Gabriel Farcas; Rayomand P. Gundevia; Ramakanth Munipalli; Karen E. Willcox

TL;DR

The paper tackles the computational bottleneck of high-fidelity RDRE simulations by introducing a distributed-memory, data-driven ROM framework. It presents dOpInf, a distributed Operator Inference workflow that combines distributed data transformations, a POD-free dimensionality reduction, and parallel learning of reduced operators to produce physics-informed ROMs at scales unreachable by serial methods. For a real-world, large-scale RDRE scenario, the approach demonstrates strong and weak scaling up to $2{,}048$ cores on Frontera, constructing a predictive ROM from $n_t=2{,}536$ snapshots of dimension $m=75{,}675{,}600$ in $13$ seconds and achieving a $9.0\times 10^4$ speedup over the full model, with ROM evaluations around $1.09$ seconds per core. The results show that the ROM captures large-scale RDRE features (e.g., three co-rotating waves) and enables efficient design exploration and uncertainty quantification, marking a practical step toward HPC-enabled data-driven propulsion design.

Abstract

High-performance computing (HPC) has revolutionized our ability to perform detailed simulations of complex real-world processes. A prominent contemporary example is from aerospace propulsion, where HPC is used for rotating detonation rocket engine (RDRE) simulations in support of the design of next-generation rocket engines; however, these simulations take millions of core hours even on powerful supercomputers, which makes them impractical for engineering tasks like design exploration and risk assessment. Data-driven reduced-order models (ROMs) aim to address this limitation by constructing computationally cheap yet sufficiently accurate approximations that serve as surrogates for the high-fidelity model. This paper contributes a distributed-memory algorithm that achieves fast and scalable construction of predictive physics-based ROMs trained from sparse datasets of extremely large state dimension. The algorithm learns structured physics-based ROMs that approximate the dynamical systems underlying those datasets. This enables model reduction for problems at a scale and complexity that exceed the capabilities of standard, serial approaches. We demonstrate our algorithm's scalability using up to $2{,}048$ cores on the Frontera supercomputer at the Texas Advanced Computing Center. We focus on a real-world three-dimensional RDRE for which one millisecond of simulated physical time requires one million core hours on a supercomputer. Using a training dataset of $2{,}536$ snapshots, each of state dimension $76$ million, our distributed algorithm enables the construction of a predictive data-driven reduced model in just $13$ seconds on $2{,}048$ cores on Frontera.

Paper Structure (13 sections, 1 theorem, 12 equations, 8 figures, 2 algorithms)

Introduction
Setup for high-fidelity nonlinear physics-based simulations
dOpInf: A new distributed computing algorithm for fast and scalable learning of nonlinear physics-based reduced models
Distributed computation of data transformations and dimensionality reduction
Distributed learning of the reduced model operators
Distributed postprocessing of the reduced solution
Algorithm summary
Application to a large-scale real-world combustion scenario
Overview of the considered rotating detonation rocket engine scenario
Acquiring the large-scale training dataset
Scalability results on the Frontera supercomputer
Predictions beyond the training time horizon
Conclusion

Key Result

Lemma 1

Let $\mathbf{A} \in \mathbb{R}^{q \times k}$ be a matrix and let $\mathbf{A}_i \in \mathbb{R}^{q_i \times k}$ with $i = 1, 2, \ldots, p$ be $p$ non-overlapping row blocks such that $\sum_{i=1}^p q_i = q$. Then, $\mathbf{A}^\top \mathbf{A} = \sum_{i=1}^p \mathbf{A}_i^\top \mathbf{A}_i$.
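
As a minimal numerical sketch of Lemma 1 (an illustration, not the paper's implementation), the snippet below splits a random matrix into non-overlapping row blocks and checks that the sum of the local products $\mathbf{A}_i^\top \mathbf{A}_i$ matches $\mathbf{A}^\top \mathbf{A}$. The function name `gram_from_row_blocks`, the matrix sizes, and the choice of $p = 4$ blocks are assumptions made for this example. In a distributed-memory setting, each block would presumably live on its own process and the sum would be a collective reduction; the serial loop here only stands in for that step.

```python
import numpy as np


def gram_from_row_blocks(blocks):
    """Sum the local Gram matrices A_i^T A_i over all row blocks."""
    k = blocks[0].shape[1]
    gram = np.zeros((k, k))
    for block in blocks:
        # Each term is a small k x k matrix, independent of the block's row count.
        gram += block.T @ block
    return gram


# Illustrative sizes only: q = 1000 rows, k = 8 columns, p = 4 row blocks.
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 8))
blocks = np.array_split(A, 4, axis=0)  # non-overlapping row blocks

gram_blocked = gram_from_row_blocks(blocks)
gram_full = A.T @ A

# The two results should agree up to floating-point round-off.
print(np.allclose(gram_blocked, gram_full))
```

Each process only ever forms a $k \times k$ matrix, so the quantity that has to be combined across processes does not grow with the number of rows $q$.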

Figures (8)

Figure 1: Combustion chamber domain for the considered RDRE scenario. The left figure plots the structured mesh. The right figure plots an example pressure field showing the three dominant co-rotating waves.
Figure 2: Strong scaling results using between $p=32$ and $p=2{,}048$ cores on Frontera. The left plot shows the speed-ups and the right plot shows the percentages of the total CPU time corresponding to data loading, all data processing computations, learning the reduced operators with dOpInf, and communication overhead.
Figure 3: Weak scaling results using between $p=1$ and $p=2{,}048$ cores on Frontera. The left plot shows the obtained efficiency and the right plot shows the corresponding CPU times for data loading, all data processing computations, learning the reduced operators with dOpInf, and communication overhead.
Figure 4: The left figure plots the POD singular values of the transformed snapshots. The right figure plots the corresponding retained energy.
Figure 5: One-dimensional circumferential profiles for pressure. The columns plot the results at three representative locations close to the mid-channel. The rows plot the profiles at four representative time instants.
...and 3 more figures

Theorems & Definitions (6)

Remark 1
Lemma 1
proof
Remark 2
Remark 3
Remark 4
