Distributed physics-informed neural networks via domain decomposition for fast flow reconstruction

Yixiao Qian, Jiaxu Liu, Zewei Xia, Song Chen, Chao Xu, Shengze Cai

TL;DR

This work tackles the challenge of reconstructing high-resolution flow fields from sparse velocity measurements by introducing a scalable distributed PINN framework that partitions the spatiotemporal domain into subdomains, each handled by a local expert network. A key innovation is Reference Anchor Normalization with Decoupled Asymmetric Weighting, which resolves pressure gauge indeterminacy across interfaces while maintaining temporal continuity. The authors implement a hardware-efficient training pipeline using CUDA Graphs and JIT compilation to mitigate Python overhead in high-order derivative evaluations. Numerical experiments on a 2D steady lid-driven cavity, a 2D unsteady cylinder wake, and a 3D unsteady cylinder wake demonstrate near-linear strong scaling and improved reconstruction fidelity as the number of subdomains increases, supporting both the method and its potential for large-scale flow reconstruction.

Abstract

Physics-Informed Neural Networks (PINNs) offer a powerful paradigm for flow reconstruction, seamlessly integrating sparse velocity measurements with the governing Navier-Stokes equations to recover complete velocity and latent pressure fields. However, scaling such models to large spatiotemporal domains is hindered by computational bottlenecks and optimization instabilities. In this work, we propose a robust distributed PINNs framework designed for efficient flow reconstruction via spatiotemporal domain decomposition. A critical challenge in such distributed solvers is pressure indeterminacy, where independent sub-networks drift into inconsistent local pressure baselines. We address this issue through a reference anchor normalization strategy coupled with decoupled asymmetric weighting. By enforcing a unidirectional information flow from designated master ranks where the anchor point lies to neighboring ranks, our approach eliminates gauge freedom and guarantees global pressure uniqueness while preserving temporal continuity. Furthermore, to mitigate the Python interpreter overhead associated with computing high-order physics residuals, we implement a high-performance training pipeline accelerated by CUDA graphs and JIT compilation. Extensive validation on complex flow benchmarks demonstrates that our method achieves near-linear strong scaling and high-fidelity reconstruction, establishing a scalable and physically rigorous pathway for flow reconstruction and understanding of complex hydrodynamics.
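
The pressure indeterminacy mentioned above can be made explicit with a short derivation. As a sketch, assume the non-dimensional incompressible Navier-Stokes form below; the Reynolds number $\mathrm{Re}$ and this particular scaling are assumptions for illustration, and the paper's own equations may be written differently:

$$
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u} = -\nabla p + \frac{1}{\mathrm{Re}}\nabla^2\mathbf{u}, \qquad \nabla\cdot\mathbf{u} = 0.
$$

Pressure enters only through $\nabla p$, so for any spatially constant shift $c(t)$,

$$
\nabla\big(p(\mathbf{x},t) + c(t)\big) = \nabla p(\mathbf{x},t),
$$

and both $p$ and $p + c(t)$ give the same PDE residual. With velocity-only observations, nothing in the interior losses pins down $c(t)$, so each sub-network $i$ may settle on its own offset $c_i(t)$. The anchor-normalized pressure $\tilde{p}(\mathbf{x},t) = p(\mathbf{x},t) - p(\mathbf{x}_{\text{anc}},t)$ removes this freedom, because the offset cancels:

$$
\big(p(\mathbf{x},t) + c(t)\big) - \big(p(\mathbf{x}_{\text{anc}},t) + c(t)\big) = p(\mathbf{x},t) - p(\mathbf{x}_{\text{anc}},t) = \tilde{p}(\mathbf{x},t).
$$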

Paper Structure (10 sections, 15 equations, 9 figures, 4 tables, 1 algorithm)

Figures (9)

  • Figure 1: Illustration of the domain decomposition and distributed training architecture. (a) Spatial domain decomposition in a fixed time interval: the global domain $\Omega$ is partitioned into four sub-domains, $\Omega_1^{\text{int}}$ through $\Omega_4^{\text{int}}$. The ghost layer $\Omega_1^{\text{gh}} = \Omega_{1-2}^{\text{gh}} \cup \Omega_{1-3}^{\text{gh}}$ is introduced by extending $\Omega_1^{\text{int}}$ to enforce continuity across the interfaces. The anchor point $\mathbf{x}_{\text{anc}}$ identifies the master rank (here $\Omega_1$), and other sub-domains align their pressure gauge to this reference. (b) Distributed PINN training: both master and slave ranks learn $\mathcal{NN}:(\mathbf{x},t)\mapsto(\mathbf{u},p)$, and the interior losses $\mathcal{L}_{\text{obs}}$ and $\mathcal{L}_{\text{PDE}}$ are computed using the raw network outputs (thus independent of the pressure gauge). Anchor normalization is applied only when the master transmits ghost-layer pressure to its neighbors, using $\tilde{p}=p(\mathbf{x},t)-p(\mathbf{x}_{\text{anc}},t)$. For stability, the master ignores the spatial ghost pressure loss within the same time interval (i.e., $\lambda_{\text{gh\_p}}^{\text{space}}=0$ for master ranks), while retaining the temporal ghost pressure loss to preserve continuity across consecutive time intervals.
  • Figure 2: Reconstruction results for the 2D steady lid-driven cavity flow. The panels from top to bottom report the velocity components $(u,v)$ and pressure $p$: reference fields, the $P=1$ (single-domain) PINNs predictions, the corresponding $P=1$ errors, the $P=4$ distributed PINNs predictions with a $2\times2$ spatial decomposition, and the corresponding $P=4$ errors. The "$\times$" markers overlaid in the reference $u$ and $v$ panels indicate the locations of the sparse observation points used for training. These results illustrate that both $P=1$ and $P=4$ accurately reconstruct the flow fields.
  • Figure 3: Loss trajectories for the 2D steady cavity flow comparing $P=1$ (single-domain PINN) and $P=4$ ($2\times2$ decomposition). In the $P=1$ case, the two reported curves correspond to the observation loss and the PDE residual loss. In the $P=4$ case, the plotted losses are averaged over the four processes and additionally include the ghost/interface losses for velocity (ghost $u$ loss) and pressure (ghost $p$ loss). The final observation and PDE losses of $P=4$ are lower than those of the $P=1$ baseline, indicating that domain decomposition can effectively improve the overall accuracy.
  • Figure 4: Reconstruction results for the 2D unsteady cylinder wake at $t = 3.75$. The panels compare the reference fields with the $P=1$ (single-domain) PINNs predictions and errors, as well as the $P=4$ distributed PINNs predictions and errors. Both $P=1$ and $P=4$ achieve small errors, and the reconstructed $(u,v,p)$ fields remain continuous across the sub-domain interfaces. Domain decomposition nonetheless reduces the reconstruction error markedly in the downstream region (the right half): the $P=1$ model struggles to capture the complex wake dynamics uniformly, whereas the distributed approach ($P=4$) achieves a much lower error magnitude in the rear sub-domain.
  • Figure 5: Loss trajectories for the 2D unsteady cylinder flow under strong scaling configurations ($P=1,2,4,8$). Across all decompositions, the loss terms decrease steadily, with only minor oscillations during the initial training phase. Notably, finer domain decompositions (larger $P$) converge to lower terminal loss values.
  • ...and 4 more figures
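
The anchor normalization and asymmetric weighting described in the Figure 1 caption can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`anchor_normalize`, `ghost_pressure_weights`, `ghost_pressure_loss`), the toy pressure field `true_p`, the mean-squared ghost loss, and the default weight values are all hypothetical choices made for illustration. The sketch also assumes that a neighboring rank compares its raw pressure against the normalized pressure received from the master. Only the formula $\tilde{p}=p(\mathbf{x},t)-p(\mathbf{x}_{\text{anc}},t)$ and the rule $\lambda_{\text{gh\_p}}^{\text{space}}=0$ for master ranks come from the caption.

```python
import numpy as np

def anchor_normalize(p_fn, x, t, x_anc):
    """Return p(x, t) - p(x_anc, t), with the anchor evaluated at the same times t."""
    x_anc_rep = np.broadcast_to(x_anc, x.shape)  # one anchor copy per ghost point
    return p_fn(x, t) - p_fn(x_anc_rep, t)

def ghost_pressure_weights(is_master, lam_space=1.0, lam_time=1.0):
    """Return (spatial, temporal) ghost pressure weights.

    Master ranks ignore the spatial ghost pressure loss (weight 0) but keep
    the temporal one; the default magnitudes here are placeholders.
    """
    return (0.0 if is_master else lam_space), lam_time

def ghost_pressure_loss(p_local, p_received, weight):
    """Weighted mean-squared mismatch on ghost-layer points (assumed loss form)."""
    return weight * np.mean((p_local - p_received) ** 2)

# Toy pressure field standing in for a trained network (hypothetical).
def true_p(x, t):
    return np.sin(x[:, 0]) * np.cos(x[:, 1]) + 0.1 * t

def shifted_p(offset):
    """Same field with an arbitrary gauge offset, mimicking a drifting sub-network."""
    return lambda x, t: true_p(x, t) + offset

rng = np.random.default_rng(0)
x_gh = rng.uniform(0.0, 1.0, size=(32, 2))  # ghost-layer sample points
t_gh = rng.uniform(0.0, 1.0, size=32)       # matching sample times
x_anc = np.array([0.25, 0.25])              # anchor point inside the master rank

# The transmitted pressure does not depend on the master's gauge offset.
sent_a = anchor_normalize(shifted_p(3.0), x_gh, t_gh, x_anc)
sent_b = anchor_normalize(shifted_p(-7.5), x_gh, t_gh, x_anc)
gauge_invariant = np.allclose(sent_a, sent_b)

master_weights = ghost_pressure_weights(is_master=True)
slave_weights = ghost_pressure_weights(is_master=False)

# A neighbor already in the anchor gauge incurs no loss; one that is off by a
# constant 2.0 is penalized, which pulls it toward the master's reference.
anchor_p = true_p(np.broadcast_to(x_anc, x_gh.shape), t_gh)
p_aligned = true_p(x_gh, t_gh) - anchor_p
p_misaligned = p_aligned + 2.0
loss_aligned = ghost_pressure_loss(p_aligned, sent_a, slave_weights[0])
loss_misaligned = ghost_pressure_loss(p_misaligned, sent_a, slave_weights[0])

# The master's spatial ghost pressure term vanishes regardless of mismatch.
loss_master = ghost_pressure_loss(p_misaligned, sent_a, master_weights[0])
```

Because the master's spatial ghost pressure weight is zero, pressure information flows in one direction only: neighbors are pulled toward the anchor gauge, while the master's own pressure is shaped solely by its interior losses and the temporal ghost term.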