Unlocking massively parallel spectral proper orthogonal decompositions in the PySPOD package

Marcin Rogowski, Brandon C. Y. Yeung, Oliver T. Schmidt, Romit Maulik, Lisandro Dalcin, Matteo Parsani, Gianmarco Mengaldo

TL;DR

The paper introduces a parallel (MPI-distributed) implementation of spectral proper orthogonal decomposition (SPOD) in the PySPOD package. The algorithm distributes the spatial dimension while keeping the time-domain Fourier transform non-distributed, which avoids the bottlenecks a distributed transform would incur. It provides a complete SPOD formulation, practical Welch-block-based data handling, and a two-phase I/O strategy, enabling analysis of datasets up to hundreds of terabytes. Strong and weak scalability analyses on LES jet data and ERA5 reanalysis demonstrate competitive I/O bandwidth, efficient FFT computations, and scalable mode extraction, illustrating the method's capacity to uncover previously inaccessible spatio-temporal patterns. The work delivers an open-source toolchain with applications in fluid dynamics and geophysics, advancing modal analyses of big quasi-stationary data and opening paths to new physical insights.

Abstract

We propose a parallel (distributed) version of the spectral proper orthogonal decomposition (SPOD) technique. The parallel SPOD algorithm distributes the spatial dimension of the dataset while leaving the time dimension intact. This approach is adopted to preserve the non-distributed fast Fourier transform of the data in time, thereby avoiding the associated bottlenecks. The parallel SPOD algorithm is implemented in the PySPOD (https://github.com/MathEXLab/PySPOD) library and makes use of the standard message passing interface (MPI) library, accessed from Python via mpi4py (https://mpi4py.readthedocs.io/en/stable/). An extensive performance evaluation of the parallel package is provided, including strong and weak scalability analyses. The open-source library allows the analysis of large datasets of interest across the scientific community. Here, we present applications in fluid dynamics and geophysics that are extremely difficult (if not impossible) to achieve without a parallel algorithm. This work opens the path toward modal analyses of big quasi-stationary data, helping to uncover unexplored spatio-temporal patterns.
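
The decomposition choice described in the abstract can be illustrated with a small sketch. It is not PySPOD code: the array sizes, variable names, and the use of `np.array_split` to imitate MPI ranks are all assumptions made for illustration. It only shows that when each rank owns a block of spatial points together with their full time history, the DFT in time can be computed locally and matches the serial result without any communication.

```python
import numpy as np

# Hypothetical problem sizes, chosen only for illustration.
n_time, n_space, n_ranks = 64, 30, 4

rng = np.random.default_rng(0)
# Snapshot matrix: rows are time steps, columns are spatial points.
data = rng.standard_normal((n_time, n_space))

# Serial reference: DFT along the time axis for every spatial point.
serial_hat = np.fft.fft(data, axis=0)

# Simulated distributed layout: each "rank" owns a block of columns
# (spatial points) but the complete time history of those points.
chunks = np.array_split(data, n_ranks, axis=1)

# Each rank transforms its own block independently; no data exchange is needed.
local_hats = [np.fft.fft(chunk, axis=0) for chunk in chunks]

# Gathering the local blocks reproduces the serial transform.
distributed_hat = np.concatenate(local_hats, axis=1)
matches_serial = np.allclose(serial_hat, distributed_hat)
```

Splitting along the time axis instead would require a distributed FFT, which is the communication cost the spatial layout is meant to avoid.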

Paper Structure (19 sections, 17 equations, 9 figures)

Figures (9)

  • Figure 1: Schematic of the parallel SPOD algorithm. The key aspect is to obtain an appropriate data decomposition layout that allows preserving all time operations as done in serial (i.e., the DFT), and decompose only the spatial dimensions of the data. Once the data is in the required parallel layout, the parallelization of the SPOD algorithm becomes trivial and consists only of a parallel reduction (step 4) of the inner product (step 3).
  • Figure 2: Instantaneous flow field of the twin-rectangular jet: (a) Q-criterion isosurface, colored by pressure; (b) numerical schlieren on the $y=0$ and $z=-1.8$ planes, with contours of mean streamwise velocity on the $x\in\{ 8,16 \}$ planes.
  • Figure 3: Premultiplied SPOD eigenvalue spectra of the twin-rectangular jet. The spectra fade from black to white with increasing mode number. Modes corresponding to the highlighted ($\bullet$) eigenvalues at $f=0.21$ are reported in figure 4.
  • Figure 4: SPOD modes of the twin-rectangular jet at frequency $f=0.21$: (a,b) mode 1; (c,d) mode 2; (e,f) mode 3; (g,h) mode 4. Isosurfaces of $\mathrm{Re}\{\boldsymbol{\phi}_p\}=\pm 0.0005$ are shown, along with cross-sections at $z=1.3$ (left column) and $y=-0.25$ (right column). The corresponding SPOD eigenvalues are highlighted in figure 3.
  • Figure 5: U component of the wind velocity for pressure (i.e., vertical) levels 1 (top left), 12 (top right), 24 (bottom left), and 37 (bottom right), at midnight (00:00) on 1 January 2010. Level 1 corresponds to a pressure of 1 millibar, level 12 to 125 millibars, level 24 to 600 millibars, and level 37 to 1000 millibars.
  • ...and 4 more figures
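
The inner product and parallel reduction mentioned in the caption of Figure 1 (steps 3 and 4) can be sketched as follows. This is a minimal illustration, not the PySPOD implementation: the sizes, the diagonal weight vector, the snapshot-method normalization, and the simulation of ranks with `np.array_split` are assumptions. In an actual MPI program the summation over partial matrices would be a collective reduction (for example an allreduce) rather than a Python `sum`.

```python
import numpy as np

# Hypothetical sizes: spatial points, Welch blocks, simulated ranks.
n_space, n_blocks, n_ranks = 40, 6, 4

rng = np.random.default_rng(1)
# Fourier coefficients at one frequency: one column per Welch block.
q_hat = (rng.standard_normal((n_space, n_blocks))
         + 1j * rng.standard_normal((n_space, n_blocks)))
# Assumed diagonal quadrature weights defining the spatial inner product.
weights = rng.uniform(0.5, 1.5, n_space)

# Serial reference: weighted inner product between all block pairs.
m_serial = (q_hat.conj().T * weights) @ q_hat / n_blocks

# Distributed layout: every rank holds a slice of the spatial points.
q_parts = np.array_split(q_hat, n_ranks, axis=0)
w_parts = np.array_split(weights, n_ranks)

# Step 3: each rank forms a small (n_blocks x n_blocks) partial product.
partials = [(q.conj().T * w) @ q for q, w in zip(q_parts, w_parts)]

# Step 4: the reduction; with MPI this would be a sum across ranks.
m_reduced = sum(partials) / n_blocks

# The small eigenproblem is identical on every rank after the reduction.
eigvals, eigvecs = np.linalg.eigh(m_reduced)
order = np.argsort(eigvals)[::-1]          # sort by decreasing energy
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each rank recovers only its own spatial slice of the modes.
phi_parts = [q @ eigvecs / np.sqrt(n_blocks * eigvals) for q in q_parts]

# Check: the assembled modes are orthonormal in the weighted inner product.
phi = np.concatenate(phi_parts, axis=0)
gram = (phi.conj().T * weights) @ phi
```

Only the small `n_blocks` by `n_blocks` matrix crosses rank boundaries in this sketch, which is why the reduction stays cheap even when the spatial dimension is very large.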