aPriori: a Python package to process direct numerical simulations

Lorenzo Piu, Heinz Pitsch, Alessandro Parente

Abstract

In the field of computational fluid dynamics, direct numerical simulations generate highly detailed data for the analysis of turbulent flows by resolving all relevant physical scales. Yet their large size, complexity, and heterogeneity make systematic post-processing and data reuse increasingly challenging. Despite the growing availability of high-fidelity simulations through public repositories, extracting meaningful physical insight often requires substantial technical effort, specialized workflows, and access to high-performance computing resources. In this article we introduce \texttt{aPriori}, an open-source Python package developed to address these limitations by providing a dedicated, memory-efficient, and user-oriented framework for the analysis of direct numerical simulation data. The software enables streamlined handling of three-dimensional fields, including filtering, scale separation, gradient evaluation, thermochemical analysis, and visualization, using concise and reproducible scripts. Its pointer-based data management strategy allows very large datasets to be processed on standard workstations without excessive memory usage, significantly lowering the barrier to advanced analysis. Beyond basic post-processing, \texttt{aPriori} supports workflows central to modern turbulence and combustion research, such as a priori model assessment, data-driven closure development, and detailed chemical analyses that include computational singular perturbation. By unifying these capabilities within a coherent and extensible software architecture, \texttt{aPriori} enhances productivity, promotes reproducibility, and facilitates broader and more effective use of high-fidelity simulation data within the computational fluid dynamics community.


Figures (8)

  • Figure 1: Estimated size (a) and CO$_2$ emissions (b) associated with Direct Numerical Simulation (DNS) studies published in the Journal of Fluid Mechanics between 2004 and 2024. Each point represents a simulation reported in the literature, highlighting the rapid growth in computational cost---and therefore carbon footprint---of state-of-the-art DNS. The largest simulations exceed $10^3$ metric tons of CO$_2$, comparable to the emissions of a transcontinental commercial flight. Coloured lines show model predictions for channel-flow DNS at fixed friction Reynolds numbers $Re_\tau$, based on a $Re_\tau^4$ scaling of computational cost and a Moore's-law-like improvement in hardware efficiency over time. Source: adapted from Yang2024.
  • Figure 2: Graphical overview of the software architecture and the relationships between its three core classes. The Mesh class stores the spatial coordinates and grid topology of the DNS domain, providing geometric information required for visualization and differential operations. Individual physical variables (e.g., velocity components, species mass fractions, pressure, temperature) are represented as Scalar objects, each of which handles memory-efficient access to the corresponding binary data files through a pointer-based mechanism. The Field class acts as a high-level container that aggregates all Scalar objects associated with the same mesh, dynamically loading the variables present in the dataset, which are added as attributes of the class. It provides unified methods for filtering, downsampling, gradient computation, thermochemical evaluations, and visualization.
  • Figure 3: Load time of the Scalar class operating in light mode, measured by reading binary DNS-like scalar fields of increasing size from disk. Each point corresponds to a single array whose dimensions were progressively scaled, resulting in memory footprints ranging from a few hundred kilobytes to several gigabytes. The logarithmic axes reveal a near-linear growth of read time with respect to array size, with the largest field tested (approximately $6.46\times10^8$ elements, 2.41 GB) requiring only 1.53 s to load. This benchmark illustrates the efficiency of the pointer-based data access strategy used in the software, enabling manipulation of very large DNS fields without exhausting the system's memory.
  • Figure 4: DNS mid-plane visualisations produced with the aPriori plotting utilities for two turbulent flame configurations. (a)--(d) Lifted hydrogen flame in a heated coflow: temperature, streamwise velocity $U_x$, strain-rate magnitude $\mathcal{S} = (2 S_{ij} S_{ij})^{1/2}$ on a logarithmic scale, and water vapour mass fraction $Y_{\mathrm{H_2O}}$. (e)--(h) Premixed methane/air flame: progress variable based on $\mathrm{CO_2}$, velocity magnitude $U$, progress-variable gradient $\lvert \nabla C \rvert$ (logarithmic scale), and formaldehyde mass fraction $Y_{\mathrm{CH_2O}}$.
  • Figure 5: Mid-plane visualization of the velocity field from the homogeneous isotropic turbulence DNS of Gauding et al. Gauding_2022. Each column corresponds to a different filter size. The top row shows the filtered velocity magnitude, while the bottom row presents the same filtered fields after downsampling.
  • ...and 3 more figures
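
The pointer-based access strategy described in the abstract and in the Figure 3 caption---reading large binary scalar fields without loading them wholesale into RAM---can be illustrated with NumPy's memory mapping. The snippet below is a minimal sketch of the general technique, not the actual `aPriori` API; the file name, shape, and "light mode" phrasing are assumptions for illustration.

```python
import os
import tempfile

import numpy as np

# Illustrative sketch (hypothetical, not the aPriori API): a DNS scalar
# field stored as a flat binary file is accessed through a memory map,
# so only the slices actually read are pulled into physical memory.

# Write a synthetic 3-D field to disk as float64 binary data.
shape = (64, 64, 64)
path = os.path.join(tempfile.mkdtemp(), "temperature.bin")
np.arange(np.prod(shape), dtype=np.float64).reshape(shape).tofile(path)

# "Light-mode" style access: map the file instead of loading it fully.
mapped = np.memmap(path, dtype=np.float64, mode="r", shape=shape)

# Extract a mid-plane slice; only the touched pages are read from disk,
# which is what keeps multi-gigabyte fields tractable on a workstation.
mid_plane = np.array(mapped[:, :, shape[2] // 2])
print(mid_plane.shape)
```

For fields of several gigabytes, this pattern keeps the resident memory footprint proportional to the slices actually accessed rather than to the full dataset, consistent with the near-linear load times reported in Figure 3.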