An octree-based sampling algorithm for analyzing big simulation data

Janis Geise, Sebastian Spinner, Richard Semaan, Andre Weiner

TL;DR

The paper tackles the data bottleneck in post-processing large CFD simulations by introducing an improved Sparse Spatial Sampling ($S^3$) method that builds a time-invariant octree grid via metric-driven, gain-based adaptive refinement. A user-defined metric guides the refinement, and the data are interpolated onto the octree for efficient post-processing and modal analysis (POD via SVD). Across three challenging flows, the method reduces the number of mesh cells by 35% to 98% while preserving the dominant flow structures, as validated by near-identical POD modes and singular values compared to the original data. The approach enables post-processing on a local workstation and offers potential runtime integration, though it currently lacks distributed HPC parallelism and is less suited for surface-only data. Overall, $S^3$ provides a practical, physics-informed downsampling tool to mitigate data-storage and analysis bottlenecks in CFD workflows.
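
The gain-based refinement mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a 2D quadtree instead of an octree, an analytic stand-in `metric` instead of simulation data, and a made-up `gain` definition, since this summary does not state the actual gain formula. Only a leaf-cell budget is used as the stopping criterion. All function and variable names are hypothetical.

```python
import heapq
import itertools
import numpy as np

def metric(x, y):
    """Hypothetical stand-in for a user-defined metric: a sharp ring-shaped feature."""
    return np.exp(-200.0 * (np.hypot(x, y) - 0.5) ** 2)

def gain(cx, cy, h):
    """Hypothetical gain (an assumption, not the paper's definition): cell area times
    the largest metric difference between the cell centre and its child centres."""
    offsets = (-0.5 * h, 0.5 * h)
    children = [metric(cx + dx, cy + dy) for dx in offsets for dy in offsets]
    return (2.0 * h) ** 2 * max(abs(c - metric(cx, cy)) for c in children)

def refine(n_leaves_max):
    """Repeatedly split the leaf with the largest gain until the leaf budget is reached.
    A cell is stored as (centre_x, centre_y, half_width)."""
    counter = itertools.count()  # tie-breaker so equal gains never compare the cells
    root = (0.0, 0.0, 1.0)       # covers the square [-1, 1] x [-1, 1]
    heap = [(-gain(*root), next(counter), root)]
    while len(heap) + 3 <= n_leaves_max:  # each split replaces 1 leaf by 4
        _, _, (cx, cy, h) = heapq.heappop(heap)
        hc = 0.5 * h
        for dx in (-hc, hc):
            for dy in (-hc, hc):
                child = (cx + dx, cy + dy, hc)
                heapq.heappush(heap, (-gain(*child), next(counter), child))
    return [cell for _, _, cell in heap]

leaves = refine(n_leaves_max=100)
print(len(leaves), "leaf cells, smallest half-width:", min(h for _, _, h in leaves))
```

A priority queue keeps the selection of the next cell cheap. A second stopping criterion based on the captured fraction of the metric could be checked in the same loop condition.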

Abstract

As computational resources continue to increase, the storage and analysis of vast amounts of data will inevitably become a bottleneck in computational fluid dynamics (CFD) and related fields. Although compression algorithms and efficient data formats can mitigate this issue, they are often insufficient when post-processing large amounts of volume data. Processing such data may require additional high-performance software and resources, or it may restrict the analysis to shorter time series or smaller regions of interest. The present work proposes an improved version of the existing Sparse Spatial Sampling algorithm ($S^3$) to reduce the data from time-dependent flow simulations. The $S^3$ algorithm iteratively generates a time-invariant octree grid based on a user-defined metric, efficiently down-sampling the data while aiming to preserve as much of the metric as possible. Using the sampled grid allows for more efficient post-processing and enables memory-intensive tasks, such as computing the modal decomposition of flow snapshots. The enhanced version of $S^3$ is tested and evaluated on the scale-resolving simulations of the flow past a tandem configuration of airfoils in the transonic regime, the incompressible turbulent flow past a circular cylinder, and the flow around an aircraft half-model at high Reynolds and Mach numbers. $S^3$ significantly reduces the number of mesh cells by $35\%$ to $98\%$ for all test cases while accurately preserving the dominant flow dynamics, enabling post-processing of CFD data on a local workstation rather than HPC resources for many cases.
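
To make the link between down-sampling and modal decomposition concrete, the following sketch computes a POD via the thin SVD of a snapshot matrix on a fine and on a coarse 1D grid. It rests on several assumptions that are not taken from the paper: the data are synthetic, the coarse grid is a uniform subsample rather than an adaptively refined one, and rows are weighted by the square root of the cell size so that singular values are comparable across grids. The function name `pod` is hypothetical.

```python
import numpy as np

def pod(snapshots, cell_sizes):
    """POD via the thin SVD of the mean-subtracted snapshot matrix.
    snapshots has shape (n_cells, n_snapshots); rows are weighted by
    sqrt(cell size), which is an assumption of this sketch."""
    fluctuations = snapshots - snapshots.mean(axis=1, keepdims=True)
    w = np.sqrt(cell_sizes)[:, None]
    U, s, Vh = np.linalg.svd(w * fluctuations, full_matrices=False)
    return U / w, s, Vh  # unweighted modes, singular values, mode coefficients

# Synthetic snapshots: two spatial structures with different time dependence.
x = np.linspace(0.0, 1.0, 2000, endpoint=False)
t = np.linspace(0.0, 1.0, 50, endpoint=False)
data = (np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * t))
        + 0.3 * np.outer(np.sin(6 * np.pi * x), np.sin(4 * np.pi * t)))

dx = np.full(x.size, 1.0 / x.size)
_, s_full, _ = pod(data, dx)

stride = 20  # keep every 20th cell; each kept cell represents 20 original cells
_, s_coarse, _ = pod(data[::stride], dx[::stride] * stride)

rel_diff = np.abs(s_coarse[:2] - s_full[:2]) / s_full[:2]
print("relative difference of the two leading singular values:", rel_diff)
```

The coarse snapshot matrix has 100 rows instead of 2000, so the SVD needs correspondingly less memory. The smooth synthetic signal is fully resolved on the coarse grid, which is why the leading singular values match closely; real flow data would generally show some deviation.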

Paper Structure

This paper contains 12 sections, 7 equations, 16 figures, 1 table, 1 algorithm.

Figures (16)

  • Figure 1: The three main steps of $S^3$. The stopping criterion is either the maximum number of leaf cells $N_{\ell,\max}$ or the minimum percentage of the original metric that must be captured. The depicted test case represents a generic tandem configuration of an ONERA OAT15A airfoil (front) and a NACA64A110 airfoil (rear).
  • Figure 2: The left column shows a comparison of the original grid (\ref{fig:original_grid_oat}) and the grid generated by $S^3$ for a captured metric of $\mathcal{M}_\mathrm{approx} = || \widehat{\boldsymbol{\mathcal{M}}}|| / ||\boldsymbol{\mathcal{M}}|| = 0.75$ (\ref{fig:interpolated_grid_oat}). The right column depicts the metric on the original grid (\ref{fig:original_metric_oat}) and the metric interpolated onto the $S^3$-grid (\ref{fig:interpolated_metric_oat}). Both contour plots are scaled from zero to $||\boldsymbol{\mathcal{M}}||_\infty$.
  • Figure 3: Temporal mean (left) and standard deviation (right) of the absolute spatial error $\Delta \mathbf{Ma}_n = |\mathbf{Ma}^\ast_n - \mathbf{Ma}_n|$ scaled with the free stream Mach number $Ma_{\infty}$ for $\mathcal{M}_{\mathrm{approx}} = 0.75$.
  • Figure 4: Comparison of the leading POD modes and the associated singular values for the tandem configuration. The left column shows the POD modes on the original grid, while the right column depicts the POD modes on the grid generated with $S^3$ and $\mathcal{M}_{\mathrm{approx}} = 0.27$. The colorscale is identical for all contours and bounded by $\pm||\mathbf{U}||_\infty$.
  • Figure 5: Comparison of the leading right-singular vectors for the tandem configuration. As for the modes, only every second mode coefficient is shown.
  • ...and 11 more figures
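
The quantities named in the captions of Figures 2 and 3 can be sketched in a few lines. The captions do not specify the norm in $\mathcal{M}_\mathrm{approx}$ or how $\mathbf{Ma}^\ast_n$ is obtained, so this sketch assumes a cell-size-weighted L2 norm on each grid and treats $\mathbf{Ma}^\ast_n$ as a field already available on the original cells. The function names and the toy data are hypothetical.

```python
import numpy as np

def captured_metric(metric_sampled, sizes_sampled, metric_orig, sizes_orig):
    """Ratio ||M_hat|| / ||M||, assuming a cell-size-weighted L2 norm on each grid."""
    norm_sampled = np.sqrt(np.sum(sizes_sampled * metric_sampled ** 2))
    norm_orig = np.sqrt(np.sum(sizes_orig * metric_orig ** 2))
    return norm_sampled / norm_orig

def error_statistics(ma_reconstructed, ma_original, ma_inf):
    """Temporal mean and standard deviation of |Ma* - Ma| / Ma_inf per cell.
    Both inputs have shape (n_cells, n_snapshots)."""
    delta = np.abs(ma_reconstructed - ma_original) / ma_inf
    return delta.mean(axis=1), delta.std(axis=1)

# Toy 1D metric on 1000 fine cells, averaged onto 10 coarse cells.
x = (np.arange(1000) + 0.5) / 1000
metric_orig = np.exp(-((x - 0.5) / 0.05) ** 2)
sizes_orig = np.full(1000, 1e-3)
metric_sampled = metric_orig.reshape(10, 100).mean(axis=1)
sizes_sampled = np.full(10, 0.1)
m_approx = captured_metric(metric_sampled, sizes_sampled, metric_orig, sizes_orig)

# Toy Mach-number snapshots; a constant offset stands in for the sampling error.
rng = np.random.default_rng(0)
ma_original = rng.uniform(0.2, 0.8, size=(1000, 20))
ma_reconstructed = ma_original + 0.01
err_mean, err_std = error_statistics(ma_reconstructed, ma_original, ma_inf=0.5)
print("captured metric:", m_approx)
```

With cell averaging and this choice of norm, the ratio cannot exceed one, which fits its use as a stopping threshold such as the value of 0.75 quoted in the captions.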