Physics-Aware Compression of Plasma Distribution Functions with GPU-Accelerated Gaussian Mixture Models

Andong Hu, Luca Pennati, Ivy Peng, Stefano Markidis

TL;DR

The paper tackles the overwhelming data footprint of large-scale plasma PIC simulations by introducing a physics-aware in-situ compression method based on Gaussian Mixture Models. It first reduces the velocity distributions to histograms and then fits a weighted GMM to the reduced data, performing the computations on GPUs within the iPIC3D framework and using ADIOS 2 for flexible I/O. The approach preserves physically meaningful quantities (e.g., bulk velocity, temperature, beams) and enables real-time analysis, achieving compression ratios up to $10^4$ relative to raw data while maintaining low information loss as measured by the Jensen-Shannon divergence (JSD). Compared with standard compressors like SZ, ZFP, MGARD, and BLOSC2, the GMM-based method provides a more interpretable, physics-consistent representation and shows competitive performance with manageable processing overhead. The work demonstrates a practical path to substantial in-situ data reduction for large plasma simulations, with adaptive pruning and GPU acceleration as key enablers and potential for real-time deviation detection in evolving systems.
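To make the claim about preserved physical quantities concrete, here is a minimal sketch of how a bulk velocity and a velocity covariance (the thermal spread) follow from stored GMM parameters through the standard mixture-moment identities. The function name `mixture_moments` and the core-plus-beam numbers are hypothetical and not taken from the paper; converting the covariance to a temperature additionally needs the species mass and the simulation's unit conventions, which this page does not state.

```python
import numpy as np

def mixture_moments(weights, means, covs):
    """Return the overall mean and covariance of a Gaussian mixture.

    weights: (K,) component weights summing to 1
    means:   (K, d) component centers
    covs:    (K, d, d) component covariance matrices
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    # Bulk velocity: weighted average of the component centers.
    bulk = weights @ means
    # Second moment: sum_k w_k (Sigma_k + mu_k mu_k^T).
    outer = np.einsum("ki,kj->kij", means, means)
    second = np.einsum("k,kij->ij", weights, covs + outer)
    # Total covariance: second moment minus the outer product of the mean.
    total_cov = second - np.outer(bulk, bulk)
    return bulk, total_cov

# Hypothetical example: a broad core population plus a narrow, drifting beam.
weights = np.array([0.8, 0.2])
means = np.array([[0.0, 0.0], [3.0, 0.0]])
covs = np.array([np.eye(2) * 1.0, np.eye(2) * 0.25])

bulk, total_cov = mixture_moments(weights, means, covs)
print("bulk velocity:", bulk)
print("velocity covariance:\n", total_cov)
```

Because each component keeps its own weight, center, and covariance, a beam stays visible as a separate component instead of being averaged into the bulk moments.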

Abstract

Data compression is a critical technology for large-scale plasma simulations. Storing complete particle information requires terabyte-scale storage, and analysis requires ad hoc, scalable post-processing tools. We propose a physics-aware in-situ compression method using Gaussian Mixture Models (GMMs) to approximate electron and ion velocity distribution functions with a number of Gaussian components. This GMM-based method allows us to capture plasma features such as mean velocity and temperature, and it enables us to identify heating processes and beam generation. We first construct a histogram to reduce computational overhead and then apply GPU-accelerated, in-situ GMM fitting within iPIC3D, a large-scale implicit Particle-in-Cell simulator, ensuring real-time compression. The compressed representation is stored using the ADIOS 2 library, thus optimizing the I/O process. The GPU and histogramming implementation provides a significant speed-up with respect to fitting the GMM directly on particles (both in time and in memory required at run-time), enabling real-time compression. Compared to algorithms like SZ, MGARD, and BLOSC2, our GMM-based method takes a physics-based approach, retaining the physical interpretation of plasma phenomena such as beam formation, acceleration, and heating mechanisms. Our GMM algorithm achieves a compression ratio of up to $10^4$, requiring a processing time comparable to, or even lower than, standard compression engines.
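The histogram-then-fit pipeline described in the abstract can be sketched on the CPU as follows. This is an illustrative NumPy implementation of weighted expectation-maximization, not the GPU code in iPIC3D: the function name `fit_weighted_gmm`, the farthest-point initialization, the regularization value, and the synthetic two-beam data are all assumptions. The sizes (10,000 particles, a $64\times64$ histogram, two Gaussians) are borrowed from the Figure 1 caption.

```python
import numpy as np

def fit_weighted_gmm(points, weights, n_components=2, n_iter=100, reg=1e-6):
    """Weighted EM: each point is a bin center, each weight its bin count."""
    n, d = points.shape
    w = weights / weights.sum()
    # Assumed initialization: heaviest bin first, then farthest remaining bins.
    idx = [int(np.argmax(w))]
    for _ in range(1, n_components):
        dist = np.linalg.norm(points[:, None, :] - points[idx][None, :, :], axis=2)
        idx.append(int(np.argmax(dist.min(axis=1))))
    means = points[idx].astype(float)
    # Isotropic starting covariance with the average global variance.
    global_cov = np.cov(points.T, aweights=w)
    covs = np.array([np.eye(d) * np.trace(global_cov) / d] * n_components)
    pis = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: log responsibilities of each component for each bin.
        log_r = np.empty((n, n_components))
        for k in range(n_components):
            diff = points - means[k]
            maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(covs[k]), diff)
            _, logdet = np.linalg.slogdet(covs[k])
            log_r[:, k] = np.log(pis[k]) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: bin weights multiply the responsibilities.
        rw = r * w[:, None]
        nk = np.maximum(rw.sum(axis=0), 1e-12)
        pis = nk / nk.sum()
        means = (rw.T @ points) / nk[:, None]
        for k in range(n_components):
            diff = points - means[k]
            covs[k] = (rw[:, k, None] * diff).T @ diff / nk[k] + reg * np.eye(d)
    return pis, means, covs

# Synthetic stand-in for particle velocities: two counter-streaming beams.
rng = np.random.default_rng(0)
n_particles = 10_000
velocities = np.vstack([
    rng.normal([-2.0, 0.0], 0.5, size=(n_particles // 2, 2)),
    rng.normal([2.0, 0.0], 0.5, size=(n_particles // 2, 2)),
])

# Step 1: reduce the particles to a 64x64 histogram.
hist, xe, ye = np.histogram2d(velocities[:, 0], velocities[:, 1],
                              bins=64, range=[[-5, 5], [-5, 5]])
xc, yc = 0.5 * (xe[:-1] + xe[1:]), 0.5 * (ye[:-1] + ye[1:])
gx, gy = np.meshgrid(xc, yc, indexing="ij")
occupied = hist.ravel() > 0
centers = np.column_stack([gx.ravel(), gy.ravel()])[occupied]
counts = hist.ravel()[occupied]

# Step 2: fit the GMM to the occupied bins instead of the particles.
pis, means, covs = fit_weighted_gmm(centers, counts, n_components=2)

# Stored values: raw 2D velocities versus weight + center + symmetric covariance.
raw_values = n_particles * 2
gmm_values = 2 * (1 + 2 + 3)
ratio = raw_values / gmm_values
print("weights:", pis, "\ncenters:\n", means, "\nratio:", ratio)
```

The fit only touches the occupied bins, so its cost depends on the histogram resolution rather than on the particle count. The ratio printed here applies to this toy setup only and is not a result from the paper.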

Paper Structure

This paper contains 11 sections, 5 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: The distribution function of 10,000 particles can be approximated with a $64\times64$ histogram (central panel) or with a GMM that stores the weights, centers, and covariance matrices of two Gaussians (rightmost panel).
  • Figure 2: Magnetic field line at the x-point during magnetic reconnection. The region where we take the data, with dimensions $L_x=L_y=1.42\,d_i$, $L_z=2\,d_i$, is highlighted in orange.
  • Figure 3: Electron $uv$ velocity pdf at the x-point location. Left panel: 2D pdf obtained from the histogram; central panel: 2D pdf reconstructed by GMM; right panel: absolute difference between the true and reconstructed pdf.
  • Figure 4: Electron $uv$ velocity pdf at the x-point location. 1D plots obtained as slices of the 2D pdf. In blue the true pdf obtained from the histograms, in orange the GMM reconstructed pdf.
  • Figure 5: Ion $uw$ velocity pdf at the x-point location. Left panel: 2D pdf obtained from the histogram; central panel: 2D pdf reconstructed by GMM; right panel: absolute difference between the true and reconstructed pdf.
  • ...and 3 more figures
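The captions for Figures 3 to 5 compare a histogram pdf with its GMM reconstruction, and the TL;DR quotes JSD as the information-loss measure. A minimal sketch of that comparison follows, assuming the Jensen-Shannon divergence with base-2 logarithms evaluated on the histogram bins; the paper's exact definition is not given on this page. The helper names, the synthetic samples, and the use of the generating parameters as a stand-in for a fitted GMM are all hypothetical.

```python
import numpy as np

def gaussian_pdf_2d(points, mean, cov):
    """Evaluate a 2D Gaussian density at each row of points."""
    diff = points - mean
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return np.exp(-0.5 * maha) / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))

def jensen_shannon_divergence(p, q):
    """JSD in bits between two non-negative arrays (normalized internally)."""
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Synthetic "true" pdf: histogram of samples drawn from two beams.
rng = np.random.default_rng(1)
samples = np.vstack([
    rng.normal([-2.0, 0.0], 0.5, size=(100_000, 2)),
    rng.normal([2.0, 0.0], 0.5, size=(100_000, 2)),
])
hist, xe, ye = np.histogram2d(samples[:, 0], samples[:, 1],
                              bins=64, range=[[-5, 5], [-5, 5]])
xc, yc = 0.5 * (xe[:-1] + xe[1:]), 0.5 * (ye[:-1] + ye[1:])
gx, gy = np.meshgrid(xc, yc, indexing="ij")
centers = np.column_stack([gx.ravel(), gy.ravel()])
p_hist = hist.ravel()

# Stand-in for a fitted two-component GMM (the generating parameters).
cov = np.eye(2) * 0.25
q_two = 0.5 * gaussian_pdf_2d(centers, np.array([-2.0, 0.0]), cov) \
      + 0.5 * gaussian_pdf_2d(centers, np.array([2.0, 0.0]), cov)

# A single broad Gaussian with matching overall moments, for contrast.
q_one = gaussian_pdf_2d(centers, np.zeros(2), np.diag([4.25, 0.25]))

jsd_two = jensen_shannon_divergence(p_hist, q_two)
jsd_one = jensen_shannon_divergence(p_hist, q_one)
print("JSD, two components:", jsd_two)
print("JSD, one component: ", jsd_one)
```

With base-2 logarithms the JSD lies between 0 (identical distributions) and 1, which makes values comparable across grid cells and species. The single-Gaussian case shows how the metric penalizes a reconstruction that matches the overall moments but misses the beam structure.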