Foundation Model for Lossy Compression of Spatiotemporal Scientific Data

Xiao Li, Jaemoon Lee, Anand Rangarajan, Sanjay Ranka

TL;DR

This work tackles the challenge of lossy compression for high-dimensional, variable-physics spatiotemporal scientific data by introducing a foundation model that combines a hyperprior-augmented variational autoencoder with a super-resolution decoder. The approach extends VAEs to 3D to capture spatiotemporal correlations and employs a dedicated SR module in the decoder to enhance reconstruction quality, all while enforcing error guarantees through a block-based PCA residual bound. The FM demonstrates strong generalization to unseen domains and data shapes, achieving up to $4\times$ higher compression ratios after domain-specific fine-tuning and approximately $30\%$ additional gains from the SR component. This framework offers substantial reductions in storage and transmission costs for large-scale simulations while preserving data integrity, with practical implications for HPC workflows and scientific analytics.

Abstract

We present a foundation model (FM) for lossy scientific data compression, combining a variational autoencoder (VAE) with a hyper-prior structure and a super-resolution (SR) module. The VAE framework uses hyper-priors to model latent space dependencies, enhancing compression efficiency. The SR module refines low-resolution representations into high-resolution outputs, improving reconstruction quality. By alternating between 2D and 3D convolutions, the model efficiently captures spatiotemporal correlations in scientific data while maintaining low computational cost. Experimental results demonstrate that the FM generalizes well to unseen domains and varying data shapes, achieving up to 4 times higher compression ratios than state-of-the-art methods after domain-specific fine-tuning. The SR module improves compression ratio by 30 percent compared to simple upsampling techniques. This approach significantly reduces storage and transmission costs for large-scale scientific simulations while preserving data integrity and fidelity.
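
To make the headline figures concrete, the following minimal sketch works through the arithmetic. It assumes the usual definition of compression ratio as original size divided by compressed size. The 1 TB dataset and the baseline ratio of 100 are hypothetical values chosen for illustration, not results from the paper, and the function names are likewise illustrative.

```python
def compressed_size(original_bytes: float, ratio: float) -> float:
    """Compressed size, assuming ratio = original size / compressed size."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return original_bytes / ratio


def size_reduction(ratio_gain: float) -> float:
    """Fraction by which the compressed output shrinks when the
    compression ratio is multiplied by ratio_gain."""
    if ratio_gain <= 0:
        raise ValueError("ratio gain must be positive")
    return 1.0 - 1.0 / ratio_gain


ORIGINAL_BYTES = 1e12    # hypothetical 1 TB simulation output
BASELINE_RATIO = 100.0   # hypothetical ratio of a baseline compressor

baseline = compressed_size(ORIGINAL_BYTES, BASELINE_RATIO)
fine_tuned = compressed_size(ORIGINAL_BYTES, 4 * BASELINE_RATIO)

print(f"baseline output:   {baseline / 1e9:.1f} GB")
print(f"4x ratio output:   {fine_tuned / 1e9:.1f} GB")
print(f"4x ratio shrinks the output by {size_reduction(4.0):.0%}")
print(f"1.3x ratio shrinks the output by {size_reduction(1.3):.1%}")
```

Under these assumptions, a 4 times higher ratio cuts the compressed output by 75 percent, while a 30 percent higher ratio cuts it by roughly 23 percent, since the saving is $1 - 1/g$ for a ratio gain $g$.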

Paper Structure

This paper contains 18 sections, 3 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Overview of the architecture of the foundation model (FM). 'Conv 2D/3D $\downarrow$' denotes a convolution operation with stride 2, while 'ConvTran 2D/3D $\uparrow$' denotes a transposed convolutional layer. 'MTD' refers to merging the temporal dimension, and 'STD' refers to splitting the temporal dimension. 'Q' represents rounding quantization. 'AE' and 'AD' denote arithmetic encoding and decoding, respectively. Leaky ReLU is used for nonlinearity.
  • Figure 2: The architecture of the BCB module.
  • Figure 3: FM evaluation on 5 variables of E3SM
  • Figure 4: Visualization of reconstructed data for our method, SZ3, and ZFP at the compression ratio of 100. The first row shows pressure data (PSL), and the second row shows temperature data (T200).
  • Figure 5: Adaptability to data dimensions
  • ...and 1 more figure
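
The Figure 1 caption names stride-2 convolutions, transposed convolutions, and rounding quantization. The sketch below shows what those operations do to a spatial size and to latent values, using the standard output-size formulas for convolution and transposed convolution. The kernel size, padding, and output padding are assumptions (the caption gives only the stride), and all function names are hypothetical rather than taken from the paper.

```python
def conv_out(n: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Output length of a convolution along one axis (standard formula)."""
    return (n + 2 * padding - kernel) // stride + 1


def conv_transpose_out(n: int, kernel: int = 3, stride: int = 2,
                       padding: int = 1, output_padding: int = 1) -> int:
    """Output length of a transposed convolution along one axis."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding


def quantize(latents: list[float]) -> list[int]:
    """Rounding quantization ('Q'): map each latent to the nearest integer.
    Python's round() resolves exact .5 ties to the even neighbour."""
    return [round(v) for v in latents]


# A stride-2 convolution halves a 64-sample axis (with the assumed
# kernel=3, padding=1); the matching transposed convolution restores it.
down = conv_out(64)
up = conv_transpose_out(down)
print(down, up)

# Quantization is the lossy step: each value moves by at most 0.5.
latents = [0.2, -1.7, 2.4, 3.49]
symbols = quantize(latents)
max_err = max(abs(a - b) for a, b in zip(latents, symbols))
print(symbols, max_err <= 0.5)
```

The integer symbols produced by the quantizer are what an arithmetic coder ('AE' in the caption) would then encode losslessly, so quantization is the only place in this sketch where information is discarded.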