Error Analysis of ZFP Compression for Floating-Point Data

James Diffenderfer, Alyson Fox, Jeffrey Hittinger, Geoffrey Sanders, Peter Lindstrom

TL;DR

This work provides a rigorous forward-error analysis for ZFP lossy compression of floating-point data by constructing bit-vector spaces and per-step operators that mirror the ZFP pipeline. It establishes a concrete forward-error bound for fixed-precision mode and extends the bound to fixed-accuracy and fixed-rate modes, quantifying how the number of kept bit-planes and the block dimension affect the error via the constant $K_{\beta}$. The analysis is complemented by numerical experiments on synthetic $4^d$ blocks and real 3D fields (e.g., Rayleigh–Taylor viscosity-density data), showing that the observed errors stay within the theoretical bounds and offering guidance for parameter choices in HPC workflows. Overall, the paper enables principled use of ZFP in iterative simulations by bounding compression-induced errors and clarifying how to select $\beta$, block size, and precision to achieve a desired accuracy.

Abstract

Compression of floating-point data will play an important role in high-performance computing as data bandwidth and storage become dominant costs. Lossy compression of floating-point data is powerful, but theoretical results are needed to bound its errors when used to store look-up tables, simulation results, or even the solution state during the computation. In this paper, we analyze the round-off error introduced by ZFP, a lossy compression algorithm. The stopping criterion for ZFP depends on the compression mode specified by the user: fixed rate, fixed accuracy, or fixed precision [P. Lindstrom, Fixed-rate compressed floating-point arrays, IEEE Transactions on Visualization and Computer Graphics, 2014]. While most of our discussion is focused on the fixed-precision mode of ZFP, we establish a bound on the error introduced by all three compression modes. In order to tightly capture the error, we first introduce a vector space that allows us to work with binary representations of components. Under this vector space, we define operators that implement each step of the ZFP compression and decompression to establish a bound on the error caused by ZFP. To conclude, numerical tests are provided to demonstrate the accuracy of the established bounds.
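
To make the idea of keeping a fixed number of bit-planes concrete, the following is a minimal Python sketch, not the ZFP algorithm and not the paper's bound. It assumes a heavily simplified scheme: the block is aligned to the exponent of its largest value, converted to integers with `q` magnitude bits, and all but the top `beta` bit-planes are zeroed. ZFP's decorrelating transform, coefficient reordering, and embedded coding are omitted, and the names `truncate_block` and `max_abs_error` are hypothetical. For this simplified scheme only, the componentwise error stays below $2^{1-\beta}$ times the largest magnitude in the block.

```python
import math
import random


def truncate_block(block, beta, q=62):
    """Keep roughly the top `beta` bit-planes of a block (simplified; not ZFP)."""
    largest = max(abs(x) for x in block)
    if largest == 0.0:
        return list(block)
    # frexp gives largest = m * 2**e_max with 0.5 <= m < 1.
    e_max = math.frexp(largest)[1]
    # Align every value to the common exponent so that |x * scale| < 2**q.
    scale = 2.0 ** (q - e_max)
    drop = max(q - beta, 0)  # number of low-order bit-planes discarded
    out = []
    for x in block:
        # Scaling by a power of two is exact; int() drops bits below 2**0.
        magnitude = int(abs(x) * scale)
        magnitude = (magnitude >> drop) << drop  # zero the discarded bit-planes
        out.append(math.copysign(magnitude / scale, x))
    return out


def max_abs_error(block, approx):
    """Largest componentwise absolute difference between two blocks."""
    return max(abs(a - b) for a, b in zip(block, approx))


random.seed(0)
block = [random.uniform(-1.0, 1.0) for _ in range(16)]  # one synthetic 4^2 block
largest = max(abs(x) for x in block)
for beta in (8, 16, 24, 32):
    err = max_abs_error(block, truncate_block(block, beta))
    # Print beta, the observed error, and the bound for this simplified scheme.
    print(beta, err, 2.0 ** (1 - beta) * largest)
```

The printed error shrinks as `beta` grows. The bound analyzed in the paper additionally accounts for the steps omitted here, which is where the constant $K_{\beta}$ and the block dimension enter.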

Paper Structure

This paper contains 26 sections, 17 theorems, 78 equations, 11 figures, 3 tables.

Key Result

Lemma 3.2

$(\mathcal{B}, \oplus_\mathcal{B}, \odot_\mathcal{B})$ and $(\mathcal{N}, \oplus_\mathcal{N}, \odot_\mathcal{N})$ are fields with additive and multiplicative identities $0_{\mathcal{B}} := (0,0_{\mathcal{C}})$ and $1_{\mathcal{B}} := (0, 1_{\mathcal{C}})$ and $0_{\mathcal{N}} :

Figures (11)

  • Figure 1: 2-d Example with single precision: componentwise relative error (top) and block relative error (bottom) with respect to the precision parameter ($\beta$) for $e_{max}-e_{min} \in \{ 0,7,14\}$. The blue band represents the sampled maximum and minimum error, the red line depicts the theoretical bound, and the dashed green line represents the asymptotic behavior of the theoretical bound.
  • Figure 1: Contour plot of $\log_{10} (K_\beta)$ for $\beta \in [1, 64]$ and dimension $d \in [1, 5]$ with $k = 53$ and $q = 62$.
  • Figure 2: 2-d Example with single precision: componentwise relative error (top) and block relative error (bottom) with respect to the difference in exponents ($e_{max}-e_{min}$) for $\beta \in \{12,22,32\}$. The blue band represents the sampled maximum and minimum error and the red line depicts the theoretical bound.
  • Figure 3: 2-d Example with double precision: componentwise relative error (top) and block relative error (bottom) with respect to the precision parameter ($\beta$) for $e_{max}-e_{min} \in \{0,7,14\}$. The blue band represents the sampled maximum and minimum error, the red line depicts the theoretical bound, and the dashed green line represents the asymptotic behavior of the theoretical bound.
  • Figure 4: 2-d Example with double precision: componentwise relative error (top) and block relative error (bottom) with respect to the difference in exponents ($e_{max}-e_{min}$) for $\beta \in \{32,48,64\}$. The blue band represents the sampled maximum and minimum error and the red line depicts the theoretical bound.
  • ...and 6 more figures

Theorems & Definitions (33)

  • Definition 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Definition 3.4
  • Definition 3.5
  • Lemma 3.6
  • Proof 1
  • Lemma 3.7
  • Proof 2
  • Lemma 3.8
  • ...and 23 more