Rate-Distortion Bounds for Heterogeneous Random Fields on Finite Lattices

Sujata Sinha, Vishwas Rao, Robert Underwood, David Lenz, Sheng Di, Franck Cappello, Lingjia Liu

TL;DR

A finite-blocklength rate-distortion framework for heterogeneous random fields on finite lattices is introduced, explicitly accounting for the tile-based architectures used in high-performance scientific compressors.

Abstract

Since Shannon's foundational work, rate-distortion theory has defined the fundamental limits of lossy compression. Classical results, derived for memoryless and stationary ergodic sources in the asymptotic regime, have shaped both transform and predictive coding architectures, as well as practical standards such as JPEG. Finite-blocklength refinements, initiated by the non-asymptotic achievability and converse bounds of Kostina and Verdú, provide precise characterizations under excess-distortion probability constraints, but primarily for memoryless or statistically homogeneous models. In contrast, practical error-bounded lossy compressors for scientific computing, such as SZ, ZFP, MGARD, and SPERR, are designed for finite, high-dimensional, spatially correlated, and statistically heterogeneous random fields. These compressors partition data into fixed-size tiles that are processed independently, making tile size a central architectural constraint. Existing finite-blocklength analyses do not address structural heterogeneity, finite-lattice effects, or tiling constraints. This paper introduces a finite-blocklength rate-distortion framework for heterogeneous random fields on finite lattices, explicitly accounting for the tile-based architectures used in high-performance scientific compressors. The field is modeled as piecewise homogeneous with regionwise stationary second-order statistics, and tiling constraints are incorporated directly into the source model. Under an excess-distortion probability criterion, we establish non-asymptotic achievability and converse bounds and derive a second-order expansion that quantifies the impact of spatial correlation, region geometry, heterogeneity, and tile size on the rate and dispersion.
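
For orientation, the classical homogeneous baseline that the abstract contrasts against can be sketched numerically. The snippet below is a minimal illustration, not the paper's piecewise or tile-aware bound: it assumes an i.i.d. Gaussian source with MSE distortion and uses the well-known Gaussian approximation $R(n,d,\epsilon) \approx R(d) + \sqrt{V(d)/n}\,Q^{-1}(\epsilon)$ with $V(d) = \tfrac{1}{2}$ nats$^2$, dropping the $O(\log n / n)$ term. The function names are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist


def gaussian_rd_bits(variance, d):
    """Shannon rate-distortion function of an i.i.d. Gaussian source
    under MSE distortion, in bits per sample."""
    if d >= variance:
        return 0.0
    return 0.5 * math.log2(variance / d)


def gaussian_fbl_rate_bits(variance, d, n, eps):
    """Second-order (Gaussian) approximation of the minimum rate at
    blocklength n and excess-distortion probability eps.

    Assumption: memoryless homogeneous Gaussian source, 0 < d < variance.
    The O(log n / n) correction term is ignored.
    """
    if not 0.0 < d < variance:
        raise ValueError("approximation assumes 0 < d < variance")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    # Dispersion of the Gaussian source: 1/2 nats^2, converted to bits^2.
    v_bits = 0.5 * math.log2(math.e) ** 2
    # Q^{-1}(eps) is the (1 - eps) quantile of the standard normal.
    q_inv = NormalDist().inv_cdf(1.0 - eps)
    return gaussian_rd_bits(variance, d) + math.sqrt(v_bits / n) * q_inv


if __name__ == "__main__":
    # Rate penalty shrinks roughly as 1/sqrt(n) as the block grows.
    for n in (64, 512, 4096):
        print(n, gaussian_fbl_rate_bits(1.0, 0.25, n, 0.1))
```

In this simplified picture, `n` plays the role of the number of samples coded jointly, which is why small independent tiles pay a larger rate penalty; the paper's framework refines this for regionwise statistics and spatial correlation.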

Paper Structure

This paper contains 46 sections, 4 theorems, 89 equations, 2 figures, 2 algorithms.

Key Result

Theorem 5.1

Fix arbitrary reproduction distributions $\{P_{\hat{X}_r}\}_{r\in\mathcal{R}}$ on $\mathbb{R}^{\mathcal{S}_r}$ and codebook sizes $\{M_r\}_{r\in\mathcal{R}}$. Then there exists an $(\mathcal{S},\{M_r\}_{r\in\mathcal{R}})$-code in the sense of the source coding framework section such that the excess

Figures (2)

  • Figure 1: Bounds on $R(n,d,\epsilon)$ for a piecewise homogeneous Gaussian source with MSE distortion
  • Figure 2: Piecewise modeling and tile-aware finite-blocklength RD analysis of heterogeneous scientific fields. (a) Spatially heterogeneous NYX field. (b) Region partition defining a piecewise homogeneous Gaussian model. (c) Finite-blocklength RD bounds under classical homogeneous and piecewise models together with empirical compressor curves, demonstrating the impact of heterogeneity and tile granularity. (d) Zoomed-in version of (c) in the low-distortion regime, where ZFP operates, to enable a more detailed assessment of its empirical RD performance. (e) Tile-size scaling of piecewise RD bounds and SPERR performance, revealing structural correlation scales and the statistical–architectural trade-off governing compressibility. Note: The absence of portions of the empirical compressor curves (dashed lines) at certain distortion levels reflects feasibility limits intrinsic to the corresponding compressor configurations.

Theorems & Definitions (8)

  • Theorem 5.1: Achievability bound
  • proof
  • Theorem 5.2: Converse bound
  • proof
  • Theorem 6.1: Second-order asymptotics
  • proof
  • Lemma 1: Regionwise reduction
  • proof