HoSZp: An Efficient Homomorphic Error-bounded Lossy Compressor for Scientific Data
Tripti Agarwal, Sheng Di, Jiajun Huang, Yafan Huang, Ganesh Gopalakrishnan, Robert Underwood, Kai Zhao, Xin Liang, Guanpeng Li, Franck Cappello
TL;DR
HoSZp introduces a homomorphic error-bounded lossy compressor that enables arithmetic on compressed scientific data without full decompression. It extends the CPU-based SZp pipeline with a lightweight three-stage process (Quantization, Decorrelation, Blockwise Fixed-length Encoding) and proves that univariate and multivariate operations on compressed data are homomorphic under the error bound $\epsilon$. Extensive experiments on four real HPC datasets demonstrate substantial throughput gains (up to $2.08\times$ in distributed RTM workloads) with competitive compression ratios, validating both performance and correctness. The approach reduces memory footprints and enables in-place computations for large-scale scientific workflows, with promising potential for additional homomorphic measures in future work.
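To make the idea of operating on compressed data concrete, the following minimal sketch mimics the first two stages (quantization and decorrelation) in plain Python and adds two arrays in the decorrelated integer domain. It is an illustration, not the HoSZp implementation: the function names, the quantization step of $2\epsilon$, and the use of first-order differences for decorrelation are assumptions, and the blockwise fixed-length encoding stage is omitted entirely.

```python
# Illustrative sketch only; names and the 2*eps step are assumptions,
# not taken from the HoSZp code base.

def quantize(values, eps):
    """Map each value to an integer bin of width 2*eps (error <= eps)."""
    step = 2.0 * eps
    return [round(v / step) for v in values]

def dequantize(bins, eps):
    """Reconstruct approximate values from integer bins."""
    step = 2.0 * eps
    return [k * step for k in bins]

def decorrelate(bins):
    """First-order differences; the first bin is stored as-is."""
    return [bins[0]] + [bins[i] - bins[i - 1] for i in range(1, len(bins))]

def recorrelate(deltas):
    """Inverse of decorrelate: running (prefix) sum of the deltas."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

def add_compressed(da, db):
    """Element-wise addition performed directly on the delta streams."""
    return [x + y for x, y in zip(da, db)]

def negate_compressed(d):
    """Negation performed directly on the delta stream."""
    return [-x for x in d]

eps = 0.01
a = [1.0, 1.5, 2.25, -0.75]
b = [0.3, -0.2, 0.05, 4.0]

qa, qb = quantize(a, eps), quantize(b, eps)
da, db = decorrelate(qa), decorrelate(qb)

# Operate without reconstructing floating-point data first.
q_sum = recorrelate(add_compressed(da, db))
q_neg = recorrelate(negate_compressed(da))
```

Because differencing is linear, adding or negating the delta streams yields the same integer bins as adding or negating the quantized values directly, so the result matches what decompress-then-operate would produce on the reconstructed data.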
Abstract
Error-bounded lossy compression is a critical technique for significantly reducing the sheer volume of simulation data produced by high-performance computing (HPC) scientific applications while controlling data distortion according to a user-specified error bound. In many real-world use cases, users must perform computational operations directly on the compressed data, a capability known as homomorphic compression. However, no existing error-bounded lossy compressor supports homomorphism, so users inevitably incur undesired decompression costs. In this paper, we propose a novel homomorphic error-bounded lossy compressor, called HoSZp, which supports not only error bounding but also efficient computations (including negation, addition, multiplication, mean, and variance) on the compressed data without a complete decompression step. To the best of our knowledge, this is the first such attempt. We develop several optimization strategies to maximize the overall compression ratio and execution performance. We evaluate HoSZp against other state-of-the-art lossy compressors on multiple real-world scientific application datasets.
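As a rough illustration of how a statistic such as the mean or variance can be obtained without reconstructing the floating-point array, the sketch below computes both from integer quantization bins and rescales the result once at the end. The helper names and the bin width of $2\epsilon$ are hypothetical choices made for this example; the actual HoSZp algorithms operate on its encoded block format and may differ.

```python
# Illustrative sketch only; helper names and the 2*eps bin width are
# assumptions made for this example.

def quantize_bins(values, eps):
    """Integer bins of width 2*eps, so each value is within eps of its bin."""
    step = 2.0 * eps
    return [round(v / step) for v in values]

def mean_from_bins(bins, eps):
    """Mean of the reconstructed data, computed from integer bins."""
    step = 2.0 * eps
    return step * sum(bins) / len(bins)

def variance_from_bins(bins, eps):
    """Population variance of the reconstructed data from integer bins."""
    step = 2.0 * eps
    n = len(bins)
    s1 = sum(bins)                  # integer sum
    s2 = sum(k * k for k in bins)   # integer sum of squares
    # Var = step^2 * (E[k^2] - E[k]^2), evaluated with exact integers
    # in the numerator before the single floating-point division.
    return step * step * (n * s2 - s1 * s1) / (n * n)

eps_stat = 0.01
data = [1.0, 1.5, 2.25, -0.75, 0.3]
bins = quantize_bins(data, eps_stat)

mean_c = mean_from_bins(bins, eps_stat)
var_c = variance_from_bins(bins, eps_stat)
```

Since every reconstructed value lies within `eps_stat` of its original, the mean computed this way also lies within `eps_stat` of the mean of the original data.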
