
Deep Lossless Image Compression via Masked Sampling and Coarse-to-Fine Auto-Regression

Tiantian Li, Qunbing Xia, Yue Li, Ruixiao Guo, Gaobo Yang

TL;DR

This work tackles lossless image compression by pairing a strong lossy compressor with a residual coder that progressively encodes the residual $R = X - \hat{X}$ over $T$ masked sampling iterations. Each iteration performs probability estimation, mask computation, and arithmetic coding, which yields a coarse-to-fine, multi-directional auto-regression that fuses context from surrounding directions rather than only from raster-order predecessors. The iteration count $T$ gives a tunable rate-latency trade-off. The residual is decomposed into MSB and LSB parts: the MSB part is coded with run-length encoding (RLE), while the LSB part is modeled by a neural predictor that draws on lossy priors and on context from $\hat{X}$, spatial neighborhoods, and cross-channel information. Experiments show competitive lossless compression on diverse datasets (e.g., Kodak and ImageNet64) at practical coding speeds, supporting the feasibility of scalable, single-model lossless coding that benefits from multi-directional context and progressive refinement.
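The MSB/LSB residual decomposition can be illustrated with a minimal sketch. The summary does not say how the split is defined, so the bit width `lsb_bits`, the centering offset `half`, and the helper names below are all assumptions chosen for illustration, not the authors' implementation. The sketch only shows why such a split pairs naturally with RLE: when the lossy reconstruction is close to the input, most residuals are small, so the MSB plane consists of long constant runs.

```python
from typing import List, Tuple


def split_residual(x: List[int], x_hat: List[int],
                   lsb_bits: int = 4) -> Tuple[List[int], List[int]]:
    """Split r = x - x_hat into MSB and LSB planes (hypothetical mapping)."""
    half = 1 << (lsb_bits - 1)          # assumed offset: centers small residuals
    mask = (1 << lsb_bits) - 1
    msb, lsb = [], []
    for a, b in zip(x, x_hat):
        u = (a - b) + half              # residuals in [-half, half) map to MSB 0
        msb.append(u >> lsb_bits)       # Python's >> floors, so negatives are safe
        lsb.append(u & mask)            # always in [0, 2**lsb_bits)
    return msb, lsb


def run_length_encode(symbols: List[int]) -> List[Tuple[int, int]]:
    """Plain run-length encoding: a list of (symbol, run length) pairs."""
    runs: List[Tuple[int, int]] = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            runs.append((s, 1))
    return runs


def merge_residual(msb: List[int], lsb: List[int], x_hat: List[int],
                   lsb_bits: int = 4) -> List[int]:
    """Invert split_residual: rebuild x exactly from both planes and x_hat."""
    half = 1 << (lsb_bits - 1)
    return [(m << lsb_bits) + l - half + b for m, l, b in zip(msb, lsb, x_hat)]


x = [120, 121, 119, 200]                # original pixel values
x_hat = [118, 121, 121, 190]            # lossy reconstruction
msb, lsb = split_residual(x, x_hat)
msb_runs = run_length_encode(msb)
restored = merge_residual(msb, lsb, x_hat)
```

In this toy input only the last pixel has a residual outside `[-8, 8)`, so the MSB plane collapses to two runs, while the LSB plane carries the detail that the method assigns to the learned predictor.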

Abstract

Learning-based lossless image compression employs pixel-based or subimage-based auto-regression for probability estimation and achieves strong performance. However, existing works consider context dependencies in one direction only, namely the symbols that precede the current symbol in raster order. We argue that dependencies between the current symbol and future symbols should also be exploited. In this work, we propose a deep lossless image compression method based on masked sampling and coarse-to-fine auto-regression. It combines lossy reconstruction with progressive residual compression, fusing contexts from multiple directions in a way that is more consistent with human perception. Specifically, the residuals are decomposed via $T$ iterations of masked sampling, and each iteration consists of three steps: 1) probability estimation, 2) mask computation, and 3) arithmetic coding. The iterative process progressively refines the prediction and gradually reveals the real image. Extensive experiments show that, compared with existing traditional and learned lossless compression methods, our method achieves comparable compression performance across a wide range of datasets, with competitive coding speed and greater flexibility.
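The three-step iteration described in the abstract can be sketched on a toy 1-D signal. Everything specific here is an assumption: the neighbor-histogram estimator stands in for the neural probability model, the fixed stride schedule in `compute_mask` stands in for the mask computation (which the abstract does not specify), and the ideal code length $-\log_2 p$ stands in for a real arithmetic coder. The sketch only demonstrates the structure: symbols coded in earlier iterations become context on both sides of the symbols coded later.

```python
import math
from typing import List, Optional, Tuple


def estimate_probs(known: List[Optional[int]], i: int, alphabet: int,
                   window: int = 2) -> List[float]:
    """Step 1 stand-in: smoothed histogram of already-coded neighbors of i,
    taken from both the left and the right (multi-directional context)."""
    counts = [1.0] * alphabet                       # Laplace smoothing
    for j in range(max(0, i - window), min(len(known), i + window + 1)):
        s = known[j]
        if j != i and s is not None:
            counts[s] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def compute_mask(known: List[Optional[int]], t: int, T: int) -> List[int]:
    """Step 2 stand-in: a coarse-to-fine stride schedule (hypothetical rule).
    The stride reaches 1 at t = T - 1, so every symbol is eventually coded."""
    stride = 1 << (T - 1 - t)
    return [i for i, s in enumerate(known) if s is None and i % stride == 0]


def progressive_code_length(residual: List[int], alphabet: int,
                            T: int) -> Tuple[List[float], List[Optional[int]]]:
    """Run T iterations and return (ideal bits per iteration, decoded symbols)."""
    known: List[Optional[int]] = [None] * len(residual)
    bits_per_iter: List[float] = []
    for t in range(T):
        # 1) probability estimation for every still-unknown position
        probs = {i: estimate_probs(known, i, alphabet)
                 for i, s in enumerate(known) if s is None}
        # 2) mask computation: which positions are coded in this iteration
        mask = compute_mask(known, t, T)
        # 3) "arithmetic coding": accumulate the ideal cost -log2 p(symbol)
        bits_per_iter.append(sum(-math.log2(probs[i][residual[i]]) for i in mask))
        for i in mask:
            known[i] = residual[i]                  # now usable as context
    return bits_per_iter, known


bits_one_pass, decoded_one = progressive_code_length([0] * 8, alphabet=2, T=1)
bits_three, decoded_three = progressive_code_length([0] * 8, alphabet=2, T=3)
```

Because the mask depends only on positions and the probabilities depend only on previously coded symbols, a decoder can replay the same schedule. With `T=1` every symbol is coded without context in a single pass; larger `T` gives later symbols more context at the cost of more sequential passes, which is the rate-latency trade-off the method exposes through $T$.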


Paper Structure

This paper contains 20 sections, 11 equations, 6 figures, 3 tables, 2 algorithms.

Figures (6)

  • Figure 1: Main components of the proposed framework RC via $T$ times masked sampling.
  • Figure 2: Detailed architecture of the residual probability estimation network $\Phi$.
  • Figure 3: Latency analysis.
  • Figure 4: Visualization of coarse-to-fine auto-regressive models. We show the context conditioned on at time $t$, i.e., the encoded/decoded symbols. Unknown symbols are replaced by white pixel dots.
  • Figure 5: Comparison results (scaled bpsp) for each component and each iteration on the Kodak dataset. The scaled bpsp is the total number of bits in iteration $t$ divided by the number of symbols coded in iteration $t$, rather than by the total number of pixels $N$.
  • ...and 1 more figure