Exploiting Inter-Image Similarity Prior for Low-Bitrate Remote Sensing Image Compression

Junhui Li, Xingsong Hou

TL;DR

The paper addresses the challenge of low-bitrate remote sensing image compression by exploiting inter-image similarity priors. It introduces Code-RSIC, a decoding-end codebook framework that uses a pre-trained high-quality discrete codebook learned via VQGAN, a Transformer-based codebook lookup, and a hierarchical prior integration network to fuse priors with decoded features. The three-stage approach yields substantial gains in perceptual quality (lower FID and LPIPS) while maintaining competitive distortion metrics, demonstrating practical benefits for RS data transmission and storage. This method provides a flexible, retraining-free way to boost the perceptual fidelity of compressed RS imagery, particularly for texture-rich scenes, and highlights the potential of codebook priors in remote sensing applications.

Abstract

Deep learning-based methods have garnered significant attention in remote sensing (RS) image compression due to their superior performance. Most of these methods focus on enhancing the coding capability of the compression network and improving entropy model prediction accuracy. However, they typically compress and decompress each image independently, ignoring the significant inter-image similarity prior. In this paper, we propose a codebook-based RS image compression (Code-RSIC) method with a generated discrete codebook, which is deployed at the decoding end of a compression algorithm to provide an inter-image similarity prior. Specifically, we first pretrain a high-quality discrete codebook using the competitive generation model VQGAN. We then introduce a Transformer-based prediction model to align the latent features of the decoded images from an existing compression algorithm with the frozen high-quality codebook. Finally, we develop a hierarchical prior integration network (HPIN), which mainly consists of Transformer blocks and multi-head cross-attention modules (MCMs) that can query hierarchical priors from the codebook, thus enhancing the ability of the proposed method to decode texture-rich RS images. Extensive experimental results demonstrate that the proposed Code-RSIC significantly outperforms state-of-the-art traditional and learning-based image compression algorithms in terms of perceptual quality. The code will be available at https://github.com/mlkk518/Code-RSIC/
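To make the role of a discrete codebook concrete, the following is a minimal sketch of the nearest-neighbour lookup used in standard vector quantization, the mechanism that VQGAN-style codebooks build on. It is not the authors' implementation: the function names, the toy two-dimensional codebook, and the sample features are all assumptions chosen for illustration, and the abstract indicates that the second stage of Code-RSIC uses a Transformer-based predictor to match features to codebook entries rather than a plain distance search.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code_index(feature: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the codebook entry closest to `feature`."""
    return min(
        range(len(codebook)),
        key=lambda k: squared_distance(feature, codebook[k]),
    )


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[List[float]]]:
    """Replace each feature vector with its nearest codebook entry.

    Returns the chosen indices (the discrete code sequence) and the
    quantized vectors themselves.
    """
    indices = [nearest_code_index(f, codebook) for f in features]
    quantized = [list(codebook[k]) for k in indices]
    return indices, quantized


# Hypothetical toy data: three 2-D code vectors and three noisy features.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
features = [[0.1, -0.2], [0.9, 1.2], [0.2, 1.9]]

indices, quantized = quantize(features, codebook)
print(indices)    # discrete code sequence
print(quantized)  # features snapped to codebook entries
```

Because the codebook is shared across images, each index refers to a visual pattern learned from the whole training set, which is one way to read the "inter-image similarity prior" described above.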

Paper Structure (27 sections, 14 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: Illustration of the difference between existing image compression algorithms and the proposed method. Here the role of the codebook is to provide inter-image similarity prior to the decoding end of the compressor.
  • Figure 2: Visualization of inter-image similarity prior between two images.
  • Figure 3: Framework of the proposed Code-RSIC. We first learn a discrete codebook $\mathbf{C}$ and the decoder $D_H$ to store high-quality visual parts of RS images via self-reconstruction learning. Then, with the codebook and decoder $D_H$ frozen, we introduce a Transformer module [zhou2022towards] for code sequence prediction, modeling the global RS image composition of the decoded image from the compressor ELIC [he2022elic]. In addition, HPIN is developed to bridge the information flow from the LQ encoder $E_L$ to the decoder $D_H$ and to query prior information from the frozen $D_H$.
  • Figure 4: Some cropped images from the training set of the LoveDA dataset.
  • Figure 5: Rate-distortion and rate-perception curves of several image compression algorithms on the testing set of LoveDA.
  • ...and 5 more figures
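
The Figure 3 caption describes HPIN as querying prior information for the decoded features. As a rough illustration of that kind of query, here is a single-head scaled dot-product cross-attention in plain Python, where the queries stand in for decoded features and the keys and values stand in for prior features. This is a simplified sketch under stated assumptions, not the paper's MCM: it uses one head instead of several, omits the learned query, key, and value projections, and all names and toy values are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: Matrix, keys: Matrix, values: Matrix) -> List[List[float]]:
    """Single-head scaled dot-product cross-attention.

    Each query row attends over all key rows; the output row is the
    attention-weighted average of the value rows.
    """
    dim = len(keys[0])
    width = len(values[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return outputs


# Hypothetical toy data: one decoded-feature query and two prior entries.
decoded_queries = [[1.0, 0.0]]
prior_keys = [[1.0, 0.0], [1.0, 0.0]]
prior_values = [[0.0, 0.0], [2.0, 4.0]]

fused = cross_attention(decoded_queries, prior_keys, prior_values)
print(fused)
```

In this toy case both keys match the query equally, so the output is the plain average of the two value rows; with distinct keys, the prior entries most similar to the decoded feature would dominate the result.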