Map-Assisted Remote-Sensing Image Compression at Extremely Low Bitrates
Yixuan Ye, Ce Wang, Wanjie Sun, Zhenzhong Chen
TL;DR
The paper addresses remote-sensing image compression at extremely low bitrates, where standard codecs fail to preserve semantic structure. It introduces Map-Assisted Generative Compression (MAGC), a two-stage framework built around a pre-trained diffusion model with strong priors. The framework works with a latent representation $z_0$, its compressed counterpart $\tilde{z}$, and vector maps $m$ that are processed by a semantic adapter. In stage one, a VAE-based latent compressor with hyperpriors and a SPADE-conditioned transform produces $\tilde{z}$, which serves as implicit guidance. In stage two, a conditional diffusion model combines the implicit guidance from $\tilde{z}$ with explicit guidance from $m$, and a pre-trained SD decoder reconstructs a semantically accurate image from the result. Experiments on remote-sensing data show that MAGC achieves better perceptual quality (LPIPS, DISTS, FID, MUSIQ) and higher semantic segmentation performance (mIoU) at ultra-low bitrates than standard codecs and prior learning-based methods. The authors state that the dataset and code will be made publicly available.
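The data flow summarized above can be written compactly as follows. This is only a notational sketch: $z_0$, $\tilde{z}$, and $m$ come from the text, while the input image $x$, the reconstruction $\hat{x}$, the denoised latent $\hat{z}_0$, and the operator names $\mathcal{E}$, $\mathcal{C}$, $\mathcal{A}$, $\mathcal{G}$, and $\mathcal{D}$ are assumed here for illustration and may not match the paper's own notation.

```latex
\begin{align*}
  z_0       &= \mathcal{E}(x)
            && \text{(stage 1: image to latent representation)} \\
  \tilde{z} &= \mathcal{C}(z_0)
            && \text{(stage 1: VAE-based latent compressor with hyperprior; the bitstream is produced here)} \\
  \hat{z}_0 &= \mathcal{G}\bigl(\tilde{z},\, \mathcal{A}(m)\bigr)
            && \text{(stage 2: conditional diffusion, implicit guidance } \tilde{z}\text{, explicit guidance } m\text{)} \\
  \hat{x}   &= \mathcal{D}(\hat{z}_0)
            && \text{(pre-trained SD decoder)}
\end{align*}
```

Here $\mathcal{A}$ stands for the semantic adapter applied to the vector maps. Whether the maps also condition $\mathcal{C}$ is not stated in this summary, so that dependence is left out.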
Abstract
Remote-sensing (RS) image compression at extremely low bitrates has long been a challenging task in practical scenarios such as edge-device storage and narrow-bandwidth transmission. Generative models, including VAEs and GANs, have been explored to compress RS images into extremely low-bitrate streams. However, these generative models struggle to reconstruct visually plausible images because extremely low-bitrate image compression is highly ill-posed. To this end, we propose an image compression framework that utilizes a pre-trained diffusion model with powerful natural image priors to achieve high-realism reconstructions. However, diffusion models tend to hallucinate small structures and textures because of the significant information loss at limited bitrates. Thus, we introduce vector maps as semantic and structural guidance and propose a novel image compression approach named Map-Assisted Generative Compression (MAGC). MAGC employs a two-stage pipeline to compress and decompress RS images at extremely low bitrates. The first stage maps an image into a latent representation, which is then further compressed in a VAE architecture to reduce the bitrate and serves as implicit guidance in the subsequent diffusion process. The second stage runs a conditional diffusion model that uses this implicit guidance together with explicit semantic guidance to generate a visually pleasing and semantically accurate result. Quantitative and qualitative comparisons show that our method outperforms standard codecs and other learning-based methods in terms of perceptual quality and semantic accuracy. The dataset and code will be publicly available at https://github.com/WHUyyx/MAGC.
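To give a sense of scale for "extremely low bitrates", the sketch below computes bits per pixel (bpp), the usual measure of image-codec bitrate, from a compressed size and the image dimensions. The helper names and the example numbers are hypothetical and chosen only for illustration; they are not figures reported by the paper.

```python
def bits_per_pixel(num_bytes: int, height: int, width: int) -> float:
    """Return the bitrate in bits per pixel for a compressed image."""
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    return 8 * num_bytes / (height * width)


def byte_budget(target_bpp: float, height: int, width: int) -> int:
    """Return the largest whole number of bytes that stays at or below target_bpp."""
    if height <= 0 or width <= 0:
        raise ValueError("image dimensions must be positive")
    return int(target_bpp * height * width // 8)


# Illustrative values only (assumed, not taken from the paper):
# a 512 x 512 tile stored in 1 KiB.
example_bpp = bits_per_pixel(1024, 512, 512)
example_budget = byte_budget(example_bpp, 512, 512)
```

With these assumed numbers, a 512 x 512 tile stored in 1024 bytes corresponds to 0.03125 bpp, which illustrates how little information is available to a decoder in this regime and why strong generative priors and side information such as vector maps become useful.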
