RWZC: A Model-Driven Approach for Learning-based Robust Wyner-Ziv Coding

Yuxuan Shi, Shuo Shao, Yongpeng Wu, Wenjun Zhang, Merouane Debbah

TL;DR

This work tackles robust Wyner–Ziv coding for distributed image transmission when source correlation is non-stationary and only decoder-side information is available. It introduces an affine correlation model based on source state information (SSI) and a three-part RWZC framework: SSI-driven correlation estimation/decoupling, rate-adaptive joint source–channel coding, and SSI-aided reconstruction with warping-prediction. Key contributions include a PTL-based SSI-aware masking module, an entropy-model-based JSCC codec with rate adaptation, and a decoder that fuses warped side information and SSI to improve perceptual quality and robustness across datasets. Experiments on KITTI and UDIS-D show competitive objective metrics and clear perceptual gains, especially under non-stationary correlations, highlighting RWZC’s practical potential for bandwidth-constrained distributed imaging.

Abstract

This paper proposes a novel learning-based Wyner-Ziv coding framework for distributed image transmission, where the correlated source is available only at the receiver. Unlike other learnable frameworks, our approach is robust to non-stationary source correlation, in which the overlapping information between image pairs varies. Specifically, we first model the affine relationship between correlated images and leverage this model for learnable mask generation and rate-adaptive joint source-channel coding. We also provide a warping-prediction network to remove the distortion introduced by channel interference and the affine transform. Intuitively, the observed performance improvement comes largely from focusing on the simple geometric relationship rather than the complex joint distribution between the sources. Numerical results show that our framework achieves a 1.5 dB gain in PSNR and a 0.2 improvement in MS-SSIM, along with markedly better perceptual metrics, compared to state-of-the-art methods on real-world samples with non-stationary correlations.
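
To make the geometric idea concrete, the following is a minimal sketch of how a known homography between two views could be turned into a binary overlap mask. It illustrates only the simple, non-learnable form of homography-based masking, not the learnable mask generator the paper proposes. The function name `overlap_mask`, the assumption that a 3x3 homography `H` is already available, and the toy image sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def overlap_mask(H, src_shape, dst_shape):
    """Return a boolean mask over the source image (hypothetical helper).

    A source pixel is marked True when its position, mapped through the
    3x3 homography H, lands inside the bounds of the side-information image.
    """
    h, w = src_shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous pixel coordinates, shape (3, h*w): rows are x, y, 1.
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=0)
    mapped = H @ pts
    z = mapped[2]
    valid = z > 1e-12                    # drop points mapped to or behind infinity
    safe_z = np.where(valid, z, 1.0)     # avoid division by zero
    u = mapped[0] / safe_z               # mapped x coordinate
    v = mapped[1] / safe_z               # mapped y coordinate
    dh, dw = dst_shape
    inside = valid & (u >= 0) & (u <= dw - 1) & (v >= 0) & (v <= dh - 1)
    return inside.reshape(h, w)

# Toy example (assumed values): the second view is shifted 2 pixels in x.
H_shift = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
mask = overlap_mask(H_shift, (4, 6), (4, 6))
overlap_ratio = mask.mean()  # fraction of source pixels also covered by the other view
```

In this toy case only the four leftmost columns of the 4x6 source are covered by the shifted view. A fixed mask of this kind depends entirely on the accuracy of `H`, which is one plausible reason a learnable mask is preferred when parallax is large or varies.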

Paper Structure (26 sections, 11 equations, 17 figures, 1 table, 1 algorithm)

Figures (17)

  • Figure 1: Illustration of image pairs with different correlation/parallax (from the UDIS-D dataset), where parallax is the visual deviation between two images caused by different viewing positions.
  • Figure 2: An overview of the RWZC framework.
  • Figure 3: Feature-matching-based homography estimation, in which $(p_1,q_1)$ and $(p_2,q_2)$ are a pair of matching points.
  • Figure 4: Mask generation via homography mapping.
  • Figure 5: Failure cases of non-learnable masking on image pairs with large parallax (first row) and varying parallax (second row).
  • ...and 12 more figures