DiffCom: Channel Received Signal is a Natural Condition to Guide Diffusion Posterior Sampling

Sixian Wang, Jincheng Dai, Kailin Tan, Xiaoqi Qin, Kai Niu, Ping Zhang

TL;DR

This work tackles the perceptual realism and robustness gaps in traditional rate-distortion-optimized end-to-end transmission by introducing DiffCom, which uses the channel-received signal as a fine-grained condition to guide diffusion posterior sampling from pre-trained generative priors. By combining unconditional diffusion models with a likelihood-driven posterior, DiffCom preserves the data distribution while ensuring consistency with the noisy channel, yielding realistic reconstructions even under unseen degradations. The authors further enhance practicality with HiFi-DiffCom for sampling efficiency and Blind-DiffCom for pilot-free scenarios, achieving superior realism (FID) and robust performance across diverse wireless conditions. Together, these methods demonstrate a scalable, robust framework for generative communications that generalizes better than deterministic decoders and opens pathways for perceptual optimization in wireless systems.

Abstract

End-to-end visual communication systems typically optimize a trade-off between channel bandwidth cost and signal-level distortion metrics. Under challenging physical conditions, however, this traditional coding and transmission paradigm often produces unrealistic reconstructions with perceptible blurring and aliasing artifacts, even when perceptual or adversarial losses are included during optimization. The issue stems primarily from the receiver's limited knowledge of the underlying data manifold and from the use of deterministic decoding mechanisms. To address these limitations, this paper introduces DiffCom, a novel end-to-end generative communication paradigm that uses off-the-shelf generative priors and probabilistic diffusion models for decoding, thereby improving perceptual quality without relying heavily on bandwidth cost or received-signal quality. Unlike traditional systems that rely on deterministic decoders optimized solely for distortion metrics, DiffCom uses the raw channel-received signal as a fine-grained condition to guide stochastic posterior sampling. A novel confirming constraint keeps reconstructions on the manifold of real data, enhancing the robustness and reliability of the generated outcomes. Furthermore, DiffCom incorporates a blind posterior sampling technique for scenarios in which the forward transmission characteristics are unknown. Extensive experiments demonstrate that DiffCom not only produces realistic reconstructions with details faithful to the original data but also achieves superior robustness against diverse wireless transmission degradations. Collectively, these advances establish DiffCom as a new benchmark for designing generative communication systems with enhanced robustness and generalization.
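To make the idea of guiding posterior sampling with the received signal concrete, here is a minimal toy sketch in NumPy. It is not the authors' implementation. It assumes a standard-normal prior (so the prior score is known in closed form and no trained diffusion model is needed), a hypothetical linear operator `A` standing in for the whole encoder-plus-channel forward model, and a fixed guidance step `zeta` that must be small relative to the largest singular value of `A`. The function name `guided_reverse_sampling` and all parameters are illustrative.

```python
import numpy as np

def guided_reverse_sampling(y, A, num_steps=200, zeta=0.5, seed=0):
    """Toy measurement-guided ancestral sampling (assumed setup, not DiffCom itself).

    y : received signal, assumed to follow y = A @ x0 + noise
    A : hypothetical linear forward operator (stand-in for encoder + channel)
    """
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    betas = np.linspace(1e-4, 0.05, num_steps)   # assumed noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(d)                   # start from pure noise x_T
    for t in range(num_steps - 1, -1, -1):
        a_t, ab_t = alphas[t], alpha_bars[t]
        # Assumption: x0 ~ N(0, I), so every diffused marginal is N(0, I)
        # and the prior score is simply -x_t.
        score = -x
        # Tweedie estimate of the clean signal given x_t.
        x0_hat = (x + (1.0 - ab_t) * score) / np.sqrt(ab_t)
        # Gradient of ||y - A x0_hat||^2 with respect to x_t. With this prior
        # x0_hat = sqrt(ab_t) * x_t, so the Jacobian is sqrt(ab_t) * I.
        residual = y - A @ x0_hat
        grad = -2.0 * np.sqrt(ab_t) * (A.T @ residual)
        # Unconditional ancestral (DDPM-style) step driven by the prior score.
        mean = (x + betas[t] * score) / np.sqrt(a_t)
        noise = rng.standard_normal(d) if t > 0 else np.zeros(d)
        x = mean + np.sqrt(betas[t]) * noise
        # Guidance: pull the sample toward consistency with the received signal.
        x = x - zeta * grad
    return x

if __name__ == "__main__":
    A = np.eye(4)[:2]             # observe only the first two coordinates
    y = np.array([2.0, -1.5])     # hypothetical received signal
    x_guided = guided_reverse_sampling(y, A, zeta=0.5)
    x_prior = guided_reverse_sampling(y, A, zeta=0.0)
    print("guided residual:  ", np.linalg.norm(A @ x_guided - y))
    print("unguided residual:", np.linalg.norm(A @ x_prior - y))
```

In this toy setting the observed coordinates are driven toward the received signal while the unobserved ones remain draws from the prior, which is the sense in which decoding is stochastic yet conditioned on the channel output.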

Paper Structure

This paper contains 30 sections, 26 equations, 16 figures, 7 tables, and 3 algorithms.

Figures (16)

  • Figure 1: Overview of the DiffCom system architecture and an illustration of three decoding paradigms. (1) RD-optimized deterministic decoding produces overly smooth reconstructions because it averages potential solutions pixel-wise in the source space; (2) RDP-optimized deterministic decoding pushes reconstructions toward regions of the search space that are likely to contain photo-realistic images, aligning them more closely with the natural image manifold; (3) Our RDP-optimized stochastic decoding employs diffusion posterior sampling to generate a diverse array of solutions. The channel-received signal serves as a fine-grained condition to guide the sampling process, consistently yielding high-fidelity outcomes. This paradigm enhances both the robustness and the generalization capability of the decoding process.
  • Figure 2: Probabilistic graph of standard DiffCom. Solid line: tractable, dotted line: intractable in general.
  • Figure 3: Visual results demonstrating the effect of the hyperparameter $\zeta$, with all schemes evaluated under an AWGN channel at $\text{CSNR} = 0\,\text{dB}$.
  • Figure 4: Probabilistic graph of HiFi-DiffCom.
  • Figure 5: Probabilistic graph of Blind-DiffCom.
  • ...and 11 more figures