Se-HiLo: Noise-Resilient Semantic Communication with High-and-Low Frequency Decomposition
Zhiyuan Xi, Kun Zhu, Yuanyuan Xu
TL;DR
Se-HiLo addresses unpredictable semantic noise in image transmission by replacing adversarial training with a Finite Scalar Quantization (FSQ) module and augmenting it with a transformer-based High-and-Low Frequency Decomposition (HiLo). FSQ enforces robust discrete representations, and HiLo decouples information into high- and low-frequency components mapped to separate FSQ spaces to preserve representational diversity. The approach shows improved noise resilience and reconstruction accuracy across diverse SNRs, outperforming baseline semantic communication methods and avoiding unstable training dynamics associated with adversarial strategies. This has practical implications for scalable, noise-robust semantic transmission in real-world communication systems where noise patterns are unpredictable.
Abstract
Semantic communication has emerged as a transformative paradigm in next-generation communication systems, leveraging advanced artificial intelligence (AI) models to extract and transmit semantic representations for efficient information exchange. Nevertheless, unpredictable semantic noise, such as ambiguity and distortions in transmitted representations, often undermines the reliability of the received information. Conventional approaches primarily adopt adversarial training with noise injection to mitigate the adverse effects of noise. However, such methods adapt poorly to varying noise levels and impose additional computational overhead during model training. To address these challenges, this paper proposes Noise-Resilient Semantic Communication with High-and-Low Frequency Decomposition (Se-HiLo) for image transmission. Se-HiLo incorporates a Finite Scalar Quantization (FSQ)-based noise-resilient module, which bypasses adversarial training by constraining encoded representations to predefined spaces to enhance noise resilience. While FSQ improves robustness, it compromises representational diversity. To alleviate this trade-off, we adopt a transformer-based high-and-low frequency decomposition module that decouples image representations into high- and low-frequency components and maps them into separate FSQ representation spaces to preserve representational diversity. Extensive experiments demonstrate that Se-HiLo achieves superior noise resilience and ensures accurate semantic communication across diverse noise environments.
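To make the idea of constraining representations to a predefined space concrete, the following is a minimal NumPy sketch of scalar quantization onto a fixed grid. It is an illustration under stated assumptions, not the Se-HiLo implementation: the function names `fsq_quantize` and `snap_to_grid`, the choice of five levels per channel, the `tanh` bounding, and the receiver-side snapping step are all hypothetical choices made for this example.

```python
import numpy as np

def fsq_quantize(z, levels):
    """Bound each channel with tanh, then round it onto a fixed grid.

    Assumption: every entry of `levels` is odd, so the grid is symmetric
    around zero and no half-step offset is needed.
    """
    half = (np.asarray(levels) - 1) / 2.0
    bounded = np.tanh(z) * half        # values lie in (-half, half)
    return np.round(bounded) / half    # grid points in [-1, 1]

def snap_to_grid(received, levels):
    """Hypothetical receiver step: map a noisy code to the nearest grid point."""
    half = (np.asarray(levels) - 1) / 2.0
    return np.round(np.clip(received, -1.0, 1.0) * half) / half

levels = [5, 5, 5]                     # 5 levels per channel, grid spacing 0.5
z = np.array([0.3, -1.2, 2.0])         # stand-in for an encoder output
code = fsq_quantize(z, levels)

# Perturbation smaller than half the grid spacing (0.25) in every channel.
noisy = code + np.array([0.1, -0.1, 0.05])
recovered = snap_to_grid(noisy, levels)
```

In this toy setting, any perturbation smaller than half the grid spacing is removed when the received code is snapped back to the grid, which is one intuition for why a finite, predefined representation space can tolerate noise without noise-specific training. The coarser the grid, the larger the tolerated perturbation but the fewer distinct codes are available, which mirrors the robustness versus diversity trade-off the abstract describes.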
