Robust Semantic Transmission for Low-Altitude UAVs: Predictive Channel-Aware Scheduling and Generative Reconstruction
Jijia Tian, Junting Chen, Pooi-Yuen Kam
TL;DR
The paper addresses robust semantic downlink transmission for low-altitude UAVs under tight bandwidth budgets and highly time-varying air-to-ground (A2G) channels. It introduces a predictive semantic framework that uses a Structure-Texture Variational Autoencoder (ST-VAE) to decouple content into a deterministic structural component and a stochastic texture component. A channel-aware scheduler prioritizes structure transmission, and a receiver-side conditional generative prior synthesizes texture when blocks are missing. The key contributions are (i) a trajectory-informed SNR predictor that feeds a per-slot budgeting and block-scheduling algorithm, (ii) explicit semantic encoding with a structure-first policy and texture completion via a conditional prior, and (iii) gains of up to 5.6 dB in PSNR, together with robustness to prediction mismatch, in simulations using COCO data and a realistic A2G channel model. Collectively, the approach mitigates the cliff effects inherent in coupled DeepJSCC, improving perceptual and structural fidelity in challenging UAV downlink scenarios with strict bandwidth budgets.
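The structure-first policy can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `structure_first_schedule`, the fixed per-slot capacity, and the block labels are all assumptions made for illustration, and the paper's per-slot budgeting may well allocate a different budget to each slot. The sketch only shows the ordering idea, namely that structure blocks claim the slots with the highest predicted SNR before any texture block is placed.

```python
from typing import Dict, List


def structure_first_schedule(
    predicted_snr_db: List[float],
    slot_capacity: int,
    n_structure: int,
    n_texture: int,
) -> Dict[int, List[str]]:
    """Greedy sketch: fill the most reliable slots first, structure before texture.

    All names and the fixed `slot_capacity` are hypothetical, not from the paper.
    """
    n_slots = len(predicted_snr_db)
    # Rank slot indices from highest to lowest predicted SNR.
    ranked = sorted(range(n_slots), key=lambda t: predicted_snr_db[t], reverse=True)
    # Structure blocks (S*) are queued ahead of texture blocks (T*).
    queue = [f"S{i}" for i in range(n_structure)] + [f"T{i}" for i in range(n_texture)]

    plan: Dict[int, List[str]] = {t: [] for t in range(n_slots)}
    pos = 0
    for t in ranked:
        take = queue[pos:pos + slot_capacity]
        plan[t] = take
        pos += len(take)
    # Blocks beyond the total capacity stay unscheduled; in the framework
    # described above, missing texture would be synthesized at the receiver.
    return plan


plan = structure_first_schedule([3.0, 12.0, -1.0, 8.0], slot_capacity=2,
                                n_structure=3, n_texture=4)
```

With these example inputs, the three structure blocks land in slots 1 and 3 (the two highest predicted SNRs), and texture blocks fill the remaining capacity.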
Abstract
Unmanned aerial vehicle (UAV) downlink transmission supports critical time-sensitive visual applications but is fundamentally constrained by bandwidth scarcity and dynamic channel impairments. The rapid fluctuation of the air-to-ground (A2G) link creates a regime where reliable transmission slots are intermittent and future channel quality can only be predicted with uncertainty. Conventional deep joint source-channel coding (DeepJSCC) methods transmit coupled feature streams, causing global reconstruction failure when specific time slots experience deep fading. Decoupling semantic content into a deterministic structure component and a stochastic texture component enables differentiated error protection strategies aligned with channel reliability. A predictive transmission framework is developed that uses a split-stream variational codec and a channel-aware scheduler to prioritize the delivery of structural layout over reliable slots. Experimental evaluations indicate that this approach achieves a 5.6 dB gain in peak signal-to-noise ratio (PSNR) over single-stream baselines and maintains structural fidelity under significant prediction mismatch.
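To put the reported gain in perspective, the short sketch below uses the standard PSNR definition, 10·log10(peak² / MSE), to convert a dB gain into the equivalent reduction in mean squared error. The helper names and the 8-bit peak value of 255 are assumptions for illustration; the paper's exact evaluation setup is not stated here.

```python
import math


def psnr_db(mse: float, peak: float = 255.0) -> float:
    """Standard PSNR in dB for a given mean squared error (8-bit peak assumed)."""
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which MSE must shrink to raise PSNR by `gain_db`."""
    return 10.0 ** (gain_db / 10.0)


ratio = mse_ratio_for_gain(5.6)  # roughly 3.63
```

Under this definition, a 5.6 dB PSNR gain corresponds to reducing the mean squared error by a factor of roughly 3.6, independent of the peak value.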
