3D Dynamic Radio Map Prediction Using Vision Transformers for Low-Altitude Wireless Networks

Nguyen Duc Minh Quang, Chang Liu, Huy-Trung Nguyen, Shuangyang Li, Derrick Wing Kwan Ng, Wei Xiang

TL;DR

This work tackles real-time prediction of time-varying 3D radio maps in low-altitude wireless networks by introducing 3D-DRM, a Transformer-based framework that combines a Vision Transformer for spatial encoding with a temporal Transformer for dynamics. It jointly reconstructs and forecasts the 3D RM from sparse UAV measurements, enabling proactive power control and resource allocation. The method employs a Power Measurement Encoder and a cross-attentive Radio Map Decoder, trained with voxel-level fidelity and temporal-smoothness losses so that predictions are accurate at each voxel and coherent over time. Experiments demonstrate that 3D-DRM outperforms ConvLSTM and RadioUNet in both RMSE and temporal consistency, highlighting its potential for real-time, scalable network optimization in UAV-enabled LAWN scenarios.
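The summary above names two training terms, voxel-level fidelity and temporal smoothness, without giving their form. The following is a minimal sketch of one plausible combination, not the authors' loss: it assumes a mean-squared error per voxel plus a penalty on the mismatch between frame-to-frame changes of the prediction and the ground truth. The function name `drm_loss`, the weight `lam`, and the `(T, X, Y, Z)` array layout are all hypothetical.

```python
import numpy as np

def drm_loss(pred, target, lam=0.1):
    """Hypothetical combined loss for a sequence of 3D radio maps.

    pred, target: arrays of shape (T, X, Y, Z) with T >= 2,
    holding received power per voxel at each time step.
    lam: assumed weight of the temporal term (not from the paper).
    """
    # Voxel-level fidelity: mean squared error over all voxels and time steps.
    fidelity = np.mean((pred - target) ** 2)
    # Temporal term: compare frame-to-frame changes of prediction and target,
    # so that the predicted dynamics follow the true dynamics.
    d_pred = np.diff(pred, axis=0)
    d_target = np.diff(target, axis=0)
    temporal = np.mean((d_pred - d_target) ** 2)
    return fidelity + lam * temporal, fidelity, temporal

# Toy check: a constant offset is penalized by the fidelity term only,
# because it does not change the frame-to-frame differences.
rng = np.random.default_rng(0)
target = rng.normal(size=(4, 8, 8, 4))
pred = target + 0.5
total, fidelity, temporal = drm_loss(pred, target)
```

In this toy case the fidelity term equals the squared offset (0.25) and the temporal term vanishes, which illustrates why the two terms are complementary: one measures per-voxel accuracy, the other measures whether the predicted evolution matches the true one.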

Abstract

Low-altitude wireless networks (LAWN) are rapidly expanding with the growing deployment of unmanned aerial vehicles (UAVs) for logistics, surveillance, and emergency response. Ensuring reliable connectivity remains a critical yet challenging task due to three-dimensional (3D) mobility, time-varying user density, and limited power budgets. The transmit power of base stations (BSs) fluctuates dynamically according to user locations and traffic demands, leading to a highly non-stationary 3D radio environment. Radio maps (RMs) have emerged as an effective means to characterize spatial power distributions and support radio-aware network optimization. However, most existing works construct static or offline RMs, overlooking real-time power variations and spatio-temporal dependencies in multi-UAV networks. To overcome this limitation, we propose a 3D dynamic radio map (3D-DRM) framework that learns and predicts the spatio-temporal evolution of received power. Specifically, a Vision Transformer (ViT) encoder extracts high-dimensional spatial representations from 3D RMs, while a Transformer-based module models sequential dependencies to predict future power distributions. Experiments show that 3D-DRM accurately captures fast-varying power dynamics and substantially outperforms baseline models in both RM reconstruction and short-term prediction.
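To make the idea of ViT-style spatial encoding of a 3D RM concrete, the sketch below splits a voxel grid into non-overlapping cubic patches and projects each patch to a token embedding. This is the standard ViT patch-embedding idea extended to three dimensions, offered as an illustration only; the paper's encoder may tokenize its input differently (for example, directly from sparse UAV measurements). The function name `patchify_3d`, the patch size `p`, the grid size, and the random projection `W` are assumptions.

```python
import numpy as np

def patchify_3d(volume, p):
    """Split an (X, Y, Z) volume into non-overlapping p x p x p patches.

    Returns an array of shape (num_patches, p**3), one flattened patch per row.
    """
    X, Y, Z = volume.shape
    assert X % p == 0 and Y % p == 0 and Z % p == 0, "grid must be divisible by p"
    # Split each axis into (number of blocks, p), bring the block axes to the
    # front, then flatten every p x p x p block into a single vector.
    v = volume.reshape(X // p, p, Y // p, p, Z // p, p)
    v = v.transpose(0, 2, 4, 1, 3, 5)
    return v.reshape(-1, p ** 3)

rng = np.random.default_rng(0)
rm = rng.normal(size=(16, 16, 8))        # toy 3D radio map (assumed grid size)
tokens = patchify_3d(rm, p=4)            # 4 * 4 * 2 = 32 patches of 64 voxels
W = rng.normal(size=(64, 32)) * 0.02     # hypothetical linear patch projection
embeddings = tokens @ W                  # one 32-dimensional token per patch
```

Each row of `embeddings` would then be fed, together with a positional encoding, to the Transformer blocks of a spatial encoder.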

Paper Structure

This paper contains 17 sections, 16 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: An illustration of the cellular-connected low-altitude wireless network.
  • Figure 2: Overall architecture of the proposed 3D-DRM framework. (a) Transformer block used in both the encoder and decoder to model token-wise attention; (b) The encoder embeds irregular UAV measurements into latent features through spatial attention, while the temporal module captures their dynamic evolution over time; (c) The decoder applies cross-attention between voxel queries and latent embeddings to reconstruct the complete 3D RM.
  • Figure 3: Quantitative comparison of model performance: (a) RMSE versus number of UAVs, (b) temporal gradient error versus number of UAVs, and (c) overall spatial and temporal performance comparison among ConvLSTM, RadioUNet, and the proposed 3D-DRM.
  • Figure 4: Visualization of ground-truth and predicted 3D RMs.
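
The decoder step described in the Figure 2(c) caption, cross-attention between voxel queries and latent embeddings, can be sketched as follows. This is a single-head, NumPy-only illustration under stated assumptions rather than the paper's implementation: the voxel queries are taken to be raw (x, y, z) grid coordinates, all weight matrices are random stand-ins for learned parameters, and the names `cross_attention`, `Wq`, `Wk`, `Wv`, and `w_out` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, latents, Wq, Wk, Wv):
    """Single-head cross-attention: voxel queries attend to latent embeddings."""
    Q = queries @ Wq                      # (N_vox, d)
    K = latents @ Wk                      # (N_lat, d)
    V = latents @ Wv                      # (N_lat, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    attn = softmax(scores, axis=-1)       # each voxel's weights over the latents
    return attn @ V, attn

rng = np.random.default_rng(0)
d = 16
# Voxel queries: (x, y, z) coordinates of a 4 x 4 x 2 grid (an assumption).
coords = np.stack(
    np.meshgrid(np.arange(4), np.arange(4), np.arange(2), indexing="ij"),
    axis=-1,
).reshape(-1, 3).astype(float)
latents = rng.normal(size=(5, d))         # e.g. one embedding per UAV measurement
Wq = rng.normal(size=(3, d))
Wk = rng.normal(size=(d, d))
Wv = rng.normal(size=(d, d))
w_out = rng.normal(size=(d,))             # hypothetical per-voxel regression head

features, attn = cross_attention(coords, latents, Wq, Wk, Wv)
power = (features @ w_out).reshape(4, 4, 2)   # dense 3D map, one value per voxel
```

Because the queries are defined on the full voxel grid while the keys and values come from a small set of latent embeddings, the output is a dense 3D map even when the input measurements are sparse and irregularly placed.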