Vision Transformers for Efficient Indoor Pathloss Radio Map Prediction

Rafayel Mkrtchyan, Edvard Ghukasyan, Khoren Petrosyan, Hrant Khachatrian, Theofanis P. Raptis

TL;DR

We address indoor pathloss map prediction by using a pretrained vision transformer (DINOv2) to encode floor-map features and predict high-resolution radio maps. The approach combines a ViT encoder with a convolutional neck and a UPerNet decoder, and relies on extensive data augmentation and targeted feature engineering (e.g., wall obstructions) to combat data scarcity. Across multiple generalization scenarios (unseen buildings, frequencies, and antenna patterns), the method is robust and performs competitively against indoor-pathloss baselines, which supports its practical viability for indoor wireless planning. The work also analyzes distribution shifts and provides guidance for scalable, data-efficient indoor radio-map prediction in real-world deployments.

Abstract

Indoor pathloss prediction is a fundamental task in wireless network planning, yet it remains challenging due to environmental complexity and data scarcity. In this work, we propose a deep learning-based approach that uses a vision transformer (ViT) architecture with DINOv2 pretrained weights to model indoor radio propagation. Our method processes a floor map, together with additional wall features, to generate indoor pathloss maps. We systematically evaluate the effects of architectural choices, data augmentation strategies, and feature engineering techniques. Our findings indicate that extensive augmentation significantly improves generalization, while feature engineering is crucial in low-data regimes. Through comprehensive experiments, we demonstrate the robustness of our model across different generalization scenarios.
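
To make the augmentation idea concrete, here is a minimal sketch of one property any geometric augmentation for this task needs: the same transform must be applied to every input channel and to the target pathloss map, so that walls, antenna pattern, and ground truth stay aligned. The specific transforms (90-degree rotations and horizontal flips) and the helper names `rot90`, `hflip`, and `augment` are illustrative assumptions, not the augmentation set reported by the authors.

```python
import random


def rot90(grid):
    """Rotate a 2D list 90 degrees counter-clockwise (works for non-square grids)."""
    # zip(*grid) transposes; reversing the row order of the transpose yields a CCW rotation.
    return [list(row) for row in zip(*grid)][::-1]


def hflip(grid):
    """Mirror a 2D list left-to-right."""
    return [row[::-1] for row in grid]


def augment(channels, target, rng):
    """Apply ONE randomly chosen rotation/flip identically to all channels and the target.

    Hypothetical helper: the transform choice is an assumption for illustration.
    """
    k = rng.randrange(4)          # number of 90-degree rotations (0..3)
    flip = rng.random() < 0.5     # whether to mirror afterwards

    def transform(grid):
        for _ in range(k):
            grid = rot90(grid)
        return hflip(grid) if flip else grid

    return [transform(c) for c in channels], transform(target)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy 2x3 "channels" and a toy target standing in for real maps.
    channels = [[[1, 2, 3], [4, 5, 6]], [[0, 0, 1], [1, 0, 0]]]
    target = [[10, 20, 30], [40, 50, 60]]
    aug_channels, aug_target = augment(channels, target, rng)
    print(aug_channels, aug_target)
```

Sampling the transform once per example, rather than once per channel, is the design choice that matters here; the grids are plain nested lists only to keep the sketch dependency-free.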

Paper Structure

This paper contains 21 sections, 5 figures, 7 tables.

Figures (5)

  • Figure S1: An example of the three given input channels (a–c), the radiation pattern channel that we created (d), and the target (e).
  • Figure S2: An example of the input channels (a–d) and the target (e) for a training sample after applying all the augmentations, normalization, and padding.
  • Figure S3: Model architecture.
  • Figure S4: Input examples from the challenge test set (a), from our test set (b), and from the crops that we generated (c).
  • Figure S5: Wall density vs the average value of transmittance and reflectance on the walls. The figure indicates the distribution shift from our dataset to the challenge test set and how we tried to address it with manual crops.