Radio Foundation Models: Pre-training Transformers for 5G-based Indoor Localization

Jonathan Ott, Jonas Pirkl, Maximilian Stahlke, Tobias Feigl, Christopher Mutschler

TL;DR

The paper addresses the data bottleneck in indoor 5G radio fingerprinting for localization by introducing a self-supervised framework that pre-trains a Transformer on unlabeled channel impulse responses (CIRs). It proposes a novel pretext task that masks CIR components and trains the model to reconstruct them, so the model learns environment-specific representations without reference data. After pre-training, light fine-tuning on a small set of labeled CIRs yields state-of-the-art localization accuracy with an order of magnitude less labeled data, demonstrated on two real-world 5G datasets and a synthetic line-of-sight (LoS) dataset. The results suggest that this approach can serve as a foundation model for radio fingerprinting, offering cost-effective, robust indoor localization and laying the groundwork for future extensions to dynamic environments and other radio systems.

Abstract

Artificial Intelligence (AI)-based radio fingerprinting (FP) outperforms classic localization methods in propagation environments with strong multipath effects. However, model and data orchestration for FP is time-consuming and costly, as it requires many reference positions and extensive measurement campaigns for each environment. In contrast, modern unsupervised and self-supervised learning schemes require less reference data for localization, but either their accuracy is low or they require additional sensor information, rendering them impractical. In this paper, we propose a self-supervised learning framework that pre-trains a general transformer (TF) neural network on 5G channel measurements that we collect on-the-fly without expensive equipment. Our novel pretext task randomly masks and drops input information and trains the model to reconstruct it. The model thus implicitly learns the spatiotemporal patterns and information of the propagation environment that enable FP-based localization. Most interestingly, when we optimize this pre-trained model for localization in a given environment, it achieves the accuracy of state-of-the-art methods but requires ten times less reference data and significantly reduces the time from training to operation.
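
The masking step of the pretext task can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mask_cir` and `masked_mse`, the 30% mask ratio, the zero placeholder for hidden taps, and the choice to score only the masked taps are all assumptions made for illustration, and the "drop" part of the task is left out. A fingerprint is modeled as a list of CIRs, each a list of complex taps.

```python
import random


def mask_cir(fingerprint, mask_ratio=0.3, rng=None):
    """Hide a random subset of CIR taps (hypothetical helper).

    fingerprint: list of CIRs, each a list of complex taps.
    Returns (masked_fingerprint, mask); mask[a][t] is True for hidden taps.
    """
    rng = rng or random.Random()
    masked, mask = [], []
    for cir in fingerprint:
        row_mask = [rng.random() < mask_ratio for _ in cir]
        # Assumption: hidden taps are replaced by zero.
        masked.append([0j if hidden else tap for tap, hidden in zip(cir, row_mask)])
        mask.append(row_mask)
    return masked, mask


def masked_mse(reconstruction, target, mask):
    """Mean squared error over the hidden taps only (assumed loss)."""
    errors = [
        abs(rec - tgt) ** 2
        for rec_row, tgt_row, mask_row in zip(reconstruction, target, mask)
        for rec, tgt, hidden in zip(rec_row, tgt_row, mask_row)
        if hidden
    ]
    return sum(errors) / len(errors) if errors else 0.0


if __name__ == "__main__":
    # Toy fingerprint: 3 antennas, 8 taps each (values are arbitrary).
    toy = [[complex(a + 1, t) for t in range(8)] for a in range(3)]
    masked, mask = mask_cir(toy, mask_ratio=0.3, rng=random.Random(0))
    # A model would map `masked` to a reconstruction; here the untouched
    # masked input stands in for an untrained model's output.
    print(masked_mse(masked, toy, mask))
```

During pre-training, a model would receive the masked fingerprint and be optimized to drive this loss down, which needs no position labels.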

Paper Structure (24 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: Two-step training pipeline: pre-training on CIRs without references (left) and fine-tuning on CIRs with references (right).
  • Figure 2: Schematic overview of our TF pre-training method.
  • Figure 3: CIR magnitude of a fingerprint with $N_{an}$ CIRs. Input CIR (bottom) and reconstruction (top; red: masked parts).
  • Figure 4: Localization error (CE90) of all methods w.r.t. the number of reference measurements we use for fine-tuning. All methods are trained and tested on data of the same scenario, except TF-C-PT, which we pre-train on data of one scenario and test on data of the other scenario.
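
The CE90 metric in Figure 4 is not defined in this overview. The sketch below assumes it denotes the 90th percentile of the per-sample Euclidean localization error, computed here with the nearest-rank method; the function name `ce90` and the percentile convention are assumptions, not taken from the paper.

```python
import math


def ce90(predicted, reference, quantile=0.9):
    """Nearest-rank percentile of Euclidean position errors (assumed CE90)."""
    errors = sorted(math.dist(p, r) for p, r in zip(predicted, reference))
    if not errors:
        raise ValueError("need at least one position pair")
    # Nearest-rank: smallest error such that at least `quantile` of all
    # samples have an error less than or equal to it.
    rank = math.ceil(quantile * len(errors))
    return errors[rank - 1]


if __name__ == "__main__":
    # Toy data: ten estimates whose errors are 1 m, 2 m, ..., 10 m.
    reference = [(0.0, 0.0)] * 10
    predicted = [(float(i), 0.0) for i in range(1, 11)]
    print(ce90(predicted, reference))
```

Under this reading, a lower CE90 means that 90% of the position estimates fall within a smaller distance of the true position.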