Removing the need for ground truth UWB data collection: self-supervised ranging error correction using deep reinforcement learning

Dieter Coppens, Ben Van Herbruggen, Adnan Shahid, Eli De Poorter

TL;DR

Experiments on real-world UWB measurements demonstrate comparable performance to state-of-the-art supervised methods, overcoming data dependency and lack of generalizability limitations, which makes self-supervised deep reinforcement learning a promising solution for practical and scalable UWB-ranging error correction.

Abstract

Indoor positioning using ultra-wideband (UWB) technology has gained interest due to its potential for centimeter-level accuracy. However, multipath effects and non-line-of-sight conditions cause ranging errors between anchors and tags. Existing approaches for mitigating these ranging errors rely on collecting large labeled datasets, making them impractical for real-world deployments. This paper proposes a novel self-supervised deep reinforcement learning approach that does not require labeled ground truth data. A reinforcement learning agent uses the channel impulse response as a state and predicts corrections to minimize the error between corrected and estimated ranges. The agent learns in a self-supervised manner by iteratively improving corrections that are generated by combining the predictability of trajectories with filtering and smoothing. Experiments on real-world UWB measurements demonstrate comparable performance to state-of-the-art supervised methods, overcoming data dependency and lack of generalizability limitations. This makes self-supervised deep reinforcement learning a promising solution for practical and scalable UWB-ranging error correction.
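
The self-supervised signal described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sliding median filter is an assumed stand-in for the paper's filtering and smoothing step, the function names and range values are hypothetical, and the learning agent itself (which would map a channel impulse response to a correction) is left out. The sketch only shows how a reward can be computed from the error between a corrected range and a smoothed estimate, without any ground truth.

```python
from statistics import median


def smooth_ranges(ranges, window=5):
    """Sliding median filter over a sequence of range measurements.

    Assumption: a simple stand-in for the filtering and smoothing that
    exploits the predictability of trajectories.
    """
    half = window // 2
    smoothed = []
    for i in range(len(ranges)):
        lo = max(0, i - half)
        hi = min(len(ranges), i + half + 1)
        smoothed.append(median(ranges[lo:hi]))
    return smoothed


def self_supervised_rewards(raw_ranges, corrections, window=5):
    """Reward is the negative absolute error between the corrected range
    (raw + correction) and the smoothed estimate. No ground truth is used."""
    targets = smooth_ranges(raw_ranges, window)
    return [
        -abs((r + c) - t)
        for r, c, t in zip(raw_ranges, corrections, targets)
    ]


# Hypothetical ranges in metres; index 3 carries an NLOS-like positive bias.
raw = [2.00, 2.05, 2.10, 2.65, 2.20, 2.25, 2.30]
no_correction = [0.0] * len(raw)
good_correction = [0.0, 0.0, 0.0, -0.5, 0.0, 0.0, 0.0]

r_none = self_supervised_rewards(raw, no_correction)
r_good = self_supervised_rewards(raw, good_correction)
print(sum(r_none), sum(r_good))
```

In this toy setting, a correction that pulls the biased measurement back toward the smoothed trajectory earns a higher reward than leaving it uncorrected, which is the kind of signal a reinforcement learning agent could be trained on.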

Paper Structure

This paper contains 25 sections, 19 equations, 9 figures, 5 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Conceptual illustration of the idea behind UWB ranging error correction
  • Figure 2: Illustration of the mathematical UWB localization system description
  • Figure 3: Complete overview of the proposed (adapted) DDPG algorithm for UWB error correction
  • Figure 4: Performance comparison of our proposed RL algorithm during training with uncorrected UWB ranging and a supervised CNN approach in terms of MAE. The figure shows that our proposed algorithm quickly improves the ranging performance compared to uncorrected UWB ranging, and later surpasses the supervised CNN performance.
  • Figure 5: Trajectory comparison of the original EKF (with no RL correction) with the improved trajectories during training after 300 and 500 episodes
  • ...and 4 more figures