LuViRA Dataset Validation and Discussion: Comparing Vision, Radio, and Audio Sensors for Indoor Localization

Ilayda Yaman; Guoda Tian; Erik Tegler; Jens Gulin; Nikhil Challa; Fredrik Tufvesson; Ove Edfors; Kalle Astrom; Steffen Malkowsky; Liang Liu

TL;DR

LuViRA introduces a synchronized, multi-sensor indoor localization dataset combining vision, radio, and audio modalities with ground-truth trajectories. The paper benchmarks state-of-the-art algorithms—ORB-SLAM3 for vision, ICC for radio, and SFS2 for audio—across grid and random trajectories to compare accuracy, reliability, calibration needs, and complexity. Results show vision delivers high accuracy in many cases, audio can outperform in moving-object scenarios, and radio is robust to low SNR yet struggles with unpredictable trajectories, motivating sensor fusion. Overall, LuViRA provides a practical roadmap for developing robust, multi-sensory localization systems in realistic indoor environments.

Abstract

We present a unique comparative analysis and evaluation of vision-, radio-, and audio-based localization algorithms. We create the first baseline for these sensors using the recently published Lund University Vision, Radio, and Audio (LuViRA) dataset, in which all sensors are synchronized and measured in the same environment. We highlight some of the challenges of using each sensor for indoor localization tasks. Each sensor is paired with a current state-of-the-art localization algorithm and evaluated on four aspects: localization accuracy, reliability and sensitivity to environment changes, calibration requirements, and potential system complexity. Specifically, the evaluation covers the ORB-SLAM3 algorithm for vision-based localization with an RGB-D camera, a machine-learning algorithm for radio-based localization with massive MIMO technology, and the SFS2 algorithm for audio-based localization with distributed microphones. The results can serve as a guideline and basis for further development of robust, high-precision multi-sensory localization systems, e.g., through sensor fusion and context- and environment-aware adaptation.
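As an illustration of the localization-accuracy aspect mentioned in the abstract, the sketch below compares an estimated trajectory with a ground-truth trajectory using per-sample Euclidean error and its root-mean-square value. This is one common way to score a trajectory, not necessarily the metric used in the paper. The function names (`position_errors`, `rmse`) and the toy coordinates are hypothetical, and the sketch assumes both trajectories are already time-aligned and expressed in the same coordinate frame.

```python
import math


def position_errors(estimated, ground_truth):
    """Per-sample Euclidean distance between time-aligned position samples."""
    if len(estimated) != len(ground_truth):
        raise ValueError("trajectories must have the same number of samples")
    return [math.dist(e, g) for e, g in zip(estimated, ground_truth)]


def rmse(errors):
    """Root-mean-square of a non-empty list of error values."""
    if not errors:
        raise ValueError("need at least one error value")
    return math.sqrt(sum(err * err for err in errors) / len(errors))


# Hypothetical toy data in metres; these are NOT values from the LuViRA dataset.
ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
estimated = [(0.0, 0.0), (1.0, 0.3), (2.4, 0.0)]

errors = position_errors(estimated, ground_truth)
print("per-sample error [m]:", errors)
print("RMSE [m]:", rmse(errors))
```

In practice, each sensor's estimates would first need to be aligned in time and transformed into the ground-truth coordinate frame before such a comparison is meaningful.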

Paper Structure (16 sections, 11 figures, 5 tables)

Introduction
Related Works
Background
Vision System
Radio System
Audio System
LuViRA Dataset
Vision System
Radio System
Audio System
Results and Discussion
Localization accuracy
Sensitivity to dynamic environment
Sensitivity to Signal-to-Noise Ratio
Calibration requirements
...and 1 more sections

Figures (11)

Figure 1: An overview of the measurement setup during the recording of the Random circle1 trajectory.
Figure 2: Selected vision-based localization algorithm pipeline.
Figure 3: Selected radio-based localization algorithm pipeline where the fully connected neural network layer 1 (FCNN1) extracts angular information and FCNN2 extracts delay information through the respective hidden layers (HL).
Figure 4: Selected audio-based localization algorithm pipeline for the RC2 trajectory.
Figure 5: Ground truth (black) and the result of the ORB-SLAM3 (blue) for Grid110 and RC1.
...and 6 more figures
