Exploring the Feasibility of Automated Data Standardization using Large Language Models for Seamless Positioning

Max J. L. Lee, Ju Lin, Li-Ta Hsu

TL;DR

This work investigates the feasibility of real-time automated data standardization for seamless IoT positioning. Fine-tuned LLMs standardize heterogeneous sensor data (from smartphones, IoT devices, and UWB), and EKF-based sensor fusion then improves localization accuracy. The work introduces the Intelligent Data Standardization Module (IDSM) and the Transformation Rule Generation Module (TRGM) to automate data normalization and rule/script generation; the approach is supported by training on 100 complex examples and by extensive unit tests. The experimental results show that fusing GNSS, UWB, VPS, and IMU data via EKF yields markedly lower positioning errors (RMSE ≈ 0.35 m, MAE ≈ 0.25 m) than single-sensor baselines, which points to practical potential for scalable, precise IoT navigation. However, the approach depends on predefined schemas and covariance settings and was evaluated under controlled conditions, so robustness and dynamic adaptation in real-world deployments remain open for further work.

Abstract

We propose a feasibility study for real-time automated data standardization leveraging Large Language Models (LLMs) to enhance seamless positioning systems in IoT environments. By integrating and standardizing heterogeneous sensor data from smartphones, IoT devices, and dedicated systems such as Ultra-Wideband (UWB), our study ensures data compatibility and improves positioning accuracy using the Extended Kalman Filter (EKF). The core components include the Intelligent Data Standardization Module (IDSM), which employs a fine-tuned LLM to convert varied sensor data into a standardized format, and the Transformation Rule Generation Module (TRGM), which automates the creation of transformation rules and scripts for ongoing data standardization. Evaluated in real-time environments, our study demonstrates adaptability and scalability, enhancing operational efficiency and accuracy in seamless navigation. This study underscores the potential of advanced LLMs in overcoming sensor data integration complexities, paving the way for more scalable and precise IoT navigation solutions.
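To make the idea of a standardized format concrete, the following is a minimal Python sketch of hand-written transformation rules of the kind a rule-generation step might emit. Everything in it is an assumption for illustration: the `StandardFix` schema, the two input layouts (`phone_gnss`, `iot_tracker`), and their field names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict


@dataclass(frozen=True)
class StandardFix:
    """Hypothetical common schema: one position fix in fixed units."""
    source: str
    timestamp_s: float  # Unix time in seconds, UTC
    lat_deg: float      # latitude in decimal degrees
    lon_deg: float      # longitude in decimal degrees


def from_phone_gnss(rec: dict) -> StandardFix:
    # Assumed input: {"time_ms": int, "latitude": float, "longitude": float}
    return StandardFix(
        source="phone_gnss",
        timestamp_s=rec["time_ms"] / 1000.0,  # milliseconds -> seconds
        lat_deg=float(rec["latitude"]),
        lon_deg=float(rec["longitude"]),
    )


def from_iot_tracker(rec: dict) -> StandardFix:
    # Assumed input: {"ts": "YYYY-MM-DDTHH:MM:SSZ", "pos": "lat,lon"}
    ts = datetime.strptime(rec["ts"], "%Y-%m-%dT%H:%M:%SZ")
    ts = ts.replace(tzinfo=timezone.utc)
    lat, lon = (float(x) for x in rec["pos"].split(","))
    return StandardFix("iot_tracker", ts.timestamp(), lat, lon)


# One rule per known input layout; a new device type adds one entry.
RULES: Dict[str, Callable[[dict], StandardFix]] = {
    "phone_gnss": from_phone_gnss,
    "iot_tracker": from_iot_tracker,
}


def standardize(kind: str, rec: dict) -> StandardFix:
    """Apply the rule registered for this input layout."""
    return RULES[kind](rec)
```

Once every source is mapped to a single record type with fixed units, a downstream fusion filter can consume the fixes without per-device parsing logic.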

Paper Structure

This paper contains 18 sections, 10 equations, 6 figures, and 7 tables.

Figures (6)

  • Figure 1: Overview of the Proposed Feasibility Study for Automated Data Standardization and Sensor Fusion.
  • Figure 2: Training and Validation Loss of the Intelligent Data Standardization Module (IDSM) over Steps.
  • Figure 3: Mean Token Accuracy of the Intelligent Data Standardization Module (IDSM) over Steps.
  • Figure 4: Experiment Path and Ultra-Wideband (UWB) Setup for Data Collection.
  • Figure 5: Comparison of Positioning Results from Various Methods with Ground Truth.
  • ...and 1 more figure