LLM-based event abstraction and integration for IoT-sourced logs

Mohsen Shirali, Mohammadreza Fani Sani, Zahra Ahmadi, Estefania Serral

TL;DR

This paper shows that Large Language Models can automate the preprocessing of IoT sensor streams into Process Mining-ready event logs by abstracting low-level sensor changes into high-level activities and integrating multi-source logs. Using a GPT-4–based framework with few-shot prompts, the approach handles binary sensor states and cross-source alignment, producing PM-friendly event logs. Evaluation on an ambient-assisted-living dataset demonstrates competitive activity-label accuracy across modalities and reasonable conformance to ground-truth logs, with self-loops influencing alignment metrics. The work highlights significant potential for automated, real-time, multi-source event log generation while acknowledging the need for human oversight to ensure reliability in critical applications.

Abstract

The continuous flow of data collected by Internet of Things (IoT) devices has revolutionised our ability to understand and interact with the world across various applications. However, this data must be prepared and transformed into event data before analysis can begin. In this paper, we shed light on the potential of leveraging Large Language Models (LLMs) in event abstraction and integration. Our approach aims to create event records from raw sensor readings and merge the logs from multiple IoT sources into a single event log suitable for further Process Mining applications. We demonstrate the capabilities of LLMs in event abstraction through a case study of an IoT application in elderly care and longitudinal health monitoring. The results show an average accuracy of 90% in detecting high-level activities. These results highlight LLMs' promising potential in addressing event abstraction and integration challenges, effectively bridging the existing gap.
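
To make the integration step concrete, the following is a minimal sketch of merging per-source logs into a single chronologically ordered event log. It is not the paper's LLM-based method: it only illustrates the target output shape, using a plain timestamp merge. The field names (case_id, activity, timestamp, source), the activity labels, and the "wearable" source are all hypothetical and chosen for illustration.

```python
from datetime import datetime
from heapq import merge

# Hypothetical per-source event logs, each already sorted by timestamp.
# All field names and activity labels below are illustrative assumptions.
ambient_log = [
    {"case_id": "day-1", "activity": "enter_kitchen",
     "timestamp": datetime(2024, 1, 1, 8, 0), "source": "ambient"},
    {"case_id": "day-1", "activity": "leave_kitchen",
     "timestamp": datetime(2024, 1, 1, 8, 30), "source": "ambient"},
]
wearable_log = [
    {"case_id": "day-1", "activity": "walking",
     "timestamp": datetime(2024, 1, 1, 8, 10), "source": "wearable"},
]


def integrate_logs(*logs):
    """Merge timestamp-sorted logs into one chronologically ordered event log."""
    return list(merge(*logs, key=lambda event: event["timestamp"]))


event_log = integrate_logs(ambient_log, wearable_log)
for event in event_log:
    print(event["timestamp"].isoformat(), event["source"], event["activity"])
```

A plain merge like this assumes the sources share a synchronised clock and a common case identifier; aligning sources where those assumptions fail is the harder problem that the paper proposes to address with an LLM.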

Paper Structure (12 sections, 2 figures, 3 tables)

This paper contains 12 sections, 2 figures, 3 tables.

Figures (2)

  • Figure 1: The house floor plan, location and types of ambient sensors
  • Figure 2: A view of (a) sensor logs and (b) resulted event log (the ground truth version) for the experimented IoT dataset.