Learning Semantic Association Rules from Internet of Things Data
Erkan Karabulut, Paul Groth, Victoria Degeler
TL;DR
The paper tackles learning rules from IoT data by integrating static knowledge graphs with dynamic sensor streams to generate semantic association rules. It introduces Aerial, a neurosymbolic ARM method that extracts rules from a neural representation learned by an under-complete denoising autoencoder, achieving full data coverage with a concise rule set. Across three IoT datasets in water-network and HVAC domains, semantics-enhanced ARM improves rule support and data coverage, while Aerial outperforms state-of-the-art baselines in both rule quality (e.g., higher Zhang's metric) and conciseness. This work advances scalable, explainable IoT rule mining and is compatible with existing ARM variants, enabling broader applicability and downstream tasks such as leakage detection and fault diagnosis.
Abstract
Association Rule Mining (ARM) is the task of discovering commonalities in data in the form of logical implications. ARM is used in the Internet of Things (IoT) for different tasks including monitoring and decision-making. However, existing methods give limited consideration to IoT-specific requirements such as heterogeneity and volume. Furthermore, they do not utilize important static domain-specific description data about IoT systems, which is increasingly represented as knowledge graphs. In this paper, we propose a novel ARM pipeline for IoT data that utilizes both dynamic sensor data and static IoT system metadata. Furthermore, we propose an Autoencoder-based Neurosymbolic ARM method (Aerial) as part of the pipeline to address the high volume of IoT data and to reduce the total number of rules, which are resource-intensive to process. Aerial learns a neural representation of a given dataset and extracts association rules from this representation by exploiting the reconstruction (decoding) mechanism of an autoencoder. Extensive evaluations on 3 IoT datasets from 2 domains show that ARM on both static and dynamic IoT data results in more generally applicable rules, while Aerial can learn a more concise set of high-quality association rules than the state-of-the-art with full coverage over the datasets.
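
To make the rule-quality measures referred to above concrete, the following Python sketch computes support, confidence, and Zhang's metric for a single association rule over a toy set of transactions. It uses only the standard textbook definitions of these measures and is not the Aerial method itself. The item names (e.g. `sensor=pressure`, `pipe=PVC`) and the transactions are hypothetical illustrations that combine a static metadata item with a discretized sensor value; they are not taken from the paper's datasets.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]

# Hypothetical toy transactions (assumption, not data from the paper).
# Each one mixes static metadata items with a discretized sensor reading.
TRANSACTIONS: List[Itemset] = [
    frozenset({"sensor=pressure", "pipe=PVC", "value=high"}),
    frozenset({"sensor=pressure", "pipe=PVC", "value=high"}),
    frozenset({"sensor=pressure", "pipe=iron", "value=low"}),
    frozenset({"sensor=flow", "pipe=PVC", "value=low"}),
]


def support(transactions: List[Itemset], itemset: Itemset) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: List[Itemset],
               antecedent: Itemset, consequent: Itemset) -> float:
    """Estimate of P(consequent | antecedent) for the rule A -> C."""
    s_a = support(transactions, antecedent)
    if s_a == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / s_a


def zhangs_metric(transactions: List[Itemset],
                  antecedent: Itemset, consequent: Itemset) -> float:
    """Zhang's metric in [-1, 1]; positive values indicate association."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, antecedent | consequent)
    numerator = s_ac - s_a * s_c
    denominator = max(s_ac * (1 - s_a), s_a * (s_c - s_ac))
    # Convention chosen here: return 0.0 when the metric is undefined.
    return 0.0 if denominator == 0 else numerator / denominator


# Hypothetical rule: {sensor=pressure, pipe=PVC} -> {value=high}
ANTECEDENT: Itemset = frozenset({"sensor=pressure", "pipe=PVC"})
CONSEQUENT: Itemset = frozenset({"value=high"})

if __name__ == "__main__":
    print("support:   ", support(TRANSACTIONS, ANTECEDENT | CONSEQUENT))
    print("confidence:", confidence(TRANSACTIONS, ANTECEDENT, CONSEQUENT))
    print("zhang:     ", zhangs_metric(TRANSACTIONS, ANTECEDENT, CONSEQUENT))
```

In this toy example the antecedent includes a static item (`pipe=PVC`) alongside a sensor-derived one, which loosely mirrors the idea of mining rules over both static metadata and dynamic sensor data; how the actual pipeline constructs such transactions is not specified in this section.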
