Towards Detecting IoT Event Spoofing Attacks Using Time-Series Classification
Uzma Maroof, Gustavo Batista, Arash Shaghaghi, Sanjay Jha
TL;DR
This work tackles IoT event spoofing in home automation by leveraging time-series classification in a dissimilarity-space representation built on Dynamic Time Warping. By mapping multivariate sensor data to a fixed-dimensional space using distances to carefully chosen prototypes, the approach preserves temporal structure while requiring far smaller training sets than traditional statistical-feature methods. Evaluated on the real-world PEEVES dataset, the method achieves comparable detection performance with up to 500x less labeled data and attains 100% accuracy for at least one event in end-to-end testing, highlighting practical viability for IoT deployments. The study also explores extensive model selection and downsampling strategies, and the authors release their system publicly to encourage further research and real-world experimentation.
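To make the dissimilarity-space idea concrete, the following is a minimal sketch and not the authors' released system. It assumes each event is a multivariate series of shape (time steps, sensor channels), uses a plain unconstrained Dynamic Time Warping distance with Euclidean local cost, and takes the prototypes as given. The function names `dtw_distance` and `dissimilarity_embed` are hypothetical, and the toy data is invented purely for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Unconstrained DTW distance between two series of shape (time, channels)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Treat univariate input as a single-channel multivariate series.
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    # Local cost: Euclidean distance between every pair of time steps.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Accumulated cost with an extra border row/column set to infinity.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j],      # step in a only
                acc[i, j - 1],      # step in b only
                acc[i - 1, j - 1],  # step in both
            )
    return acc[n, m]


def dissimilarity_embed(series_list, prototypes):
    """Map each series to a fixed-length vector of DTW distances to the prototypes."""
    return np.array(
        [[dtw_distance(s, p) for p in prototypes] for s in series_list]
    )


# Toy example (invented data): two prototypes, three events of varying length,
# each with two sensor channels.
prototypes = [
    np.zeros((4, 2)),
    np.ones((5, 2)),
]
events = [
    np.zeros((6, 2)),
    np.ones((3, 2)),
    np.full((4, 2), 0.5),
]
embedded = dissimilarity_embed(events, prototypes)  # shape: (3 events, 2 prototypes)
```

Each row of `embedded` has one entry per prototype regardless of the original series length, so any standard fixed-input classifier can be trained on it. How the prototypes are chosen and which classifier is used are left open here.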
Abstract
Internet of Things (IoT) devices have grown in popularity because they can interact directly with the physical world. Home automation systems automate these interactions. IoT events are crucial to these systems' decision-making but are often unreliable, and security vulnerabilities allow attackers to impersonate events. Prior work has used statistical machine learning on IoT event fingerprints from deployed sensors to detect spoofed events. However, the multivariate temporal data from these sensors has structural and temporal properties that statistical machine learning cannot learn. The accuracy of these schemes also depends on the size of the knowledge base: the larger it is, the more accurate they are. In the nascent field of IoT, the lack of large datasets with enough samples of each event can therefore be a bottleneck. In this work, we apply advanced machine learning, specifically time-series classification, to detect event-spoofing attacks. The temporal nature of sensor data lets us discover important patterns from fewer events. Our rigorous investigation of a publicly available real-world dataset indicates that our time-series-based technique learns temporal features from sensor data faster than earlier work, even with a training sample 100 or 500 times smaller, making it a realistic solution for IoT.
