IoTWarden: A Deep Reinforcement Learning Based Real-time Defense System to Mitigate Trigger-action IoT Attacks
Md Morshed Alam, Israt Jahan, Weichao Wang
TL;DR
IoTWarden addresses a security gap in trigger-action IoT platforms by modeling remote injection attacks as a reinforcement learning (RL) problem and learning a real-time defense policy. The system casts the defense as a Markov Decision Process and uses a Deep Q-Network to maximize a reward that balances blocking injections against maintaining availability, guided by a specially crafted state machine and a reward structure that accounts for attack proximity. Key contributions include an LSTM-based attacker-sequence predictor, a three-component IoTWarden architecture (state machine generator, policy determiner, policy enforcer), and simulations showing stable rewards with low overhead under varying attacker aggressiveness. The work demonstrates the practical viability of RL-driven defense for real-time IoT security and provides a foundation for integrating such defenses with existing static analysis approaches.
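To make the reward trade-off concrete, the following is a minimal sketch only, not the paper's formulation. It assumes a hypothetical reward in which blocking an injected event earns more the closer the attacker is to the target, and blocking a benign event costs availability. It also swaps in a tabular Q-learning update as a stand-in for the Deep Q-Network. The names `defender_reward`, `q_update`, `steps_to_target`, `block_gain`, and `availability_cost` are illustrative assumptions.

```python
def defender_reward(blocked, event_is_injected, steps_to_target,
                    block_gain=1.0, availability_cost=1.0):
    """Hypothetical reward; the weighting scheme is an assumption, not from the paper."""
    proximity_weight = 1.0 / (1 + steps_to_target)  # larger when the attack is near its target
    if blocked and event_is_injected:
        return block_gain * proximity_weight        # correct block
    if blocked and not event_is_injected:
        return -availability_cost                   # benign event blocked: availability loss
    if event_is_injected:
        return -block_gain * proximity_weight       # injection allowed through
    return 0.0                                      # benign event allowed


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (a simple stand-in for a DQN gradient step)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Usage: the defender blocks an injected event one step away from the target.
q_table = {}
r = defender_reward(blocked=True, event_is_injected=True, steps_to_target=1)
new_value = q_update(q_table, "s0", "block", r, "s1", ["block", "allow"])
```

A DQN replaces the dictionary `q_table` with a neural network that estimates the same action values, which matters once the state space is too large to enumerate.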
Abstract
In trigger-action IoT platforms, IoT devices report event conditions to IoT hubs to communicate their cyber states, and the hubs invoke actions in other IoT devices based on functional dependencies defined as rules in a rule engine. These functional dependencies create a chain of interactions that helps automate network tasks. Adversaries exploit this chain by reporting fake event conditions to IoT hubs, performing remote injection attacks on a smart environment to indirectly control targeted IoT devices. Existing defense efforts usually depend on static analysis of IoT apps to develop rule-based anomaly detection mechanisms. ML-based defense mechanisms in the literature harness physical event fingerprints to detect anomalies in an IoT network. However, these methods often exhibit long response times and a lack of adaptability when facing complicated attacks. In this paper, we propose a deep reinforcement learning based real-time defense system for injection attacks. We define the reward functions for defenders and implement a deep Q-network based approach to identify the optimal defense policy. Our experiments show that the proposed mechanism can effectively and accurately identify and defend against injection attacks with reasonable computation overhead.
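The chain of interactions described above can be illustrated with a small sketch. It assumes a hypothetical rule set in which each reported event triggers zero or more actions, and each action may itself act as a trigger. The rule names and the `propagate` helper are invented for illustration and do not come from the paper.

```python
from collections import deque

# Hypothetical trigger -> actions rules; purely illustrative.
RULES = {
    "smoke_detected": ["open_window", "unlock_door"],
    "open_window": ["turn_off_heater"],
}


def propagate(event, rules):
    """Return every action invoked (breadth-first) by a single reported event."""
    invoked = []
    seen = {event}
    queue = deque([event])
    while queue:
        current = queue.popleft()
        for action in rules.get(current, []):
            if action not in seen:      # guard against cyclic rules
                seen.add(action)
                invoked.append(action)
                queue.append(action)
    return invoked


# A single fake "smoke_detected" report indirectly unlocks the door.
chain = propagate("smoke_detected", RULES)
```

In this toy setting, the attacker never touches the door lock directly. One forged event report is enough for the hub to invoke it, which is the indirect control the abstract refers to.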
