DeviceRadar: Online IoT Device Fingerprinting in ISPs using Programmable Switches
Ruoyu Li, Qing Li, Tao Lin, Qingsong Zou, Dan Zhao, Yucheng Huang, Gareth Tyson, Guorui Xie, Yong Jiang
TL;DR
DeviceRadar tackles online IoT device fingerprinting in ISP networks, where middleboxes obscure traditional signals and the sheer traffic volume demands real-time, high-throughput processing. It introduces an in-network framework that uses key packets (defined by stable packet sizes and directions) and an NLP-inspired packet embedding to build a neighboring key packet distribution, which feeds a per-device CART classifier deployed entirely on a P4 data plane. The control plane learns embeddings, probability matrices, and key packets offline, while the data plane performs line-rate inference within its constrained operations and memory, achieving 40 Gbps throughput and sub-millisecond latency. Across 77 IoT devices and through NAT/VPN middlebox scenarios, DeviceRadar attains state-of-the-art accuracy while requiring only 1.3% of the processing time of GPU-accelerated approaches, demonstrating practical viability for ISP defense workflows and real-time mitigation of IoT-driven threats.
Abstract
Device fingerprinting can be used by Internet Service Providers (ISPs) to identify vulnerable IoT devices for early prevention of threats. However, due to the wide deployment of middleboxes in ISP networks, some important data, e.g., 5-tuples and flow statistics, are often obscured, rendering many existing approaches invalid. Fingerprinting is further challenged by the high-speed traffic in ISP networks, which amounts to hundreds of terabytes per day. This paper proposes DeviceRadar, an online IoT device fingerprinting framework that achieves accurate, real-time processing in ISPs using programmable switches. We exploit "key packets", which appear periodically and differ across IoT devices, as the basis of fingerprints, using only packet sizes and directions. To utilize them, we propose a packet size embedding model to discover the spatial relationships between packets. We also design an algorithm to extract the key packets of each device, and propose an approach that jointly considers the spatial relationships and the key packets to produce a neighboring key packet distribution, which serves as a feature vector for machine learning models at inference time. Last, we design a model transformation method and a feature extraction process to deploy the model on a programmable data plane within its constrained arithmetic operations and memory, achieving line-speed processing. Our experiments show that DeviceRadar achieves state-of-the-art accuracy across 77 IoT devices with 40 Gbps throughput, and requires only 1.3% of the processing time of GPU-accelerated approaches.
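To make the idea of a key-packet-based feature vector more concrete, the following is a minimal Python sketch under strong simplifying assumptions. It is not the paper's algorithm: the embedding model and the classifier are omitted, "key packets" are approximated as signed packet sizes that recur in most training windows, and the feature is a plain normalized histogram over those key packets. The function names, the `"in"`/`"out"` direction labels, and the `min_fraction` threshold are all hypothetical.

```python
from collections import Counter


def signed_sizes(packets):
    """Encode (size, direction) pairs as signed ints: +size outbound, -size inbound."""
    return [size if direction == "out" else -size for size, direction in packets]


def extract_key_packets(windows, min_fraction=0.8):
    """Return signed sizes present in at least min_fraction of the windows.

    Assumption: recurring across windows stands in for "stable and periodic".
    """
    presence = Counter()
    for window in windows:
        presence.update(set(window))  # count each size once per window
    threshold = min_fraction * len(windows)
    return sorted(s for s, n in presence.items() if n >= threshold)


def key_packet_distribution(window, key_packets):
    """Normalized histogram of key packets in one window (the feature vector)."""
    key_set = set(key_packets)
    counts = Counter(s for s in window if s in key_set)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(key_packets)
    return [counts[k] / total for k in key_packets]


# Hypothetical training windows for one device (signed packet sizes).
training_windows = [
    [100, -60, 100, 1500],
    [100, -60, 40],
    [100, -60, -60, 999],
]
keys = extract_key_packets(training_windows)
features = key_packet_distribution(training_windows[0], keys)
```

In this simplified form, the vector produced by `key_packet_distribution` could be passed to any decision tree learner. The approach described in the abstract additionally uses the learned packet size embedding to account for packets neighboring the key packets, which this sketch does not attempt.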
