ImmCOGNITO: Identity Obfuscation in Millimeter-Wave Radar-Based Gesture Recognition for IoT Environments
Ying Liu, Si Zuo, Chao Yang, Yuqing Song, Dariush Salami, Stephan Sigg
TL;DR
The paper addresses identity leakage in mmWave radar gesture data by proposing ImmCOGNITO, a graph-based autoencoder that preserves gesture-relevant structure while suppressing identity cues. By constructing a Temporal Graph and applying a message-passing neural network with multi-head self-attention, the model reconstructs a de-identified point cloud $p'$ that retains gesture discriminability while reducing identity recognition accuracy. The training jointly optimizes reconstruction, gesture preservation, and de-identification losses, with a stabilized de-identification objective and a gated final loss to balance utility and privacy. Empirical results on PantoRad and MHomeGes show substantial privacy gains (identification accuracy dropping from tens of percent to single digits) with modest gesture-utility loss, demonstrating a practical path toward privacy-preserving radar sensing in IoT environments.
Abstract
Millimeter-Wave (mmWave) radar enables camera-free gesture recognition for Internet of Things (IoT) interfaces, with robustness to lighting variations and partial occlusions. However, recent studies reveal that its data can inadvertently encode biometric signatures, raising critical privacy challenges for IoT applications. In particular, we demonstrate that mmWave radar point cloud data can leak identity-related information in the absence of explicit identity labels. To address this risk, we propose ImmCOGNITO, a graph-based autoencoder that transforms radar gesture point clouds to preserve gesture-relevant structure while suppressing identity cues. The encoder first constructs a directed graph for each sequence using Temporal Graph KNN, with edges defined to capture inter-frame temporal dynamics. A message-passing neural network with multi-head self-attention then aggregates local and global spatio-temporal context, and the global max-pooled feature is concatenated with the original features. The decoder then reconstructs a minimally perturbed point cloud that retains gesture-discriminative attributes while achieving de-identification. Training jointly optimizes reconstruction, gesture-preservation, and de-identification objectives. Evaluations on two public datasets, PantoRad and MHomeGes, show that ImmCOGNITO substantially reduces identification accuracy while maintaining high gesture recognition performance.
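
To make the graph-construction step more concrete, the following is a minimal sketch of one way a temporal KNN graph over a radar point cloud sequence could be built. It is not the authors' implementation: the function name `temporal_knn_edges`, the choice of `k`, and the rule that each point receives directed edges from its k nearest points in the preceding non-empty frame are all assumptions made for illustration, since the abstract only states that edges capture inter-frame temporal dynamics.

```python
import numpy as np


def temporal_knn_edges(points, frames, k=3):
    """Build directed inter-frame KNN edges (hypothetical helper).

    points: (N, 3) array-like of x, y, z coordinates.
    frames: (N,) array-like with the integer frame index of each point.
    Returns a (2, E) integer array whose rows are (source, target) indices.
    Assumption of this sketch: every point in a frame receives edges from
    its k nearest points in the closest earlier non-empty frame.
    """
    points = np.asarray(points, dtype=float)
    frames = np.asarray(frames)
    frame_ids = np.unique(frames)  # sorted unique frame indices
    sources, targets = [], []
    for prev, cur in zip(frame_ids[:-1], frame_ids[1:]):
        prev_idx = np.flatnonzero(frames == prev)
        cur_idx = np.flatnonzero(frames == cur)
        # Pairwise Euclidean distances, shape (len(cur_idx), len(prev_idx)).
        diff = points[cur_idx][:, None, :] - points[prev_idx][None, :, :]
        dist = np.linalg.norm(diff, axis=-1)
        kk = min(k, len(prev_idx))  # a frame may hold fewer than k points
        nearest = np.argsort(dist, axis=1)[:, :kk]
        sources.append(prev_idx[nearest].ravel())
        targets.append(np.repeat(cur_idx, kk))
    if not sources:  # single-frame sequence: no temporal edges
        return np.empty((2, 0), dtype=int)
    return np.stack([np.concatenate(sources), np.concatenate(targets)])


# Two frames with two points each; each later point links to its nearest
# predecessor in the earlier frame.
pts = [[0, 0, 0], [10, 0, 0], [0.1, 0, 0], [9.9, 0, 0]]
edges = temporal_knn_edges(pts, [0, 0, 1, 1], k=1)
```

In this sketch the edge direction runs from the earlier frame to the later one, so a message-passing layer applied to `edges` would propagate information forward in time. Other conventions (bidirectional edges, additional intra-frame neighbors) are equally plausible readings of the description.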
