WiRD-Gest: Gesture Recognition In The Real World Using Range-Doppler Wi-Fi Sensing on COTS Hardware

Jessica Sanson, Rahul C. Shah, Yazhou Zhu, Rafael Rosales, Valerio Frascolla

Abstract

Wi-Fi sensing has emerged as a promising technique for gesture recognition, yet its practical deployment is hindered by environmental sensitivity and device placement challenges. To overcome these limitations, we propose Wi-Fi Range and Doppler (WiRD)-Gest, a novel system that performs gesture recognition using a single, unmodified Wi-Fi transceiver on a commercial off-the-shelf (COTS) laptop. The system leverages a monostatic full-duplex sensing pipeline capable of extracting Range-Doppler (RD) information. Using this pipeline, we present the first benchmark of deep learning models for gesture recognition based on monostatic sensing. The key innovation lies in how monostatic sensing and spatial (range) information fundamentally transform accuracy, robustness and generalization compared to prior approaches. We demonstrate excellent performance in crowded, unseen public spaces with dynamic interference and additional moving targets, even when the models are trained only on data from controlled environments. In these scenarios, prior Wi-Fi sensing approaches often fail, whereas our system suffers only minor degradation. The WiRD-Gest benchmark and dataset will also be released as open source.

Paper Structure (10 sections, 3 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Visual comparison of Wi-Fi sensing scenarios. (a) Bistatic sensing requires a separate transmitter and receiver. (b) The proposed system captures gestures using a single device.
  • Figure 2: Comparison of transceiver architectures. (a) Bistatic mode separates Tx and Rx on different hardware. (b) Monostatic mode utilizes a shared LO and Baseband processor on a single device.
  • Figure 3: RD maps (single frame) from the public space location showing (a) Up-Down and (b) Rotate gestures. Note the secondary moving targets in the background in both cases.
  • Figure 4: Training loss curves for different models
  • Figure 5: Velocity spectrograms for three consecutive rotate gestures, comparing the effect of using range to filter out background motion on the Doppler information. Top: controlled environment; bottom: public space. Left: with range filtering ($> 1\,\mathrm{m}$); right: without range filtering (all subcarriers).
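
The range filtering contrasted in Figure 5 can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: it assumes channel estimates arranged as a (packets × subcarriers) complex array, an IFFT across subcarriers to obtain range bins, mean subtraction to suppress static clutter, and a gate that discards returns farther than about 1 m (one reading of the caption). The function names, the 160 MHz bandwidth, and the synthetic targets are all hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def doppler_spectrum(csi, bandwidth_hz, max_range_m=None):
    """Doppler power spectrum from channel estimates (hypothetical helper).

    csi: complex array of shape (n_packets, n_subcarriers).
    max_range_m: keep only range bins at or below this range; None keeps all.
    """
    # IFFT across subcarriers turns frequency samples into range (delay) bins.
    profiles = np.fft.ifft(csi, axis=1)
    # Subtract the slow-time mean to suppress static (zero-Doppler) clutter.
    profiles = profiles - profiles.mean(axis=0, keepdims=True)
    # Monostatic (two-way) range of each bin: c / (2 * bandwidth) per bin.
    n_bins = profiles.shape[1]
    bin_range_m = np.arange(n_bins) * C / (2.0 * bandwidth_hz)
    if max_range_m is None:
        keep = np.ones(n_bins, dtype=bool)
    else:
        keep = bin_range_m <= max_range_m
    # FFT across packets (slow time) gives Doppler; centre zero Doppler.
    rd_map = np.fft.fftshift(np.fft.fft(profiles[:, keep], axis=0), axes=0)
    # Sum power over the retained range bins.
    return (np.abs(rd_map) ** 2).sum(axis=1)


def synth_csi(n_packets=128, n_sub=64):
    """Synthetic scene: a weak near target plus a stronger far target."""
    t = np.arange(n_packets)[:, None]
    f = np.arange(n_sub)[None, :]

    def target(range_bin, doppler_bin, amp):
        delay = np.exp(-2j * np.pi * range_bin * f / n_sub)
        motion = np.exp(2j * np.pi * doppler_bin * t / n_packets)
        return amp * delay * motion

    # Near "gesture" in range bin 1; stronger background mover in range bin 5.
    return target(1, 10, 1.0) + target(5, -20, 3.0)


csi = synth_csi()
gated = doppler_spectrum(csi, 160e6, max_range_m=1.0)
ungated = doppler_spectrum(csi, 160e6)
print("gated peak bin:", int(np.argmax(gated)))
print("ungated peak bin:", int(np.argmax(ungated)))
```

In this synthetic scene the gated spectrum peaks at the near target's Doppler bin, while the ungated spectrum is dominated by the stronger far target. By Parseval's theorem, summing power over all range bins equals, up to a constant factor, summing Doppler power over all subcarriers, so the ungated case corresponds to the "all subcarriers" panels.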