AutoSen: Improving Automatic WiFi Human Sensing through Cross-Modal Autoencoder

Qian Gao, Yanling Hao, Yuanwei Liu

TL;DR

This work tackles the data labeling bottleneck in WiFi sensing by introducing AutoSen, a cross-modal autoencoder that directly links CSI amplitude and sanitized phase to learn informative latent features from unlabeled data. The encoder E extracts a latent representation from the amplitude, and the decoder D reconstructs the sanitized phase from that representation, trained with a mean squared error objective. A downstream few-shot learner then reuses this encoder to perform activity recognition with only a few labels. Evaluations on the UT_HAR benchmark show that AutoSen outperforms single-modal unsupervised baselines and approaches the upper bound set by full supervision on amplitude, highlighting the value of cross-modal features and the effect of latent representation size (best at 256). Overall, AutoSen offers a practical path to automatic WiFi sensing under data scarcity by exploiting cross-modal correlations and minimal labeled data, which improves generalizability and deployment viability.
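The amplitude-to-phase training objective described above can be sketched in a few lines of NumPy. This is a toy illustration, not the authors' network: the linear encoder and decoder, the function name `train_cross_modal_autoencoder`, the learning rate, the epoch count, and the small latent size are all assumptions chosen to keep the example short. The only parts taken from the summary are the direction of the mapping (amplitude in, sanitized phase out) and the mean squared error loss.

```python
import numpy as np


def train_cross_modal_autoencoder(amplitude, phase, latent_dim=8,
                                  lr=0.02, epochs=500, seed=0):
    """Fit a linear encoder E (amplitude -> latent) and a linear decoder D
    (latent -> phase) by gradient descent on the mean squared error.

    Hypothetical stand-in for the paper's model: real CSI encoders are
    nonlinear, and the phase is assumed to be sanitized beforehand.
    """
    rng = np.random.default_rng(seed)
    d_in = amplitude.shape[1]
    d_out = phase.shape[1]
    W_enc = rng.normal(scale=0.1, size=(d_in, latent_dim))   # encoder E
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d_out))  # decoder D
    losses = []
    for _ in range(epochs):
        latent = amplitude @ W_enc        # z = E(amplitude)
        predicted = latent @ W_dec        # D(z), compared against the phase
        err = predicted - phase
        losses.append(float(np.mean(err ** 2)))
        # Gradients of mean(err ** 2) with respect to both weight matrices.
        grad_pred = 2.0 * err / err.size
        grad_dec = latent.T @ grad_pred
        grad_enc = amplitude.T @ (grad_pred @ W_dec.T)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses
```

After training, only the encoder weights would be kept for the downstream task; the decoder serves solely to provide the cross-modal training signal.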

Abstract

WiFi human sensing is highly regarded for its low-cost and privacy advantages in recognizing human activities. However, its effectiveness is largely confined to controlled, single-user, line-of-sight settings, limited by data collection complexities and the scarcity of labeled datasets. Traditional cross-modal methods, aimed at mitigating these limitations by enabling self-supervised learning without labeled data, struggle to extract meaningful features from amplitude-phase combinations. In response, we introduce AutoSen, an innovative automatic WiFi sensing solution that departs from conventional approaches. AutoSen establishes a direct link between amplitude and phase through automated cross-modal autoencoder learning. This autoencoder efficiently extracts valuable features from unlabeled CSI data, encompassing amplitude and phase information while eliminating their respective unique noises. These features are then leveraged for specific tasks using few-shot learning techniques. AutoSen's performance is rigorously evaluated on a publicly accessible benchmark dataset, demonstrating its exceptional capabilities in automatic WiFi sensing through the extraction of comprehensive cross-modal features.
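The abstract does not say which few-shot technique is applied to the learned features. As one plausible illustration, the sketch below uses a nearest-prototype classifier on top of a frozen encoder: each class is represented by the mean latent vector of its few labeled examples, and a query is assigned to the closest prototype. The classifier choice, the linear `encode` helper, and all function names are assumptions for illustration only.

```python
import numpy as np


def encode(x, W_enc):
    """Apply a frozen (already trained) linear encoder; hypothetical helper."""
    return x @ W_enc


def fit_prototypes(support_x, support_y, W_enc):
    """Average the latent vectors of the few labeled samples per class."""
    latent = encode(support_x, W_enc)
    classes = np.unique(support_y)
    prototypes = np.stack([latent[support_y == c].mean(axis=0)
                           for c in classes])
    return classes, prototypes


def predict(query_x, classes, prototypes, W_enc):
    """Assign each query to the class whose prototype is nearest in latent space."""
    latent = encode(query_x, W_enc)
    # Squared Euclidean distance from every query to every prototype.
    dists = ((latent[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]
```

Because the encoder is never updated here, the number of labeled samples only has to be large enough to give a stable mean per class, which is the appeal of pairing unsupervised pretraining with a few-shot stage.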

Paper Structure (12 sections, 7 equations, 2 figures, 4 tables)

Figures (2)

  • Figure 1: Overview of the proposed AutoSen model. AutoSen consists of a cross-modal autoencoder module that extracts features from unlabeled CSI amplitude and phase, and a few-shot learning module that transfers the knowledge to specific tasks. Note that the encoder used in few-shot learning is the one obtained from the cross-modal autoencoder.
  • Figure 2: Illustration of how CSI changes in response to human activity.