WiFlexFormer: Efficient WiFi-Based Person-Centric Sensing

Julian Strohmayer, Matthias Wödlinger, Martin Kampel

TL;DR

This work benchmarks WiFlexFormer against state-of-the-art vision and specialized architectures for processing radio frequency data and demonstrates that it achieves comparable Human Activity Recognition (HAR) performance while offering a significantly lower parameter count and faster inference times.

Abstract

We propose WiFlexFormer, a highly efficient Transformer-based architecture designed for WiFi Channel State Information (CSI)-based person-centric sensing. We benchmark WiFlexFormer against state-of-the-art vision and specialized architectures for processing radio frequency data and demonstrate that it achieves comparable Human Activity Recognition (HAR) performance while offering a significantly lower parameter count and faster inference times. With an inference time of just 10 ms on an Nvidia Jetson Orin Nano, WiFlexFormer is optimized for real-time inference. Additionally, its low parameter count contributes to improved cross-domain generalization, where it often outperforms larger models. Our comprehensive evaluation shows that WiFlexFormer is a potential solution for efficient, scalable WiFi-based sensing applications. The PyTorch implementation of WiFlexFormer is publicly available at: https://github.com/StrohmayerJ/WiFlexFormer.

Paper Structure

This paper contains 18 sections, 3 figures, and 6 tables.

Figures (3)

  • Figure 1: The proposed WiFlexFormer architecture. Convolution parameters are denoted as: [input channels $\Rightarrow$ number of filters, kernel size]. The final linear layer has 32 input features and $c$ output features, where $c$ is the number of classes. Only the output at the position of the class token is used for the prediction; the outputs at the remaining positions are discarded.
  • Figure 2: (a-b) 3DO recording setup over three consecutive days: (a) setup on days 1 and 2, and (b) setup on day 3, featuring static environmental variations due to furniture rearrangement. The transmitter-receiver arrangement and the designated activity area remain fixed throughout the experiment. (c) Widar3.0 recording setup featuring a single transmitter and six receivers.
  • Figure 3: Comparative analysis of subcarrier selection strategies for (a) amplitude features, and (b) DFS features using the 3DO dataset. Strategies include: (1) None: use of all subcarriers; (2) R$n$: random selection of $n$ subcarriers; (3) U$n$: uniform selection of every $n$th subcarrier; (4) B$n$-$m$: division into $n$ subcarrier bands with random selection of $m$ subcarriers from each band; and (5) PC$n$: selection of the first $n$ principal components.
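
The index-based subcarrier selection strategies named in the Figure 3 caption (None, R$n$, U$n$, B$n$-$m$) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the choice to start uniform selection at index 0, the use of contiguous near-equal-width bands, and the count of 52 subcarriers in the usage lines are all assumptions made here. PC$n$ is left out because it projects the features onto principal components rather than picking subcarrier indices.

```python
import random


def select_none(num_subcarriers):
    """None: keep every subcarrier index."""
    return list(range(num_subcarriers))


def select_random(num_subcarriers, n, rng):
    """Rn: n distinct subcarrier indices drawn at random."""
    return sorted(rng.sample(range(num_subcarriers), n))


def select_uniform(num_subcarriers, n):
    """Un: every n-th subcarrier (starting at index 0, an assumption)."""
    return list(range(0, num_subcarriers, n))


def select_banded(num_subcarriers, n_bands, m, rng):
    """Bn-m: split into n_bands contiguous bands, draw m indices from each.

    Band edges are placed at near-equal widths (an assumption); a band
    narrower than m makes random.sample raise ValueError.
    """
    edges = [round(i * num_subcarriers / n_bands) for i in range(n_bands + 1)]
    chosen = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        chosen.extend(rng.sample(range(lo, hi), m))
    return sorted(chosen)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the random strategies are repeatable
    num_subcarriers = 52    # hypothetical count, not taken from the paper
    print("U4  :", select_uniform(num_subcarriers, 4))
    print("R8  :", select_random(num_subcarriers, 8, rng))
    print("B4-2:", select_banded(num_subcarriers, 4, 2, rng))
```

The returned index lists would then be used to slice the subcarrier axis of an amplitude or DFS feature array before it is passed to the model.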