CSI-Bench: A Large-Scale In-the-Wild Dataset for Multi-task WiFi Sensing

Guozhen Zhu, Yuqian Hu, Weihang Gao, Wei-Hsiang Wang, Beibei Wang, K. J. Ray Liu

TL;DR

CSI-Bench addresses the lack of large-scale, real-world WiFi sensing data by providing over 461 hours of in-the-wild amplitude CSI collected from 35 users across 26 environments and 16 device types. It supports seven single-task and three multi-task objectives, with standardized splits and baseline results, enabling evaluation of both single-task and multi-task models, including parameter-efficient adapters. The experiments show strong performance from transformer-based architectures but also notable generalization gaps under out-of-distribution conditions (cross-user, cross-environment, cross-device), underscoring the need for domain-adaptive methods. The benchmark also demonstrates practical benefits for edge deployment: multi-task learning reduces model size and training time while maintaining accuracy. By releasing code and data, CSI-Bench provides a scalable platform for privacy-preserving health sensing and broader human-centric WiFi applications.

Abstract

WiFi sensing has emerged as a compelling contactless modality for human activity monitoring by capturing fine-grained variations in Channel State Information (CSI). Its ability to operate continuously and non-intrusively while preserving user privacy makes it particularly suitable for health monitoring. However, existing WiFi sensing systems struggle to generalize in real-world settings, largely due to datasets collected in controlled environments with homogeneous hardware and fragmented, session-based recordings that fail to reflect continuous daily activity. We present CSI-Bench, a large-scale, in-the-wild benchmark dataset collected using commercial WiFi edge devices across 26 diverse indoor environments with 35 real users. Spanning over 461 hours of effective data, CSI-Bench captures realistic signal variability under natural conditions. It includes task-specific datasets for fall detection, breathing monitoring, localization, and motion source recognition, as well as a co-labeled multitask dataset with joint annotations for user identity, activity, and proximity. To support the development of robust and generalizable models, CSI-Bench provides standardized evaluation splits and baseline results for both single-task and multi-task learning. CSI-Bench offers a foundation for scalable, privacy-preserving WiFi sensing systems in health and broader human-centric applications.
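
The cross-user evaluation setting named in the TL;DR can be illustrated with a minimal sketch. This is not the benchmark's released split code: the record fields (`user`, `env`, `label`), the example values, and the function name `cross_user_split` are all hypothetical, chosen only to show the idea that no user appears on both sides of the split.

```python
from typing import Dict, List, Set, Tuple

Sample = Dict[str, str]  # hypothetical record layout, not the CSI-Bench schema


def cross_user_split(
    samples: List[Sample], held_out_users: Set[str]
) -> Tuple[List[Sample], List[Sample]]:
    """Put every sample from a held-out user in the test set, the rest in train."""
    train = [s for s in samples if s["user"] not in held_out_users]
    test = [s for s in samples if s["user"] in held_out_users]
    return train, test


# Toy metadata only; real samples would also carry the CSI amplitude arrays.
samples = [
    {"user": "u01", "env": "home_a", "label": "walk"},
    {"user": "u01", "env": "office_b", "label": "fall"},
    {"user": "u02", "env": "home_a", "label": "walk"},
    {"user": "u03", "env": "home_c", "label": "run"},
]

train, test = cross_user_split(samples, held_out_users={"u03"})
```

The same pattern would apply to cross-environment or cross-device splits by filtering on a different field instead of `user`.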

Paper Structure

This paper contains 49 sections, 5 figures, 15 tables.

Figures (5)

  • Figure 1: CSI-Bench overview. The benchmark features multiple commercial routers and IoT devices deployed in real homes and offices to collect CSI data. It supports a wide range of human-centric sensing tasks, enabling robust model development across diverse hardware setups and real-world scenarios.
  • Figure 2: Representative CSI samples are shown for various scenarios, including human actions (jumping, running, walking, hand waving, falling, breathing), non-human motions (pet movement, iRobot, fan), and empty environments. In each sample, the x-axis represents time, and the y-axis represents the subcarrier index.
  • Figure 3: MATLAB-based CSI verification tool. (a) Visualization of CSI quality from three devices, showing variations in sampling interval, time-subcarrier heatmap, and amplitude response. (b) User interface for parsing and evaluating CSI data, supporting timestamp checks, amplitude analysis, and figure export to ensure data reliability in CSI-Bench.
  • Figure 4: Average CSI quality scores of 16 widely used IoT devices evaluated using our CSI verification tool. Each bar represents the mean score across five measurement trials, with error bars indicating the standard deviation.
  • Figure 5: Screenshot of the Google Spreadsheet–based annotation tool used by participants to record activities and timestamps during data collection.
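
The per-device summary described in the Figure 4 caption (mean score over five trials, with the standard deviation as the error bar) can be sketched as follows. The device names and scores are made up for illustration, and the use of the sample standard deviation is an assumption, since the caption does not say whether the sample or population form was used.

```python
import statistics
from typing import Dict, List, Tuple


def summarize_trials(scores: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) for one device's trial scores."""
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical quality scores for two devices, five trials each.
trial_scores: Dict[str, List[float]] = {
    "device_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "device_b": [4.0, 4.0, 4.0, 4.0, 4.0],
}

# Bar height is the mean; error-bar length is the standard deviation.
summary = {name: summarize_trials(vals) for name, vals in trial_scores.items()}
```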