RISAR: RIS-assisted Human Activity Recognition with Commercial Wi-Fi Devices

Junshuo Liu; Yunlong Huang; Wei Yang; Zhe Li; Rujing Xiong; Tiebin Mi; Xin Shi; Robert C. Qiu

TL;DR

The paper addresses passive human activity recognition (HAR) in indoor environments, where conventional Wi-Fi setups offer limited spatial diversity. It proposes RISAR, which combines three components: a reconfigurable intelligent surface (RIS) that increases spatial diversity, a high-dimensional factor model (HDFM) for denoising and feature extraction, and a dual-stream spatial-temporal attention network (DS-STAN) for classification. Experiments in hallways, meeting rooms, and offices report average accuracies of up to $97.26\%$ (office) and $90.83\%$ (L-shaped hallway), outperforming PCA baselines and prior HAR methods. The work demonstrates that a RIS is compatible with commodity Wi-Fi devices and highlights the potential of combining RIS, random matrix theory, and attention-based deep learning for robust HAR in real environments.

Abstract

Human activity recognition (HAR) holds significant importance in smart homes, security, and healthcare. Existing systems face limitations because of the insufficient spatial diversity provided by a limited number of antennas. Furthermore, inefficiencies in noise reduction and feature extraction from sensing data pose challenges to recognition performance. This study presents a reconfigurable intelligent surface (RIS)-assisted passive human activity recognition (RISAR) method, compatible with commercial Wi-Fi devices. RISAR leverages a RIS to enhance the spatial diversity of Wi-Fi signals, effectively capturing a wider range of information distributed across the spatial domain. A novel high-dimensional factor model based on random matrix theory is proposed to address noise reduction and feature extraction in the temporal domain. A dual-stream spatial-temporal attention network model is developed to assign variable weights to different characteristics and sequences, mimicking human cognitive processes in prioritizing essential information. Experimental analysis shows that RISAR significantly outperforms existing HAR methods in accuracy and efficiency, achieving an average accuracy of 97.26%. These findings underscore RISAR's adaptability and potential as a robust activity recognition solution in real environments.

Paper Structure (18 sections, 1 theorem, 11 equations, 7 figures, 5 tables, 1 algorithm)

Introduction
Related Work
System Architecture
Hardware Architecture and Testing Settings
Data Collection
Data Analysis and Preprocessing
Activity feature analysis
Data cleaning and compression
Learning-based Algorithms
Experiments and Results
Performance Evaluation of RIS-enabled HAR
Performance Evaluation of HDFM
Performance Evaluation of Amplitude-phase Fusion
Performance Evaluation of Attention Mechanism
Performance Evaluation with Other Existing Methods
...and 3 more sections

Key Result

Theorem 1

Let $\hat{\lambda}_{1, T} \geq \ldots \geq \hat{\lambda}_{N, T}$ denote the eigenvalues of the matrix $\mathbf{R} \mathbf{R}^H$. Under Assumption 1, for $j=1,\ldots,p$, [...]. Moreover, [...] almost surely (a.s.) for $T$ large enough.
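
The displayed formulas of Theorem 1 are not recoverable from this page, so the sketch below does not reproduce the theorem. It only illustrates the general random-matrix setting the statement refers to: when a matrix $\mathbf{R}$ is built from $p$ strong factors plus noise, the $p$ largest eigenvalues of $\mathbf{R}\mathbf{R}^H$ (scaled by $1/T$) separate from the noise bulk. Everything here is an assumption made for illustration: the synthetic data, the function name `count_spiked_eigenvalues`, the unit noise variance, and the use of the Marchenko-Pastur upper edge with a 10% safety margin as a threshold. It is not the paper's HDFM estimator.

```python
import numpy as np

def count_spiked_eigenvalues(R, noise_var=1.0, margin=1.1):
    """Count eigenvalues of (1/T) R R^H above a Marchenko-Pastur-based threshold.

    Hypothetical helper: noise_var and margin are illustrative assumptions.
    """
    N, T = R.shape
    # eigvalsh returns real eigenvalues of a Hermitian matrix in ascending order
    eigvals = np.linalg.eigvalsh(R @ R.conj().T / T)[::-1]  # now descending
    c = N / T                                               # aspect ratio
    mp_edge = noise_var * (1.0 + np.sqrt(c)) ** 2           # upper edge of the noise bulk
    n_spikes = int(np.sum(eigvals > margin * mp_edge))
    return n_spikes, eigvals, mp_edge

# Synthetic complex-valued data: p strong factors plus unit-variance noise.
rng = np.random.default_rng(0)
N, T, p = 30, 2000, 3

def complex_normal(shape):
    """Circular complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

loadings = 3.0 * complex_normal((N, p))   # factor loadings (assumed strong)
factors = complex_normal((p, T))          # latent factor time series
R = loadings @ factors + complex_normal((N, T))

n_spikes, eigvals, mp_edge = count_spiked_eigenvalues(R)
print(n_spikes, eigvals[:5], mp_edge)
```

In this toy setup the count of eigenvalues above the threshold matches the number of planted factors, which is the kind of separation a factor model relies on when splitting signal from noise.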

Figures (7)

Figure 1: The schematic of the RIS-based human activity recognition system.
Figure 2: The architecture of the proposed RISAR system.
Figure 3: Three types of CSI data collection environments.
Figure 4: The spectrogram of one subcarrier's CSI amplitude for different activities: walking, picking up, and standing up (from left to right).
Figure 5: Schematic of our proposed dual-stream spatial-temporal attention deep learning framework. Given CSI amplitude and phase features, we first use the spatial-temporal extractors to extract its activity-related features. Then, these features are fused using the attention mechanism and passed to a dense layer. Finally, the softmax is used to give output for the activity category prediction.
...and 2 more figures
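
The Figure 5 caption describes a pipeline in which amplitude and phase features are extracted separately, fused with an attention mechanism, passed to a dense layer, and classified with softmax. A minimal NumPy sketch of the fusion and classification steps is given below. It is not the authors' DS-STAN: the spatial-temporal extractors are replaced by random feature vectors, and the scoring vector `w_score`, the feature size `d`, the class count `n_classes`, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_fuse(amp_feat, phase_feat, w_score):
    """Fuse two feature streams with softmax attention weights (illustrative)."""
    streams = np.stack([amp_feat, phase_feat], axis=1)     # (batch, 2, d)
    scores = streams @ w_score                             # (batch, 2)
    weights = softmax(scores, axis=1)                      # one weight per stream
    fused = np.sum(weights[..., None] * streams, axis=1)   # (batch, d)
    return fused, weights

def classify(fused, w_dense, b_dense):
    """Dense layer followed by softmax over activity classes."""
    return softmax(fused @ w_dense + b_dense, axis=1)

# Stand-ins for the outputs of the two spatial-temporal extractors.
rng = np.random.default_rng(0)
batch, d, n_classes = 4, 16, 3            # assumed sizes, not from the paper
amp_feat = rng.standard_normal((batch, d))
phase_feat = rng.standard_normal((batch, d))
w_score = rng.standard_normal(d)          # untrained attention scoring vector
w_dense = rng.standard_normal((d, n_classes))
b_dense = np.zeros(n_classes)

fused, weights = attention_fuse(amp_feat, phase_feat, w_score)
probs = classify(fused, w_dense, b_dense)
print(weights.shape, probs.shape)
```

Each row of `weights` sums to one, so the fused vector is a convex combination of the amplitude and phase features; in a trained model these weights would be learned rather than random.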

Theorems & Definitions (1)

Theorem 1
