CrossFi: A Cross Domain Wi-Fi Sensing Framework Based on Siamese Network

Zijian Zhao, Tingwei Chen, Zhijie Cai, Xiaoyang Li, Hang Li, Qimei Chen, Guangxu Zhu

TL;DR

CrossFi addresses the critical problem of cross-domain robustness in Wi-Fi sensing by introducing CSi-Net, an attention-based similarity calculator, and Weight-Net, an adaptive template generator. The framework extends Siamese networks to support in-domain, few-shot, zero-shot, and new-class scenarios, using data-driven template generation and domain-adaptive training (including optional MK-MMD) to bridge domain gaps. Empirical results on the WiGesture CSI-BERT dataset demonstrate state-of-the-art performance in gesture recognition and people identification across all target scenarios, with strong ablations validating the design choices. The approach enables practical, device-free sensing in IoT deployments and shows promise for broader cross-domain applications beyond Wi-Fi sensing.

Abstract

In recent years, Wi-Fi sensing has garnered significant attention due to its numerous benefits, such as privacy protection, low cost, and penetration ability. Extensive research has been conducted in this field, focusing on areas such as gesture recognition, people identification, and fall detection. However, many data-driven methods encounter challenges related to domain shift, where the model fails to perform well in environments different from those of the training data. One major factor contributing to this issue is the limited availability of Wi-Fi sensing datasets, which causes models to learn excessive irrelevant information and over-fit to the training set. Unfortunately, collecting large-scale Wi-Fi sensing datasets across diverse scenarios is a challenging task. To address this problem, we propose CrossFi, a siamese network-based approach that excels in both in-domain and cross-domain scenarios, including few-shot and zero-shot scenarios, and even works in the few-shot new-class scenario, where the testing set contains new categories. The core component of CrossFi is a sample-similarity calculation network called CSi-Net, which improves the structure of the siamese network by using an attention mechanism to capture similarity information, instead of simply calculating the distance or cosine similarity. Building on CSi-Net, we develop an additional Weight-Net that generates a template for each class, so that CrossFi can work in different scenarios. Experimental results demonstrate that CrossFi achieves state-of-the-art performance across various scenarios. In the gesture recognition task, CrossFi achieves an accuracy of 98.17% in the in-domain scenario, 91.72% in the one-shot cross-domain scenario, 64.81% in the zero-shot cross-domain scenario, and 84.75% in the one-shot new-class scenario. The code for our model is publicly available at https://github.com/RS2002/CrossFi.
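
To make the template idea concrete, the following is a minimal sketch of template-based classification, not the authors' implementation. It assumes each class has a few support feature vectors with given weights, builds a weighted-average template per class, and assigns a query to the most similar template. Cosine similarity is used here only as the simple baseline that the abstract says CSi-Net replaces with an attention mechanism, and the weights are fixed by hand where Weight-Net would produce them. The names `weighted_template`, `cosine_similarity`, `classify`, and the toy data are all hypothetical.

```python
import math


def weighted_template(samples, weights):
    """Weighted average of feature vectors; weights are normalized by their sum."""
    total = sum(weights)
    dim = len(samples[0])
    return [
        sum(w * s[i] for w, s in zip(weights, samples)) / total
        for i in range(dim)
    ]


def cosine_similarity(a, b):
    """Baseline similarity (assumption: stands in for a learned similarity network)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def classify(query, templates):
    """Return the label whose template is most similar to the query."""
    return max(templates, key=lambda label: cosine_similarity(query, templates[label]))


# Toy support set: label -> (feature vectors, hand-picked weights).
support = {
    "wave": ([[1.0, 0.1], [0.8, 0.3]], [0.7, 0.3]),
    "push": ([[0.1, 1.0], [0.2, 0.9]], [0.5, 0.5]),
}
templates = {label: weighted_template(s, w) for label, (s, w) in support.items()}
prediction = classify([0.9, 0.2], templates)
print(prediction)
```

Because classification only compares a query against templates, adding a new class in this sketch requires only a new entry in `templates`, which is the property that makes the siamese formulation attractive for few-shot and new-class settings.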

Paper Structure

This paper contains 39 sections, 10 equations, 15 figures, 9 tables, 2 algorithms.

Figures (15)

  • Figure 1: Comparison Between Siamese Network and Traditional Classification Network: Different shapes represent different categories. The blue and red items represent samples from the source domain and target domain, respectively. The green line represents the classification boundary. These conventions remain consistent across the following figures.
  • Figure 2: Workflow: Our model can be organized into four main phases: data collection, data pre-processing, training, and inference. The training phase encompasses two stages, namely comparative learning and template learning. The red character 'f' represents the template generation function, which is a weighted average operation in the in-domain and few-shot scenarios and an argmax operation in the zero-shot scenario. It has the same meaning in Fig. 4.
  • Figure 3: Architecture of CSi-Net: CSi-Net utilizes ResNet as a feature extractor and employs a multi-attention mechanism to compute similarity.
  • Figure 4: Illustration of the template generation method: The proposed Weight-Net is presented within the dashed box. Here, $k, t, D, n$ represent the sample number, packet number, number of sub-carriers across all antennas, and class number, respectively.
  • Figure 5: Effect of MK-MMD: The blue and red points represent samples from the source domain and target domain, respectively. The green line indicates the classification boundary. MK-MMD helps align the data distributions of the source and target domains in feature space, thereby enabling the classification boundary to function effectively in the target domain.
  • ...and 10 more figures
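
The alignment effect described in the Figure 5 caption can be illustrated with a small sketch of a multi-kernel maximum mean discrepancy (MK-MMD) estimate. This is an assumption-laden illustration rather than the paper's configuration: it uses the biased squared-MMD estimator, Gaussian kernels, uniform kernel weights, and hand-picked bandwidths, none of which are specified in this section. The function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def multi_kernel(x, y, bandwidths):
    """Uniformly weighted mix of Gaussian kernels (assumption: equal weights)."""
    return sum(gaussian_kernel(x, y, b) for b in bandwidths) / len(bandwidths)


def mean_kernel(xs, ys, bandwidths):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(multi_kernel(x, y, bandwidths) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mk_mmd_squared(source, target, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of squared MMD between source and target features."""
    return (
        mean_kernel(source, source, bandwidths)
        + mean_kernel(target, target, bandwidths)
        - 2.0 * mean_kernel(source, target, bandwidths)
    )


source = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
near_target = [[0.05, 0.0], [0.1, 0.0], [-0.1, 0.05]]
far_target = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]

print(mk_mmd_squared(source, near_target) < mk_mmd_squared(source, far_target))
```

In a training loop, a term of this kind would typically be added to the task loss so that minimizing it pulls the source and target feature distributions together, which is the behavior the caption attributes to MK-MMD.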