UniCrossFi: A Unified Framework For Cross-Domain Wi-Fi-based Gesture Recognition

Ke Xu, Zhiyong Zheng, Hongyuan Zhu, Lei Wang, Jiangtao Wang

TL;DR

UniCrossFi tackles cross-domain generalization in CSI-based gesture recognition by unifying domain generalization (DG) and semi-supervised domain generalization (SSDG) within a single framework. It introduces Antenna Response Consistency (ARC), a physics-informed augmentation that leverages multi-antenna spatial diversity, and a Unified Contrastive Objective that combines $L_{UCon}$, $L_{SCon}$, and $L_{CE}$ to learn domain-invariant yet discriminative features from labeled and unlabeled data. Experiments on Widar and CSIDA show state-of-the-art performance across DG, SSDG, and unsupervised domain adaptation (UDA) benchmarks, with pronounced gains when labeled data are scarce. The results underscore the value of grounding contrastive learning in the physical properties of wireless signals to enable robust, deployable Wi-Fi sensing systems.

Abstract

Wi-Fi sensing systems are severely hindered by the cross-domain problem when deployed in unseen real-world environments. Existing methods typically design separate frameworks for either domain adaptation or domain generalization, and those designed for domain generalization often rely on extensive labeled data. However, real-world scenarios are far more complex: the deployed model must generalize with only limited labeled source data. To this end, we propose UniCrossFi, a unified framework designed to mitigate the performance drop in CSI-based sensing across diverse deployment settings. Our framework not only extends conventional Domain Generalization (DG) to a more practical Semi-Supervised Domain Generalization (SSDG) setting, where only partially labeled source data are available, but also introduces a physics-informed data augmentation strategy, Antenna Response Consistency (ARC). ARC mitigates the risk of learning superficial shortcuts by exploiting the intrinsic spatial diversity of multi-antenna systems, treating signals from different antennas as naturally augmented views of the same event. In addition, we design a Unified Contrastive Objective to prevent conventional contrastive learning from pushing apart samples from different domains that share the same class. We conduct extensive experiments on the public Widar and CSIDA datasets. The results demonstrate that UniCrossFi consistently establishes a new state of the art, significantly outperforming existing methods across all unsupervised domain adaptation, DG, and SSDG benchmarks. UniCrossFi provides a principled and practical solution to the domain shift challenge, advancing the feasibility of robust, real-world Wi-Fi sensing systems that can operate effectively with limited labeled data.
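The idea of treating per-antenna signals as naturally augmented views can be illustrated with a small sketch. It only shows how positive pairs might be formed from antennas that observed the same gesture event; the helper names `antenna_views` and `positive_pairs`, and the toy amplitude values, are assumptions made for illustration and are not taken from the paper.

```python
def antenna_views(events):
    """Flatten events into per-antenna views.

    events: list of gesture events, each a list of per-antenna CSI streams
            (hypothetical toy layout, one short amplitude list per antenna).
    Returns the flat list of views and, for each view, the id of its event.
    """
    views, event_ids = [], []
    for event_id, streams in enumerate(events):
        for stream in streams:
            views.append(stream)
            event_ids.append(event_id)
    return views, event_ids


def positive_pairs(event_ids):
    """Index pairs (i, j), i < j, whose views come from the same event."""
    n = len(event_ids)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if event_ids[i] == event_ids[j]]


# Two toy gesture events, each observed by three receive antennas.
events = [
    [[0.9, 1.1, 1.0], [0.7, 0.8, 0.9], [1.2, 1.0, 1.1]],
    [[0.2, 0.4, 0.3], [0.3, 0.5, 0.4], [0.1, 0.3, 0.2]],
]
views, event_ids = antenna_views(events)
pairs = positive_pairs(event_ids)
print(pairs)
```

Because the pairs come from real antennas rather than synthetic transformations, no hand-designed augmentation is needed to obtain two views of the same event.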

Paper Structure (36 sections, 15 equations, 6 figures, 7 tables)


Figures (6)

  • Figure 1: Cross-Domain settings. (a) Domain Generalization: Only labeled source-domain data are used for training, while target-domain data remain unseen during training. (b) Semi-Supervised Domain Generalization: Only a subset of the source-domain data is labeled, and no target-domain data are available during training.
  • Figure 2: Overview of the proposed UniCrossFi framework. The upper part of the figure presents the end-to-end workflow. Each CSI sample first undergoes preprocessing to remove the phase offset and the static path, mitigating the shift in $p(y|h)$. The processed signals from different antennas are treated as natural intra-domain augmented views, while one antenna’s sample is adaptively normalized to generate a cross-domain ARC-augmented view. All anchor and augmented samples are then fed into the loss module for joint optimization. The overall objective integrates supervised and unsupervised contrastive losses ($L_{SupCon}$ and $L_{Con}$) in a weighted form, together with a cross-entropy ($L_{CE}$) term for source-domain classification, to reduce undesired repulsion between same-class samples across domains. The lower part of the figure illustrates data flows under the two cross-domain settings, DG and SSDG. In DG, only labeled source-domain data are used for training. SSDG extends this by including both labeled and unlabeled source-domain data. Both settings follow the same training pipeline within the proposed framework.
  • Figure 3: Overview of the proposed network architecture. The model takes the input signal and processes it sequentially through four main components: a convolutional block (CB), a residual block (RB), a downsample residual block (DRB), and an adaptive pooling layer. The CB consists of a 7×7 convolutional layer followed by batch normalization (BN), a ReLU activation, and a max pooling layer. The RB follows a standard residual structure with two consecutive 3×3 convolutional layers, each followed by BN and ReLU, and a residual shortcut connection. The DRB extends the RB by introducing an additional 1×1 convolution and BN in the residual path to enable downsampling.
  • Figure 4: SSDG performance on the Widar dataset for ERM, SimCLR, and UniCrossFi.
  • Figure 5: Performance comparison of WiSDA, COTMIX, AdvSKM, and our method with and without TDCSI preprocessing on Widar cross-domain tasks under the domain generalization setting.
  • ...and 1 more figure
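The Figure 2 caption describes an objective that weights a supervised and an unsupervised contrastive term and adds a cross-entropy term. The following is a minimal sketch of how such a combination could look, not the authors' implementation. The weights `alpha` and `lam`, the temperature `tau`, the exact form of the weighting, and all function names are assumptions. Unlabeled samples (label `None`) contribute only to the unsupervised term, which mirrors the SSDG setting.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(feats, is_positive, tau=0.1):
    """InfoNCE-style loss; is_positive(i, j) marks the positives of anchor i."""
    n = len(feats)
    total, anchors = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        pos = [j for j in others if is_positive(i, j)]
        if not pos:
            continue  # an anchor without positives contributes nothing
        logits = {j: cosine(feats[i], feats[j]) / tau for j in others}
        m = max(logits.values())  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits.values()))
        total += -sum(logits[j] - log_denom for j in pos) / len(pos)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def total_loss(feats, event_ids, labels, class_logits,
               alpha=0.5, lam=1.0, tau=0.1):
    """Assumed form: L_CE + lam * (alpha * L_SupCon + (1 - alpha) * L_Con)."""
    # Unsupervised term: positives are views of the same event.
    l_con = contrastive_loss(
        feats, lambda i, j: event_ids[i] == event_ids[j], tau)
    # Supervised term: positives share a class label, regardless of domain.
    l_sup = contrastive_loss(
        feats,
        lambda i, j: labels[i] is not None and labels[i] == labels[j], tau)
    labeled = [i for i, y in enumerate(labels) if y is not None]
    l_ce = sum(cross_entropy(class_logits[i], labels[i])
               for i in labeled) / max(len(labeled), 1)
    return l_ce + lam * (alpha * l_sup + (1 - alpha) * l_con)


# Toy batch: two views per event; events 0 and 1 share class 0 but are
# meant to stand in for different domains, event 2 belongs to class 1.
feats = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.7, 0.2], [0.0, 1.0], [0.1, 0.9]]
event_ids = [0, 0, 1, 1, 2, 2]
labels = [0, 0, 0, 0, 1, 1]
class_logits = [[2.0, 0.0], [1.5, 0.1], [1.0, 0.2], [1.2, 0.3], [0.1, 2.0], [0.0, 1.8]]
print(total_loss(feats, event_ids, labels, class_logits))
```

In this sketch the pair (0, 2) is a negative under the unsupervised term but a positive under the supervised term, so raising `alpha` reduces the repulsion between same-class samples from different domains that the caption refers to.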