CSI4Free: GAN-Augmented mmWave CSI for Improved Pose Classification
Nabeel Nisar Bhat, Rafael Berkvens, Jeroen Famaey
TL;DR
The paper addresses data scarcity in mmWave CSI-based pose classification with COTS Wi-Fi by training a conditional Wasserstein GAN (cWGAN) to synthesize $30{,}000$ CSI samples, expanding the real dataset from $1{,}084$ to $31{,}084$ samples. The cWGAN is conditioned on pose labels and trained with a gradient-penalized Wasserstein loss for stable training. The quality of the synthetic samples is validated via GAN-train and GAN-test scores and via improved pose-classification accuracy. The key contributions are a demonstration of stable cWGAN-based CSI augmentation for mmWave COTS data, a large synthetic CSI dataset, and quantified improvements in generalization across three users and eight poses. The approach reduces the data-collection burden and enables broader Joint Communication and Sensing (JC&S) research at mmWave frequencies, with potential for domain adaptation and transfer to related sensing tasks.
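For orientation, the gradient-penalized Wasserstein objective is commonly written as below, here with pose-label conditioning added. This is the widely used WGAN-GP form, not necessarily the exact formulation in the paper. The symbols $D$ (critic), $G$ (generator), $z$ (noise), $y$ (pose label), $\lambda$ (penalty weight), and $\epsilon$ (interpolation coefficient) are notation assumed for this sketch.

$$
L_D = \mathbb{E}_{z, y}\big[D(G(z, y), y)\big] - \mathbb{E}_{x, y}\big[D(x, y)\big] + \lambda\, \mathbb{E}_{\hat{x}, y}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}, y) \rVert_2 - 1\big)^2\Big]
$$

$$
\hat{x} = \epsilon\, x + (1 - \epsilon)\, G(z, y), \quad \epsilon \sim U[0, 1], \qquad L_G = -\,\mathbb{E}_{z, y}\big[D(G(z, y), y)\big]
$$

The penalty term pushes the critic's gradient norm toward 1 on points interpolated between real and generated samples, which is the usual explanation for the improved training stability of this loss.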
Abstract
In recent years, Joint Communication and Sensing (JC&S) has demonstrated significant success, particularly at sub-6 GHz frequencies, where commercial-off-the-shelf (COTS) Wi-Fi devices are used for applications such as localization, gesture recognition, and pose classification. Deep learning and the existence of large public datasets have been pivotal in achieving such results. However, at mmWave frequencies (30-300 GHz), which have shown potential for more accurate sensing performance, there is a noticeable lack of research on COTS Wi-Fi sensing. Limited research hardware, the absence of large datasets, limited functionality in COTS hardware, and the complexity of data collection all hinder a comprehensive exploration of this field. In this work, we aim to address these challenges by developing a method that can generate synthetic mmWave channel state information (CSI) samples. In particular, we train a generative adversarial network (GAN) on an existing dataset to generate 30,000 additional CSI samples. The augmented samples are highly consistent with the original data, as indicated by high GAN-train and GAN-test scores. Furthermore, we use the augmented samples when training a pose classification model. We observe that the augmented samples complement the real data and improve the generalization of the classification model.
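The GAN-train and GAN-test scores mentioned above follow a simple protocol: GAN-train is the accuracy of a classifier trained on synthetic data and evaluated on real data, and GAN-test is the accuracy of a classifier trained on real data and evaluated on synthetic data. The sketch below illustrates only this protocol. The nearest-centroid classifier and the `toy_dataset` helper are hypothetical stand-ins, not the paper's classifier or CSI data.

```python
import random
from collections import defaultdict


def fit_centroids(samples, labels):
    """Nearest-centroid 'classifier': one mean vector per class label."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)),
    )


def accuracy(centroids, samples, labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == y for x, y in zip(samples, labels))
    return hits / len(labels)


def gan_train_score(synth, real):
    """GAN-train: fit on synthetic data, evaluate on real data."""
    return accuracy(fit_centroids(*synth), *real)


def gan_test_score(synth, real):
    """GAN-test: fit on real data, evaluate on synthetic data."""
    return accuracy(fit_centroids(*real), *synth)


def toy_dataset(n_per_class, noise, seed):
    """Hypothetical stand-in for CSI features: 8 well-separated classes, 4-D vectors."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label in range(8):  # eight pose classes, as in the paper
        for _ in range(n_per_class):
            xs.append([10.0 * label + rng.uniform(-noise, noise) for _ in range(4)])
            ys.append(label)
    return xs, ys


real = toy_dataset(n_per_class=20, noise=1.0, seed=0)    # stands in for real CSI
synth = toy_dataset(n_per_class=50, noise=1.5, seed=1)   # stands in for GAN output
print("GAN-train:", gan_train_score(synth, real))
print("GAN-test:", gan_test_score(synth, real))
```

A high GAN-train score suggests the synthetic samples are diverse and realistic enough to teach a classifier about real data, while a high GAN-test score suggests the synthetic samples look like real data to a classifier trained on it. On the toy data both scores are trivially perfect because the classes do not overlap.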
