Diffusion Model-based Contrastive Learning for Human Activity Recognition
Chunjing Xiao, Yanhui Han, Wei Yang, Yane Hou, Fangzhan Shi, Kevin Chetty
TL;DR
This work tackles the generalization gap in WiFi CSI-based activity recognition caused by subject variability in motion habits. It introduces CLAR, a diffusion-model-based contrastive learning framework that combines a DDPM-based time-series augmentation module with an adaptive weighting strategy for positive sample pairs. The augmentation decomposes reference signals into high- and low-frequency components and injects them with step-dependent weights during diffusion to synthesize plausible new motion patterns, while adaptive weighting uses Dynamic Time Warping–based activity content estimates to emphasize informative positive pairs. Experiments on SignFi and DeepSeg with limited labeled data show that CLAR consistently outperforms state-of-the-art baselines, validating its effectiveness and potential for practical wireless sensing applications.
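As a rough illustration of how a Dynamic Time Warping distance could feed a per-pair weight, the sketch below computes a textbook DTW distance between two 1-D sequences and maps the distances of a batch of positive pairs to weights. The function names, the softmax mapping, the `temperature` parameter, and the choice to give larger weights to pairs with larger DTW distance are all assumptions made for illustration; the summary above does not specify CLAR's actual weighting rule.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook DTW between two 1-D sequences using absolute-difference cost."""
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return float(cost[n, m])


def positive_pair_weights(anchors, positives, temperature=1.0):
    """Hypothetical mapping from DTW distances to pair weights.

    Assumption: a softmax over distances, rescaled so the weights
    average to 1. This is not taken from the paper.
    """
    dists = np.array([dtw_distance(a, p) for a, p in zip(anchors, positives)])
    logits = dists / temperature
    logits = logits - logits.max()  # subtract max for numerical stability
    w = np.exp(logits)
    return w * len(w) / w.sum()
```

In a contrastive objective, such weights would multiply the per-pair loss terms before averaging, so that some positive pairs contribute more to the gradient than others.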
Abstract
WiFi Channel State Information (CSI)-based activity recognition has attracted numerous studies owing to the ubiquity of WiFi and its privacy-preserving nature. In practice, however, general CSI-based recognition models often generalize poorly: individuals with different behavior habits cause different fluctuations in CSI data, and it is difficult to gather enough training data to cover all kinds of motion habits. To tackle this problem, we design a diffusion model-based Contrastive Learning framework for human Activity Recognition (CLAR) using WiFi CSI. Building on the contrastive learning framework, we introduce two components to enhance CSI-based activity recognition. First, to generate diverse augmented data and complement limited training data, we propose a diffusion model-based, time series-specific augmentation model. Typical diffusion models apply conditions directly to the generative process, which can distort CSI data. In contrast, our tailored model decomposes these conditions into high-frequency and low-frequency components and then applies them to the generative process with varying weights. This alleviates data distortion and yields high-quality augmented data. Second, to efficiently capture differences in sample importance, we present an adaptive weight algorithm. Unlike typical contrastive learning methods, which treat all training samples equally, this algorithm adaptively adjusts the weights of positive sample pairs to learn better data representations. The experiments suggest that CLAR achieves significant gains compared to state-of-the-art methods.
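To make the idea of frequency-split conditioning more concrete, the following is a minimal sketch, not the authors' implementation. It splits a reference signal into low- and high-frequency parts with an FFT mask and blends them into an intermediate sample with weights that depend on the diffusion step. The names `split_frequency`, `step_weights`, and `inject_condition`, the `cutoff_bins` and `strength` parameters, the linear weight schedule, and the additive injection are all assumptions for illustration; the abstract states only that the two components are applied with varying weights.

```python
import numpy as np


def split_frequency(signal, cutoff_bins):
    """Split a 1-D signal into low- and high-frequency parts via an FFT mask."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    low_spec = spectrum.copy()
    low_spec[cutoff_bins:] = 0            # keep only the lowest bins
    low = np.fft.irfft(low_spec, n=len(signal))
    high = signal - low                   # residual holds the higher frequencies
    return low, high


def step_weights(t, num_steps):
    """Hypothetical linear schedule (an assumption, not from the paper).

    At noisy steps (t near num_steps) the low-frequency part dominates;
    near t = 0 the high-frequency part dominates.
    """
    w_low = t / num_steps
    return w_low, 1.0 - w_low


def inject_condition(x_t, reference, t, num_steps, cutoff_bins=8, strength=0.1):
    """Add the weighted reference components to the intermediate sample x_t."""
    low, high = split_frequency(reference, cutoff_bins)
    w_low, w_high = step_weights(t, num_steps)
    return np.asarray(x_t, dtype=float) + strength * (w_low * low + w_high * high)
```

In a full pipeline, `inject_condition` would be called once per reverse-diffusion step on the output of the denoising network, which is omitted here.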
