Toward Foundation Model for Multivariate Wearable Sensing of Physiological Signals
Yunfei Luo, Yuliang Chen, Asif Salekin, Tauhidur Rahman
TL;DR
The paper presents NormWear, a foundation model for multivariate wearable physiological signals that addresses cross-sensor heterogeneity through a channel-aware attention mechanism and a [CLS]-based liaison token. It introduces CWT-based multi-scale tokenization, a per-channel encoder with shared weights, and a memory-stream-inspired fusion mechanism with sensor-semantics alignment that links signals to text descriptions. Pretrained on ~2.5 million segments from diverse wearable datasets, NormWear achieves state-of-the-art generalization across 18 health-related tasks in zero-shot, partial-shot, and full-shot settings, demonstrating robust cross-modal and cross-domain transfer. By providing a modality- and channel-agnostic representation framework and a scalable pretraining strategy, with open-source code to support reproducibility, the approach targets a wide range of wearable health applications.
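To make the tokenization idea concrete, the following is a minimal sketch of turning one sensor channel into multi-scale tokens with a continuous wavelet transform (CWT). It is an illustration under stated assumptions, not the NormWear implementation: the Morlet wavelet, the scale set, the non-overlapping time patches, and the function names `morlet_cwt` and `patchify` are all hypothetical choices made here.

```python
import numpy as np

def morlet_cwt(signal, scales, w0=6.0):
    """Magnitude scalogram of a 1-D signal, shape (len(scales), len(signal)).

    Assumption: a complex Morlet wavelet; the source only says "CWT-based".
    """
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Truncate the wavelet at about 4 scale widths, never longer than the signal.
        half = int(min(4 * s, (n - 1) // 2))
        t = np.arange(-half, half + 1) / s
        wavelet = np.exp(1j * w0 * t) * np.exp(-0.5 * t**2) / np.sqrt(s)
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

def patchify(scalogram, patch_len):
    """Cut the scalogram into non-overlapping time patches, one token per patch.

    Returns an array of shape (n_patches, n_scales * patch_len).
    """
    n_scales, n = scalogram.shape
    n_patches = n // patch_len
    trimmed = scalogram[:, : n_patches * patch_len]
    patches = trimmed.reshape(n_scales, n_patches, patch_len).transpose(1, 0, 2)
    return patches.reshape(n_patches, -1)

# Hypothetical single-channel segment: a sinusoid of 256 samples.
segment = np.sin(0.75 * np.arange(256))
tokens = patchify(morlet_cwt(segment, scales=[2, 4, 8, 16]), patch_len=32)
print(tokens.shape)  # (8, 128)
```

Because each channel is tokenized independently, the same routine applies to any sensor type, which is what allows a single encoder to be shared across channels.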
Abstract
Time-series foundation models excel at tasks such as forecasting across diverse data types by leveraging informative waveform representations. Wearable sensing data, however, pose unique challenges because their patterns and frequency bands vary widely, especially for healthcare-related outcomes. The main obstacle lies in crafting generalizable representations that adapt efficiently across heterogeneous sensing configurations and applications. To address this, we propose NormWear, the first multi-modal and ubiquitous foundation model designed to extract generalized and informative representations from wearable sensing data. Specifically, we design a channel-aware attention mechanism with a shared special liaison [CLS] token to detect signal patterns both within each sensor (intra-sensor) and across sensors (inter-sensor). This helps the model extract more meaningful information from both the time series themselves and the relationships between input sensors, and it makes the model compatible with a wide range of sensor configurations. NormWear is pretrained on a diverse set of physiological signals, including PPG, ECG, EEG, GSR, and IMU, from various public datasets. Our model shows exceptional generalizability across 11 public wearable sensing datasets, spanning 18 applications in mental health, body state inference, vital sign estimation, and disease risk evaluation. It consistently outperforms competitive baselines under zero-shot, partial-shot, and full-shot settings, indicating broad applicability in real-world health applications.
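The role of the liaison [CLS] token can be illustrated with a small sketch. It assumes each channel has already been encoded into a token sequence whose first token is its [CLS] token; information is exchanged across channels only through those [CLS] tokens. This is a simplified, parameter-free stand-in (no learned query, key, or value projections), and the function name `fuse_cls` is hypothetical rather than taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_cls(tokens):
    """Mix the per-channel [CLS] tokens with scaled dot-product self-attention.

    tokens: array of shape (channels, 1 + patches, dim); index 0 on axis 1
    is the [CLS] token of that channel. Patch tokens are left untouched.
    Assumption: no learned projections, unlike a trained attention layer.
    """
    cls = tokens[:, 0, :]                            # (channels, dim)
    scores = cls @ cls.T / np.sqrt(cls.shape[-1])    # (channels, channels)
    mixed = softmax(scores, axis=-1) @ cls           # (channels, dim)
    fused = tokens.copy()
    fused[:, 0, :] = mixed
    return fused

# The same function handles any number of input channels.
rng = np.random.default_rng(0)
for n_channels in (2, 5):
    x = rng.normal(size=(n_channels, 1 + 4, 8))
    print(fuse_cls(x).shape)
```

Since the attention matrix has one row and one column per channel, nothing in the fusion step depends on a fixed sensor count, which is one way to read the abstract's claim of compatibility with varied sensor settings.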
