RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
Maxwell A. Xu, Jaya Narain, Gregory Darnell, Haraldur Hallgrimsson, Hyewon Jeong, Darren Forde, Richard Fineman, Karthik J. Raghuram, James M. Rehg, Shirley Ren
TL;DR
RelCon addresses the lack of generalizable foundation models for health time-series by introducing a motif-based, learnable distance measure and a relative contrastive loss tailored to accelerometry. It pretrains on $1\times10^9$ segments from $87{,}376$ AHMS participants, using a $1$D ResNet-34 backbone that produces a $256$-dimensional embedding, and achieves state-of-the-art results on gait-related and human activity recognition (HAR) tasks while generalizing across tasks. Key contributions include (i) a learnable, accelerometry-specific distance, (ii) a relative, hierarchical loss that preserves nuanced similarities, and (iii) extensive ablations showing that augmentations, RevIN, and within-subject dynamics are necessary for robust performance. The findings suggest that motion foundation models trained on real-world wearable data can generalize across diverse downstream analyses, with potential applicability to other biosignals and multi-location sensor settings.
Abstract
We present RelCon, a novel self-supervised Relative Contrastive learning approach for training a motion foundation model from wearable accelerometry sensors. First, a learnable distance measure is trained to capture motif similarity and domain-specific semantic information such as rotation invariance. The learned distance then provides a measure of semantic similarity between a pair of accelerometry time-series, which we use to train our foundation model to capture relative relationships across time and across subjects. The foundation model is trained on 1 billion segments from 87,376 participants and achieves state-of-the-art performance across multiple downstream tasks, including human activity recognition and gait metric regression. To our knowledge, we are the first to show the generalizability of a foundation model with motion data from wearables across distinct evaluation tasks.
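To make the idea of "relative relationships" concrete, the following is a minimal NumPy sketch of one plausible form of a relative contrastive loss. It is an illustration under stated assumptions, not the authors' implementation: the function name `relative_contrastive_loss`, the cosine-similarity scoring, the temperature `tau`, and the rule that every candidate farther from the anchor than the current positive acts as a negative are all assumptions introduced here. The `distances` argument stands in for the output of the learned distance measure, which is not modeled.

```python
import numpy as np

def relative_contrastive_loss(anchor, candidates, distances, tau=0.1):
    """Illustrative relative contrastive loss (assumed form, hypothetical name).

    anchor:     (d,) embedding of the anchor segment
    candidates: (n, d) embeddings of candidate segments
    distances:  (n,) distance from the anchor to each candidate; in this
                sketch they are plain numbers standing in for a learned distance
    """
    anchor = np.asarray(anchor, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    distances = np.asarray(distances, dtype=float)

    # Cosine similarity between the anchor and every candidate, scaled by tau.
    a = anchor / np.linalg.norm(anchor)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    logits = (c @ a) / tau  # shape (n,)

    losses = []
    for i in range(len(distances)):
        # Candidates strictly farther than candidate i serve as its negatives.
        neg = distances > distances[i]
        if not neg.any():
            continue  # the farthest candidate has no negatives
        # Cross-entropy over {positive i} plus its negatives,
        # computed with a max-shift for numerical stability.
        group = np.concatenate(([logits[i]], logits[neg]))
        m = group.max()
        log_denom = m + np.log(np.exp(group - m).sum())
        losses.append(log_denom - logits[i])

    # With no valid (positive, negatives) pair there is nothing to penalize.
    return float(np.mean(losses)) if losses else 0.0

# Toy usage: embeddings whose similarity order agrees with the distance
# order should incur a lower loss than when the order is reversed.
anchor = np.array([1.0, 0.0])
cands = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
agree = relative_contrastive_loss(anchor, cands, [0.1, 0.5, 0.9])
reverse = relative_contrastive_loss(anchor, cands, [0.9, 0.5, 0.1])
print(agree < reverse)
```

Because each candidate takes a turn as the positive, the sketch rewards embeddings that preserve the full ranking implied by the distances, rather than a single binary positive/negative split.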
