Consistency Based Weakly Self-Supervised Learning for Human Activity Recognition with Wearables

Taoran Sheng, Manfred Huber

TL;DR

This work tackles wearable HAR under limited labeling by introducing a weakly self-supervised learning framework that combines a ResNet-based autoencoder with Siamese networks. It enforces temporal and feature consistencies to shape a meaningful embedding space and uses a two-stage joint loss, first self-supervised and then lightly supervised via pairwise constraints, to refine clusters with few labels. Experimental results on PAMAP2, REALDISP, and SBHAR show substantial improvements over unsupervised baselines and competitive performance with only 10% of labels, drastically reducing labeling effort for HAR in ubiquitous sensing scenarios. The approach yields clustering-ready representations suitable for downstream classification, enabling scalable HAR with minimal annotation burden.

Abstract

While the widely available embedded sensors in smartphones and other wearable devices make it easier to obtain data on human activities, recognizing different types of human activities from sensor-based data remains a difficult research topic in ubiquitous computing. One reason is that most of the collected data is unlabeled, whereas many current human activity recognition (HAR) systems are based on supervised methods, which rely heavily on labels. We describe a weakly self-supervised approach that consists of two stages: (1) in stage one, the model learns from the nature of human activities by projecting the data into an embedding space where similar activities are grouped together; (2) in stage two, the model is fine-tuned in a few-shot learning fashion using similarity information about the data. This allows downstream classification or clustering tasks to benefit from the embeddings. Experiments on three benchmark datasets demonstrate the framework's effectiveness and show that our approach can help a clustering algorithm identify and categorize the underlying human activities with performance comparable to purely supervised techniques applied directly to a corresponding fully labeled data set.
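
To make the stage-two idea concrete, the following is a minimal sketch of how pairwise similarity information can shape an embedding space. It is not the paper's loss, which this section does not spell out. It assumes a standard margin-based contrastive loss, and the function name `pairwise_constraint_loss`, the `margin` parameter, and the 1/0 encoding of similar/dissimilar pairs are all hypothetical choices made for illustration.

```python
import numpy as np


def pairwise_constraint_loss(z_a, z_b, same, margin=1.0):
    """Margin-based contrastive loss over pairs of embeddings (illustrative only).

    z_a, z_b : arrays of shape (n_pairs, dim), the two embeddings of each pair.
    same     : array of shape (n_pairs,), 1 if the pair shares an activity,
               0 otherwise (hypothetical encoding of the similarity information).
    margin   : distance beyond which dissimilar pairs incur no penalty.
    """
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    same = np.asarray(same, dtype=float)

    # Euclidean distance between the two embeddings of each pair.
    dist = np.linalg.norm(z_a - z_b, axis=1)

    # Similar pairs are pulled together: penalty grows with squared distance.
    pull = same * dist ** 2
    # Dissimilar pairs are pushed apart until they are at least `margin` away.
    push = (1.0 - same) * np.maximum(0.0, margin - dist) ** 2

    return float(np.mean(pull + push))
```

Under this assumed formulation, only a small set of labeled pairs is needed: each pair contributes a pull or push term, and the embeddings of unlabeled windows are affected only indirectly through the shared encoder.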

Paper Structure (18 sections, 11 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: ResNet and Siamese Architecture.
  • Figure 2: The overall architecture of the proposed approach.
  • Figure 3: Trained Encoder for Classification/Clustering.
  • Figure 4: Representation space visualizations on the PAMAP2 dataset.
  • Figure 5: Representation space visualizations on the REALDISP dataset.
  • ...and 1 more figure