PIM: Physics-Informed Multi-task Pre-training for Improving Inertial Sensor-Based Human Activity Recognition

Dominique Nshimyimana, Vitor Fortes Rey, Sungho Suh, Bo Zhou, Paul Lukowicz

TL;DR

This work tackles the data-label bottleneck in inertial-sensor human activity recognition by introducing Physics-Informed Multi-task Pre-training (PIM), a self-supervised framework that integrates fundamental physical constraints of human motion into pretext tasks. By deriving three physics-based pseudo-labels—Speed of Motion, Angular features, and Symmetry (SAM-tasks)—and training a shared encoder with dedicated heads, PIM learns representations that generalize well with limited labeled data. Comprehensive experiments across four HAR benchmarks show that PIM consistently outperforms state-of-the-art SSL baselines in few-shot settings, with substantial gains in macro-F1 and accuracy when only a few examples per class are available. The method highlights the importance of embedding physical principles into SSL for wearables and points to future work on single-device deployment, cross-dataset transfer, and imbalanced-data handling.

Abstract

Human activity recognition (HAR) with deep learning models relies on large amounts of labeled data, which are often challenging to obtain due to the associated cost, time, and labor. Self-supervised learning (SSL) has emerged as an effective approach to leverage unlabeled data through pretext tasks, such as masked reconstruction and multi-task learning with signal processing-based data augmentations, to pre-train encoder models. However, such methods are often derived from computer vision approaches that disregard the physical mechanisms and constraints that govern wearable sensor data and the phenomena they reflect. In this paper, we propose a physics-informed multi-task pre-training (PIM) framework for IMU-based HAR. PIM generates pretext tasks based on an understanding of basic physical aspects of human motion, including movement speed, angles of movement, and symmetry between sensor placements. Given a sensor signal, we calculate the corresponding features using physics-based equations and use them as pretext tasks for SSL. This enables the model to capture fundamental physical characteristics of human activities, which is especially relevant for multi-sensor systems. Experimental evaluations on four HAR benchmark datasets demonstrate that the proposed method outperforms existing state-of-the-art methods, including data augmentation and masked reconstruction, in terms of accuracy and F1 score. We observed gains of almost 10% in macro-F1 score and accuracy with only 2 to 8 labeled examples per class, and up to 3% when there is no reduction in the amount of training data.
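To make the idea of physics-derived pretext targets concrete, the following is a minimal sketch of how three such quantities might be computed from one window of accelerometer samples. The abstract does not give the equations, so everything here is an assumption for illustration: the function names, the gravity-compensated speed proxy, the tilt-angle formulas, and the correlation-based symmetry score are standard textbook choices and not necessarily the ones used in the paper.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _magnitudes(acc):
    # Euclidean norm of each (ax, ay, az) sample in m/s^2.
    return [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in acc]


def speed_proxy(acc, dt, gravity=9.81):
    """Assumed speed proxy: integrate the absolute gravity-compensated
    acceleration magnitude over the window (result in m/s)."""
    return sum(abs(m - gravity) * dt for m in _magnitudes(acc))


def roll_pitch(acc):
    """Tilt angles in radians, treating the window's mean acceleration
    as the gravity direction (an assumption; yaw is not observable this way)."""
    ax = _mean([s[0] for s in acc])
    ay = _mean([s[1] for s in acc])
    az = _mean([s[2] for s in acc])
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    return roll, pitch


def symmetry(acc_left, acc_right):
    """Assumed symmetry score: Pearson correlation between the magnitude
    signals of two sensor placements (for example, left and right leg)."""
    left, right = _magnitudes(acc_left), _magnitudes(acc_right)
    ml, mr = _mean(left), _mean(right)
    cov = sum((l - ml) * (r - mr) for l, r in zip(left, right))
    var_l = sum((l - ml) ** 2 for l in left)
    var_r = sum((r - mr) ** 2 for r in right)
    if var_l == 0.0 or var_r == 0.0:
        return 0.0  # correlation is undefined for a constant signal
    return cov / math.sqrt(var_l * var_r)


# Hypothetical windows: a sensor at rest and a sensor in motion.
still = [(0.0, 0.0, 9.81)] * 50
moving = [(0.0, 0.0, 9.81), (1.0, 0.0, 9.81), (0.0, 2.0, 9.81), (3.0, 0.0, 9.81)]
print(speed_proxy(still, dt=0.04))
print(roll_pitch(still))
print(symmetry(moving, moving))
```

None of these quantities needs an activity label, which is what makes them usable as self-supervised targets. The sampling interval `dt=0.04` is a placeholder and should match the dataset's actual sampling rate.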

Paper Structure

This paper contains 19 sections, 1 equation, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Our Physics-Informed Pre-training framework for learning better representations for HAR. We have developed relevant physical quantities related to human motions that can be computed from unlabeled sensor data. Our pre-training consists of learning to predict discretized versions of those quantities. This encoder can then be fine-tuned together with a classification head using a small amount of labeled data to train a HAR classifier.
  • Figure 2: Histogram of the speed-of-motion pseudo-labels computed on the DSADS dataset. The histograms are colored per class to show how the distribution of speed features differs between classes. Some classes (such as class 11) have high speeds, while many others have speed values in other ranges.
  • Figure 3: Distribution of angle physical quantities in the DSADS dataset for four devices. Rows represent roll, pitch, and yaw, while columns represent the different sensor positions. Histograms are colored by class to show the pseudo-label distribution.
  • Figure 4: Histogram of pseudo-labels related to synchronization in the DSADS dataset for the legs (on the left) and arms (on the right). The histograms were colored per class to show how different classes fall into different bins for our pseudo-labels.
  • Figure 5: t-SNE of the physics-driven quantities before clustering, with points colored by activity class.
  • ...and 1 more figures
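
The Figure 1 caption states that pre-training predicts discretized versions of the physical quantities. As a rough illustration of that step, the sketch below turns continuous values into integer pseudo-label classes using equal-width bins. This binning scheme is an assumption swapped in for simplicity: the Figure 5 caption mentions clustering, so the paper's actual discretization may well differ. The function names and the number of bins are hypothetical.

```python
import bisect


def equal_width_edges(values, n_bins):
    """Interior edges that split [min(values), max(values)] into n_bins
    equal-width bins. Returns an empty list if all values are identical."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return []
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def pseudo_label(value, edges):
    """Map a continuous value to a bin index in 0..len(edges).
    Values outside the fitted range fall into the first or last bin."""
    return bisect.bisect_right(edges, value)


# Hypothetical speed values from unlabeled windows.
speeds = [0.0, 1.0, 2.0, 3.0, 4.0]
edges = equal_width_edges(speeds, n_bins=4)
labels = [pseudo_label(s, edges) for s in speeds]
print(edges)   # [1.0, 2.0, 3.0]
print(labels)  # [0, 1, 2, 3, 3]
```

Each pretext head would then be trained as a classifier over these bin indices, with the edges fitted once on the unlabeled pre-training data and reused for every window.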