Efficient Online Continual Learning in Sensor-Based Human Activity Recognition

Yao Zhang, Souza Leite Clayton, Yu Xiao

TL;DR

PTRN-HAR tackles the challenge of resource- and data-efficient online continual learning for sensor-based HAR. It freezes a contrastively pre-trained feature extractor and trains a relation module on replay embeddings, enabling continual learning with limited labeled data. The method demonstrates strong accuracy and Macro-F1 gains over state-of-the-art baselines across three datasets, while substantially reducing training cost, memory usage, and edge-device requirements. This combination of data efficiency and low resource demand makes PTRN-HAR particularly suitable for real-world HAR deployments.

Abstract

Machine learning models for sensor-based human activity recognition (HAR) are expected to adapt post-deployment to recognize new activities and different ways of performing existing ones. To address this need, Online Continual Learning (OCL) mechanisms have been proposed, allowing models to update their knowledge incrementally as new data become available while preserving previously acquired information. However, existing OCL approaches for sensor-based HAR are computationally intensive and require extensive labeled samples to represent new changes. Recently, pre-trained model-based (PTM-based) OCL approaches have shown significant improvements in performance and efficiency for computer vision applications. These methods achieve strong generalization capabilities by pre-training complex models on large datasets, followed by fine-tuning on downstream tasks for continual learning. However, applying PTM-based OCL approaches to sensor-based HAR poses significant challenges due to the inherent heterogeneity of HAR datasets and the scarcity of labeled data in post-deployment scenarios. This paper introduces PTRN-HAR, the first successful application of PTM-based OCL to sensor-based HAR. Unlike prior PTM-based OCL approaches, PTRN-HAR pre-trains the feature extractor using contrastive loss with a limited amount of data. This extractor is then frozen during the streaming stage. Furthermore, PTRN-HAR replaces the conventional dense classification layer with a relation module network. Our design significantly reduces the resource consumption required for model training while maintaining high performance, and it improves data efficiency by reducing the amount of labeled data needed for effective continual learning. Experiments on three public datasets show that PTRN-HAR outperforms state-of-the-art baselines. The code can be found here: https://anonymous.4open.science/r/PTRN-HAR-AF60/
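
To make the pre-training step more concrete, the following is a minimal sketch of a contrastive loss over a batch of labeled embeddings. The abstract only states that a contrastive loss is used; the supervised-contrastive form, the temperature value, and the function names here are assumptions chosen for illustration, not the authors' implementation.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Hypothetical supervised contrastive loss (assumed form, not from the paper).

    For each anchor, same-label samples are pulled together and all other
    samples are pushed apart, using cosine similarity scaled by `temperature`.
    Anchors without any same-label partner in the batch are skipped.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Scaled cosine similarity between the anchor and every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        peak = max(sims.values())
        log_denominator = peak + math.log(
            sum(math.exp(s - peak) for s in sims.values())
        )
        # Average negative log-probability of the positives for this anchor.
        total += -sum(sims[j] - log_denominator for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

Under these assumptions, a batch whose same-class embeddings already lie close together yields a loss near zero, while a batch whose labels cut across the clusters yields a much larger loss, which is the signal that would drive the feature extractor toward class-separable embeddings.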

Paper Structure

This paper contains 25 sections, 12 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Application scenarios of OCL. Each scenario consists of two stages: pre-deployment and streaming. (a) Traditional OCL for sensor-based HAR typically trains the model on base-class data during the pre-deployment stage and continuously updates it with streaming data to achieve continual learning. (b) Previous rehearsal-based OCL methods for HAR, such as iCaRL (Rebuffi et al., 2017) and OCL-HAR (Schiemer et al., 2023), select representative samples from streaming data to retrain the entire model. (c) PTM-based OCL methods in computer vision train the feature extractor on large and diverse datasets and fine-tune the dense layer on the streaming data. (d) PTRN-HAR extracts more general features by using a contrastive loss. The feature extractor is frozen during the streaming stage, while the relation module is trained on the replay data, yielding higher performance and data efficiency.
  • Figure 2: Overall pipeline of PTRN-HAR. (a) In the pre-deployment stage, base-class data are used to train the feature extractor (FE) network. The embeddings output by the penultimate layer of the FE network serve as the features of the data. (b) In the streaming stage, the FE network is frozen, and the relation module (RM) network is used to reclassify the embeddings of streaming data. PTRN-HAR stores $N$ (default = 20) embeddings for each class as replay data, which are continuously updated based on the incoming labelled streaming data. When a new class emerges or there is a considerable change in the replay data (i.e., domain change), the RM network is retrained using the replay data to enable continual learning.
  • Figure 3: The PCA visualization for embeddings after each stage. Points in different colors represent data from different classes. (B1, B2, B3, B4, B5) denote the base classes, while (N1, N2, N3) represent the new classes. (a) In the pre-deployment stage, the embeddings extracted by the FE network are linearly separable across classes. (b) In the streaming stage, due to the emergence of new classes and the frozen FE network, the embeddings of new and old classes begin to overlap. (c) After reprocessing through the RM network, the embeddings of all classes become linearly separable again.
  • Figure 4: Dataset segmentation based on different OCL scenarios.
  • Figure 5: Comparison of Macro-F1 scores for different numbers of new classes in different OCL scenarios.
  • ...and 2 more figures
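
The replay mechanism described in the Figure 2 caption can be sketched as a small per-class buffer of embeddings. This is only an illustration under stated assumptions: the caption gives the default of 20 embeddings per class and the two retraining triggers (a new class, or a considerable change in the replay data), but the first-in-first-out update policy, the class-mean drift measure, the threshold value, and all names below are hypothetical.

```python
import math
from collections import deque


class EmbeddingReplayBuffer:
    """Hypothetical per-class replay buffer of frozen-extractor embeddings."""

    def __init__(self, per_class=20, drift_threshold=1.0):
        self.per_class = per_class              # N embeddings kept per class
        self.drift_threshold = drift_threshold  # assumed domain-change threshold
        self.store = {}                         # label -> deque of embeddings
        self.reference_mean = {}                # label -> class mean at last retrain

    def _mean(self, label):
        """Return the element-wise mean of the embeddings stored for a class."""
        items = self.store[label]
        dim = len(items[0])
        return [sum(e[d] for e in items) / len(items) for d in range(dim)]

    def add(self, embedding, label):
        """Store a labeled embedding; return True if a retrain looks warranted."""
        if label not in self.store:
            # A deque with maxlen evicts the oldest entry first (assumed policy).
            self.store[label] = deque(maxlen=self.per_class)
        self.store[label].append(list(embedding))
        reference = self.reference_mean.get(label)
        if reference is None:
            # New class, or a class added since the last retrain.
            return True
        # Assumed domain-change test: distance the class mean has moved.
        return math.dist(self._mean(label), reference) > self.drift_threshold

    def mark_retrained(self):
        """Record current class means after the relation module is retrained."""
        self.reference_mean = {label: self._mean(label) for label in self.store}
```

In this sketch the caller would retrain the relation module on `store` whenever `add` returns True and then call `mark_retrained`. Because only embeddings are kept rather than raw sensor windows, the memory cost stays at most `per_class` vectors per class.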