Machine Learning Techniques for Sensor-based Human Activity Recognition with Data Heterogeneity -- A Review

Xiaozhou Ye, Kouichi Sakurai, Nirmal Nair, Kevin I-Kai Wang

TL;DR

This survey addresses the challenge of data heterogeneity in sensor-based HAR by categorizing heterogeneity types (data modality, streaming, subject, spatial) and mapping them to state-of-the-art ML paradigms (predominantly transfer learning, along with multi-view, continual, zero-/few-shot, federated, and domain-generalization methods). It synthesizes techniques across modalities (IMU, ambient, device-free), fusion strategies, and cross-modality transfers, and reviews methods for both streaming and static datasets. The paper summarizes public and task-specific datasets used to study heterogeneity and highlights open problems, such as cross-modality knowledge transfer and adaptive learning for unseen activities, while calling for richer, cross-domain datasets. Overall, the review provides a structured roadmap for building robust, personalized HAR systems that remain accurate across diverse devices, environments, and user populations, with implications for ubiquitous computing and health monitoring.

Abstract

Sensor-based Human Activity Recognition (HAR) is crucial in ubiquitous computing, analysing behaviours through multi-dimensional observations. Despite research progress, HAR still faces challenges, particularly in its assumptions about data distributions. Most studies assume uniform data distributions across datasets, which contrasts with the varied nature of real-world sensor data for human activities. Addressing data heterogeneity can improve performance, reduce computational costs, and aid in developing personalized, adaptive models with less annotated data. This review investigates how machine learning addresses data heterogeneity in HAR by categorizing heterogeneity types, matching them to suitable machine learning methods, summarizing available datasets, and discussing future challenges.

Paper Structure (47 sections, 2 figures, 15 tables)