rTsfNet: a DNN model with Multi-head 3D Rotation and Time Series Feature Extraction for IMU-based Human Activity Recognition

Yu Enokibori

TL;DR

This paper targets a persistent challenge in IMU-based human activity recognition (HAR): leveraging handcrafted time series features (TSFs) alongside deep learning. It introduces rTsfNet, a DNN that automatically derives multiple 3D rotation bases, extracts TSFs on each of them, and fuses the results via a TSF Mixer and an MLP classifier. On benchmarks including UCI HAR, PAMAP2, Daphnet, and OPPORTUNITY, rTsfNet achieves state-of-the-art accuracy, and ablations show that Multi-head 3D Rotation provides meaningful gains. The authors also establish an IMU-based HAR Benchmark to enable fair, open, and reproducible comparisons, and provide open-source code and models to support ongoing research. Overall, the approach generalizes well across sensors, activities, and conditions, underlining the value of combining rotation-based feature extraction with TSF-inspired representations in HAR.

Abstract

Although many deep learning (DL) algorithms have been proposed for IMU-based human activity recognition (HAR), traditional machine learning that uses handcrafted time series features (TSFs) still often performs well, and combinations of DL and TSFs frequently show better accuracy than DL-only approaches. However, TSFs pose a problem in IMU-based HAR: the amount of derived features can vary greatly depending on the method used to select the 3D basis. Fortunately, DL's strengths include capturing the features of input data and adaptively deriving parameters. This paper therefore proposes rTsfNet, a new DNN model for IMU-based HAR with Multi-head 3D Rotation and Time Series Feature Extraction. rTsfNet automatically selects the 3D bases from which features should be derived by extracting 3D rotation parameters within the DNN. It then derives TSFs, which build on the accumulated wisdom of many researchers, and performs HAR with an MLP. Although rTsfNet does not use a CNN, it achieved higher accuracy than existing models under well-managed benchmark conditions on multiple datasets (UCI HAR, PAMAP2, Daphnet, and OPPORTUNITY), all of which target different activities.
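
The basis dependence described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rotation angles are fixed by hand (rTsfNet derives its rotation parameters inside the network), the rotation is restricted to the z axis, and only two simple features (per-axis mean and variance) stand in for the much larger TSF set. All function names here (`rotation_z`, `rotate`, `axis_features`, `norm_mean`) are hypothetical helpers introduced for illustration.

```python
import math
from statistics import fmean, pvariance


def rotation_z(theta):
    """3x3 rotation matrix about the z axis (angle in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate(window, rot):
    """Apply a 3x3 rotation to every (x, y, z) sample in the window."""
    return [
        tuple(sum(rot[i][j] * sample[j] for j in range(3)) for i in range(3))
        for sample in window
    ]


def axis_features(window):
    """Per-axis mean and variance, ordered [mean_x, var_x, mean_y, var_y, mean_z, var_z]."""
    axes = list(zip(*window))
    return [f(axis) for axis in axes for f in (fmean, pvariance)]


def norm_mean(window):
    """Mean of the vector norm, a feature that does not depend on the basis."""
    return fmean(math.sqrt(x * x + y * y + z * z) for x, y, z in window)


# Synthetic 128-sample window: oscillation on x, nothing on y, constant z.
window = [(math.sin(0.1 * t), 0.0, 1.0) for t in range(128)]

# Assumption: three hand-picked "heads". In rTsfNet these would be learned.
head_angles = [0.0, math.pi / 4, math.pi / 2]

# Multi-head idea: extract the same features on each rotated copy, then concatenate.
features = []
for theta in head_angles:
    features.extend(axis_features(rotate(window, rotation_z(theta))))

print(len(features))              # 3 heads x 6 features each
print(features[1], features[3])   # head 0: variance on x vs. y
print(features[13], features[15]) # head 2: variance on x vs. y after a 90 degree turn
```

In this toy window the variance sits entirely on the x axis in the original basis and moves to the y axis after a 90 degree rotation, while the norm-based feature is unchanged by any rotation. Per-axis features therefore carry different information depending on the chosen basis, which is the motivation for letting the model pick several bases itself.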

Paper Structure (37 sections, 6 figures, 13 tables)

Figures (6)

  • Figure 1: rTsfNet
  • Figure 2: MLP Block
  • Figure 3: Tsf Mixer sub-Block
  • Figure 4: Tsf Mixer Block
  • Figure 5: Multi-head 3D Rotation Block
  • ...and 1 more figure