
Decoding Human Activities: Analyzing Wearable Accelerometer and Gyroscope Data for Activity Recognition

Utsab Saha, Sawradip Saha, Tahmid Kabir, Shaikh Anowarul Fattah, Mohammad Saquib

TL;DR

This work tackles wearable human activity recognition (HAR) by addressing the overlap between static and dynamic activities through FusionActNet, a two-pathway architecture with static and dynamic residual networks plus a guidance module for fusion. A two-stage training scheme pre-trains the static and dynamic experts, then optimizes the guidance module to produce the final prediction $y_{\text{pred}} = g_x y_{\text{s}} + (1-g_x) y_{\text{d}}$. On UCI-HAR and MotionSense, the method achieves $97.35\%$ and $95.35\%$ accuracy, respectively, outperforming prior approaches and remaining stable in overlapping-activity scenarios. The approach offers practical benefits for robotics and surveillance by enabling precise activity understanding and responsive interventions, with potential extensions to other data modalities.
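The fusion rule above can be illustrated with a minimal NumPy sketch. It assumes that $g_x$ is a scalar weight in $[0, 1]$ produced by the guidance module and that $y_{\text{s}}$ and $y_{\text{d}}$ are class-probability vectors from the static and dynamic networks; the summary does not state these details, and the function name `fuse_predictions` and the example numbers are purely illustrative.

```python
import numpy as np


def fuse_predictions(y_s, y_d, g_x):
    """Blend static and dynamic predictions: y_pred = g_x * y_s + (1 - g_x) * y_d.

    Assumption (not stated in the source): g_x lies in [0, 1]. It may be a
    scalar, or an array that broadcasts against y_s and y_d (e.g. shape
    (batch, 1) for batched predictions of shape (batch, classes)).
    """
    y_s = np.asarray(y_s, dtype=float)
    y_d = np.asarray(y_d, dtype=float)
    g_x = np.asarray(g_x, dtype=float)
    if np.any(g_x < 0.0) or np.any(g_x > 1.0):
        raise ValueError("g_x is assumed to lie in [0, 1]")
    return g_x * y_s + (1.0 - g_x) * y_d


# Made-up probabilities over six classes in the order WA, WU, WD, SI, ST, LA.
y_s = np.array([0.02, 0.02, 0.01, 0.60, 0.30, 0.05])  # static expert
y_d = np.array([0.30, 0.30, 0.30, 0.04, 0.03, 0.03])  # dynamic expert

# A guidance weight near 1 trusts the static expert.
y_pred = fuse_predictions(y_s, y_d, g_x=0.9)
predicted_class = int(np.argmax(y_pred))
```

Under these assumptions the result is a convex combination, so if both inputs sum to one, the fused vector does too. Setting `g_x` to 1 or 0 recovers the static or dynamic prediction unchanged.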

Abstract

A person's movement or relative positioning can be captured effectively by different types of sensors, and the corresponding sensor output can be processed with various techniques to classify human activities. This letter proposes a scheme for human activity recognition that introduces two approaches within a multi-structural architecture named FusionActNet. The first approach captures the static and dynamic behavior of an action by using two dedicated residual networks; the second facilitates the final decision by introducing a guidance module. Training proceeds in two stages. In the first stage, the residual networks are pre-trained separately on static data (where the human body is immobile) and dynamic data (involving movement of the human body). In the second stage, the guidance module, together with the pre-trained static and dynamic models, is trained on the given sensor data. Here the guidance module learns to emphasize the most relevant prediction vector obtained from the static or dynamic model, which helps classify different human activities effectively. The proposed scheme is evaluated on two benchmark datasets and compared with state-of-the-art methods. The results show that our method outperforms existing approaches in terms of accuracy, precision, recall, and F1 score, achieving 97.35% and 95.35% accuracy on the UCI HAR and MotionSense datasets, respectively, which highlights both the effectiveness and stability of the proposed scheme.
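The first training stage relies on separating static from dynamic data. The sketch below shows one way such a split could be prepared. It assumes that sitting, standing, and laying count as static and that the three walking activities count as dynamic; this grouping is inferred from the abstract's definitions, not stated by the authors. The helper name `split_static_dynamic` and the window shape are hypothetical.

```python
import numpy as np

# Assumed grouping (not confirmed by the source): immobile postures are
# "static", walking variants are "dynamic".
STATIC_LABELS = ("SI", "ST", "LA")
DYNAMIC_LABELS = ("WA", "WU", "WD")


def split_static_dynamic(windows, labels):
    """Partition sensor windows into static and dynamic subsets by label."""
    windows = np.asarray(windows)
    labels = np.asarray(labels)
    static_mask = np.isin(labels, STATIC_LABELS)
    dynamic_mask = np.isin(labels, DYNAMIC_LABELS)
    if not np.all(static_mask | dynamic_mask):
        raise ValueError("encountered an activity label outside the assumed groups")
    static = (windows[static_mask], labels[static_mask])
    dynamic = (windows[dynamic_mask], labels[dynamic_mask])
    return static, dynamic


# Toy data: 5 windows with an arbitrary shape of 128 time steps x 6 channels.
windows = np.zeros((5, 128, 6))
labels = np.array(["WA", "SI", "LA", "WD", "ST"])

(static_x, static_y), (dynamic_x, dynamic_y) = split_static_dynamic(windows, labels)
```

Each subset could then be used to pre-train its corresponding residual network before the guidance module is trained on the full data. The model and training loop are omitted here because the summary does not specify them.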

Paper Structure (10 sections, 3 equations, 2 figures, 3 tables)

Figures (2)

  • Figure 1: Simplified architecture of proposed FusionActNet. (a) Stage I (b) Stage II
  • Figure 2: Detailed architecture of FusionActNet. The input signals are represented by $\boldsymbol{X}_{\text{input}}$. The output of the model is one of six activities: walking (WA), walking upstairs (WU), walking downstairs (WD), sitting (SI), standing (ST), and laying (LA).