Table of Contents
Fetching ...

Feature Fusion for Human Activity Recognition using Parameter-Optimized Multi-Stage Graph Convolutional Network and Transformer Models

Mohammad Belal, Taimur Hassan, Abdelfatah Ahmed, Ahmad Aljarah, Nael Alsheikh, Irfan Hussain

TL;DR

The findings prove that PO MS-GCN outperforms state-of-the-art models in human activity recognition, and feature fusion exceeded the PO-MS-GCN’s results in PKU-MMD, LARa, and TUG datasets.

Abstract

Human activity recognition (HAR) is a crucial area of research that involves understanding human movements using computer and machine vision technology. Deep learning has emerged as a powerful tool for this task, with models such as Convolutional Neural Networks (CNNs) and Transformers being employed to capture various aspects of human motion. One of the key contributions of this work is the demonstration of the effectiveness of feature fusion in improving HAR accuracy by capturing spatial and temporal features, which has important implications for the development of more accurate and robust activity recognition systems. The study uses sensory data from HuGaDB, PKU-MMD, LARa, and TUG datasets. Two model, the PO-MS-GCN and a Transformer were trained and evaluated, with PO-MS-GCN outperforming state-of-the-art models. HuGaDB and TUG achieved high accuracies and f1-scores, while LARa and PKU-MMD had lower scores. Feature fusion improved results across datasets.

Feature Fusion for Human Activity Recognition using Parameter-Optimized Multi-Stage Graph Convolutional Network and Transformer Models

TL;DR

The findings prove that PO MS-GCN outperforms state-of-the-art models in human activity recognition, and feature fusion exceeded the PO-MS-GCN’s results in PKU-MMD, LARa, and TUG datasets.

Abstract

Human activity recognition (HAR) is a crucial area of research that involves understanding human movements using computer and machine vision technology. Deep learning has emerged as a powerful tool for this task, with models such as Convolutional Neural Networks (CNNs) and Transformers being employed to capture various aspects of human motion. One of the key contributions of this work is the demonstration of the effectiveness of feature fusion in improving HAR accuracy by capturing spatial and temporal features, which has important implications for the development of more accurate and robust activity recognition systems. The study uses sensory data from HuGaDB, PKU-MMD, LARa, and TUG datasets. Two model, the PO-MS-GCN and a Transformer were trained and evaluated, with PO-MS-GCN outperforming state-of-the-art models. HuGaDB and TUG achieved high accuracies and f1-scores, while LARa and PKU-MMD had lower scores. Feature fusion improved results across datasets.

Paper Structure

This paper contains 16 sections, 6 equations, 1 figure, 3 tables.

Figures (1)

  • Figure 1: The features were gathered from the (PO-MS-GCN) and the Transformer, then both features were combined through concatenation. The combined features were then passed to the classifier to be used as an input.