Hierarchical Self Attention Based Autoencoder for Open-Set Human Activity Recognition

M Tanjid Hasan Tonmoy, Saif Mahmud, A K M Mahbubur Rahman, M Ashraful Amin, Amin Ahsan Ali

TL;DR

Open-set recognition in wearable human activity recognition (HAR) is addressed by a hierarchical self-attention autoencoder that fuses spatial and temporal sensor information. The model comprises a Hierarchical Window Encoder and a Session Encoder with modular and aggregator self-attention, trained under an evidence lower bound (ELBO) based reconstruction objective to detect unseen activities. It outperforms state-of-the-art baselines on five public HAR datasets and demonstrates robustness to noise and subject variability, with attention maps offering interpretable insights into sensor and temporal importance. This approach provides a scalable, explainable framework for reliable open-set HAR in real-world wearable sensing.

Abstract

Wearable-sensor-based human activity recognition is a challenging problem because the spatial and temporal dependencies of sensor signals are difficult to model. Recognition models built under the closed-set assumption are forced to predict a member of the known activity classes. However, activity recognition models can encounter an unseen activity due to a body-worn sensor malfunction or a disability of the subject performing the activities. This problem can be addressed by modeling under the assumption of open-set recognition. The proposed self-attention-based approach combines data hierarchically from different sensor placements across time to classify closed-set activities, and it obtains notable performance improvements over state-of-the-art models on five publicly available datasets. The decoder in this autoencoder architecture uses self-attention-based feature representations from the encoder to detect unseen activity classes in the open-set recognition setting. Furthermore, attention maps generated by the hierarchical model demonstrate explainable selection of features in activity recognition. We conduct extensive leave-one-subject-out validation experiments, which indicate significantly improved robustness to noise and subject-specific variability in body-worn sensor signals. The source code is available at: github.com/saif-mahmud/hierarchical-attention-HAR
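
The open-set idea in the abstract, where a decoder reconstructs the input and unseen activities are flagged from the reconstruction, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names (`reconstruction_error`, `fit_threshold`, `predict_open_set`), the use of mean squared error, and the quantile-based rejection threshold are hypothetical choices, and the paper's actual ELBO-based criterion is not reproduced.

```python
import numpy as np


def reconstruction_error(x, x_hat):
    """Per-sample mean squared error between inputs and reconstructions.

    x, x_hat: arrays of shape (n_samples, ...), e.g. (n, time, channels).
    """
    diff = (x - x_hat).reshape(len(x), -1)
    return np.mean(diff ** 2, axis=1)


def fit_threshold(known_errors, quantile=0.95):
    """Assumed rule: reject anything reconstructed worse than the given
    quantile of errors measured on held-out data from known classes."""
    return float(np.quantile(known_errors, quantile))


def predict_open_set(class_probs, errors, threshold, unknown_label=-1):
    """Return the closed-set argmax label, or unknown_label when the
    reconstruction error exceeds the threshold."""
    labels = np.argmax(class_probs, axis=1)
    return np.where(errors > threshold, unknown_label, labels)


if __name__ == "__main__":
    x = np.zeros((3, 4, 2))                      # three toy sensor windows
    x_hat = x + np.array([0.0, 0.1, 2.0]).reshape(3, 1, 1)
    errors = reconstruction_error(x, x_hat)      # third window reconstructs poorly
    probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
    print(predict_open_set(probs, errors, threshold=0.5))
```

The design choice worth noting is that the classifier output is only trusted when the autoencoder can reconstruct the input well; a sample far from the training distribution tends to reconstruct poorly and is routed to the unknown label instead of being forced into a known class.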

Paper Structure

This paper contains 10 sections, 11 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Overview of (a) the Hierarchical Self Attention Encoder, which consists of hierarchical window encoders and a session encoder and is used to obtain representations for classification and open-set detection, and (b) the Hierarchical Window Encoder, in which sensor signals from different body locations within a short time span are transformed separately and later fused using self-attention
  • Figure 2: Overview of the autoencoder architecture, in which representations from the hierarchical self-attention encoder are used for closed-set classification and reconstructed by the decoder for open-set recognition
  • Figure 3: Attention map for the activity 'Cleanup' from the Opportunity dataset, comprising locomotion and mid-level gestures (top two rows, plotted using ground-truth annotations). The bottom $x$-axis shows temporal attention weights and the $y$-axis indicates the body locations of sensors [(L = Left, R = Right), (L = Lower, U = Upper) & A = Arm]; darker color indicates a higher attention score
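
The two-level fusion described in the Figure 1 caption, first across sensor placements within a short window and then across windows in a session, can be sketched with plain scaled dot-product self-attention. This is a simplified illustration under stated assumptions, not the paper's architecture: the queries, keys, and values are the inputs themselves (no learned projections), mean pooling stands in for the aggregator step, and the names `self_attention` and `encode_session` are hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x):
    """Scaled dot-product self-attention with identity projections.

    x: array of shape (..., n_items, d).
    Returns the attended features and the attention weights.
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ x, weights


def encode_session(session):
    """session: (n_windows, n_sensors, d) per-sensor features per window."""
    # Window level: attend across sensor placements inside each window,
    # then pool (assumed mean pooling) into one vector per window.
    fused, sensor_weights = self_attention(session)         # (W, S, d), (W, S, S)
    window_repr = fused.mean(axis=1)                         # (W, d)
    # Session level: attend across windows, then pool into one vector.
    context, temporal_weights = self_attention(window_repr)  # (W, d), (W, W)
    return context.mean(axis=0), sensor_weights, temporal_weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    session = rng.normal(size=(4, 3, 8))  # 4 windows, 3 sensor placements, 8 features
    representation, sensor_w, temporal_w = encode_session(session)
    print(representation.shape, sensor_w.shape, temporal_w.shape)
```

The returned `sensor_w` and `temporal_w` arrays are the kind of quantities that an attention map such as the one in Figure 3 visualizes: each row sums to one and indicates how strongly one sensor placement or window attends to the others.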