Explainable Deep Learning Framework for Human Activity Recognition

Yiran Huang, Yexu Zhou, Haibin Zhao, Till Riedel, Michael Beigl

TL;DR

This paper addresses the lack of transparent explanations in human activity recognition (HAR) by introducing OptiHAR, a model-agnostic framework that uses competitive data augmentation to both improve predictive performance and yield intuitive explanations. The method applies identical data augmentations during training and prediction, leveraging perturbations like jitter, Clip, and SegmentOut to reveal which data regions and frequency components drive model decisions, and uses a voting mechanism to aggregate predictions and explanations. Extensive experiments on five HAR benchmarks with three base architectures show consistent performance gains over a state-of-the-art baseline (ActivityGAN), with substantial improvements on several datasets and manageable inference overhead through parallelization. The work provides a practical pathway to trustworthy HAR systems by combining interpretability with accuracy, and releases code to facilitate broad adoption across HAR models.
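The prediction-time half of this idea, applying a fixed set of augmentations and aggregating the resulting predictions by vote, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the parameter values, the exact definitions of jitter, Clip, and SegmentOut, and the toy energy-based classifier are hypothetical and are not taken from the paper or its released code. The sketch also omits the training side, where the same augmentations would be applied.

```python
import numpy as np
from collections import Counter

# Hypothetical augmentation definitions; the paper's exact versions may differ.
def jitter(x, rng, sigma=0.05):
    """Add Gaussian noise to every sample."""
    return x + rng.normal(0.0, sigma, size=x.shape)

def clip(x, rng, limit=1.0):
    """Saturate the signal amplitude at +/- limit (rng is unused)."""
    return np.clip(x, -limit, limit)

def segment_out(x, rng, frac=0.2):
    """Zero out one randomly placed contiguous segment along the time axis."""
    out = x.copy()
    length = max(1, int(frac * x.shape[0]))
    start = rng.integers(0, x.shape[0] - length + 1)
    out[start:start + length] = 0.0
    return out

AUGMENTATIONS = {
    "identity": lambda x, rng: x,
    "jitter": jitter,
    "clip": clip,
    "segment_out": segment_out,
}

def predict_with_voting(model, window, seed=0):
    """Predict on each augmented view and return the majority label.

    `model` is any callable mapping a (time, channels) array to class scores.
    The per-augmentation votes are returned as well, since disagreement
    between views hints at which perturbation affects the decision.
    """
    rng = np.random.default_rng(seed)
    votes = {
        name: int(np.argmax(model(aug(window, rng))))
        for name, aug in AUGMENTATIONS.items()
    }
    label, _ = Counter(votes.values()).most_common(1)[0]
    return label, votes

def toy_model(window):
    """Stand-in classifier: class 1 if mean signal energy exceeds 0.5."""
    energy = min(float(np.mean(window ** 2)), 1.0)
    return np.array([1.0 - energy, energy])

if __name__ == "__main__":
    t = np.linspace(0, 4 * np.pi, 128)
    window = np.stack([2 * np.sin(t)] * 3, axis=1)  # 128 time steps, 3 channels
    label, votes = predict_with_voting(toy_model, window)
    print(label, votes)
```

Because each augmented view is predicted independently, the views can be evaluated in parallel, which is consistent with the summary's remark that inference overhead is manageable through parallelization.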

Abstract

In the realm of human activity recognition (HAR), the integration of explainable Artificial Intelligence (XAI) emerges as a critical necessity to elucidate the decision-making processes of complex models, fostering transparency and trust. Traditional explanatory methods like Class Activation Mapping (CAM) and attention mechanisms, although effective in highlighting regions vital for decisions in various contexts, prove inadequate for HAR. This inadequacy stems from the inherently abstract nature of HAR data, which renders these explanations obscure. In contrast, state-of-the-art post-hoc interpretation techniques for time series can explain the model from other perspectives. However, they require extra effort: generating an explanation usually takes 10 to 20 seconds. To overcome these challenges, we propose a novel, model-agnostic framework that enhances both the interpretability and efficacy of HAR models through the strategic use of competitive data augmentation. This approach does not rely on any particular model architecture, thereby broadening its applicability across various HAR models. By implementing competitive data augmentation, our framework provides intuitive and accessible explanations of model decisions, thereby significantly advancing the interpretability of HAR systems without compromising on performance.

Paper Structure (13 sections, 2 equations, 6 figures, 2 tables)


Figures (6)

  • Figure 1: The proposed framework. The blue cubes visualize the implemented techniques, such as CAWR and time series data augmentation (DA), among others. The orange boxes denote the data flow and the red boxes denote the model.
  • Figure 2: Demonstration of the core idea. Triangles and circles represent different categories, and the lines depict the boundaries between the categories.
  • Figure 3: Data Augmentation transformations selected in the framework.
  • Figure 4: An example of the explanation.
  • Figure 5: Benchmark model architectures. (a) MCNN. (b) DCL. (c) Transformer. Abbreviations: LSTM, long short-term memory layer; Conv1d, one-dimensional convolutional layer; Conv2d, two-dimensional convolutional layer; BatchNorm1d, one-dimensional batch normalization; BatchNorm2d, two-dimensional batch normalization; LayerNorm, layer normalization; MaxPool1d, one-dimensional max pooling layer. Parameters: LSTM(hidden dimension number); Conv1d(filter number, kernel size, stride size); Conv2d(filter number, kernel size, stride size).
  • ...and 1 more figures