Explaining Human Activity Recognition with SHAP: Validating Insights with Perturbation and Quantitative Measures
Felix Tempel, Espen Alexander F. Ihlen, Lars Adde, Inga Strümke
TL;DR
This work integrates SHAP explanations with Graph Convolutional Networks (GCNs) for skeleton-based Human Activity Recognition (HAR), introducing ShapGCN to attribute feature contributions at the primary input level (J, V, B, A). A novel perturbation strategy on the model's edge importance matrix tests the faithfulness of the SHAP explanations by selectively masking influential body key points; the effect is quantified with PGI and PGU alongside standard metrics. The approach is validated on two real-world datasets, a cerebral palsy (CP) infant movement dataset and NTU RGB+D 60. On both, SHAP-identified key points exert the greatest influence on accuracy, specificity, and sensitivity, which supports more interpretable and trustworthy HAR models for high-stakes domains such as healthcare. The study also discusses practical implications, computational costs, and possible extensions to biomarker discovery and to real-time or broader-domain HAR applications.
Abstract
In Human Activity Recognition (HAR), understanding the intricacy of body movements is essential in high-risk applications. This study uses SHapley Additive exPlanations (SHAP) to explain the decision-making process of Graph Convolutional Networks (GCNs) when classifying activities from skeleton data. We apply SHAP to models trained on two real-world datasets: one for cerebral palsy (CP) classification and the widely used NTU RGB+D 60 action recognition dataset. To test the explanations, we introduce a novel perturbation approach that modifies the model's edge importance matrix, allowing us to evaluate the impact of specific body key points on prediction outcomes. To assess the fidelity of the explanations, we use informed perturbation, which targets the body key points identified as important by SHAP, and compare it against random perturbation as a control condition. This comparison lets us judge whether the key points that SHAP ranks as influential or non-influential actually are. Results on both datasets show that the body key points identified as important by SHAP have the largest influence on the accuracy, specificity, and sensitivity metrics. Our findings highlight that SHAP can provide granular insights into how input features contribute to the prediction outcome of GCNs in HAR tasks. This demonstrates the potential for more interpretable and trustworthy models in high-stakes applications such as healthcare or rehabilitation.
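The informed and random perturbation conditions described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that masking means zeroing the rows and columns of the selected key points in a (V, V) edge importance matrix, and it uses a hypothetical one-layer graph convolution (`toy_gcn_logits`), a hypothetical chain-shaped skeleton, and synthetic placeholder attributions in place of real SHAP values. All function names are illustrative.

```python
import numpy as np


def select_informed(shap_per_key_point, k):
    """Indices of the k key points with the largest absolute attribution."""
    return np.argsort(-np.abs(shap_per_key_point))[:k]


def select_random(num_key_points, k, rng):
    """Control condition: k distinct key points drawn uniformly at random."""
    return rng.choice(num_key_points, size=k, replace=False)


def mask_key_points(edge_importance, key_points):
    """Copy the (V, V) edge importance matrix and zero the selected rows/columns.

    Zeroing is an assumption of this sketch, not necessarily the paper's rule.
    """
    perturbed = edge_importance.copy()
    perturbed[key_points, :] = 0.0
    perturbed[:, key_points] = 0.0
    return perturbed


def toy_gcn_logits(features, adjacency, edge_importance, weights):
    """Hypothetical single graph-convolution layer followed by mean pooling."""
    aggregated = (adjacency * edge_importance) @ features  # shape (V, C)
    return aggregated.mean(axis=0) @ weights               # shape (num_classes,)


rng = np.random.default_rng(0)
V, C, num_classes, k = 25, 3, 2, 5  # 25 key points, as in NTU RGB+D skeletons

features = rng.normal(size=(V, C))
# Hypothetical chain skeleton with self-loops (not a real body graph).
adjacency = np.eye(V) + np.eye(V, k=1) + np.eye(V, k=-1)
edge_importance = np.ones((V, V))
weights = rng.normal(size=(C, num_classes))
shap_per_key_point = rng.normal(size=V)  # synthetic placeholder, not SHAP output

baseline = toy_gcn_logits(features, adjacency, edge_importance, weights)
conditions = {
    "informed": select_informed(shap_per_key_point, k),
    "random": select_random(V, k, rng),
}
for name, key_points in conditions.items():
    masked = mask_key_points(edge_importance, key_points)
    shifted = toy_gcn_logits(features, adjacency, masked, weights)
    # Absolute change in the logits caused by masking the selected key points.
    print(name, float(np.abs(baseline - shifted).sum()))
```

Because the attributions here are random placeholders, the informed condition is not expected to produce a larger shift than the random one. With attributions computed for a trained model, a faithful explanation should make the informed shift the larger of the two.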
