Choose Your Explanation: A Comparison of SHAP and GradCAM in Human Activity Recognition
Felix Tempel, Daniel Groos, Espen Alexander F. Ihlen, Lars Adde, Inga Strümke
TL;DR
The paper tackles explainability in graph-based human activity recognition by comparing SHAP and Grad-CAM on skeleton-based HAR across two real-world datasets. It provides both qualitative visualizations and quantitative perturbation analyses to evaluate how each method attributes importance to input features and body joints. The findings show that SHAP delivers detailed feature-level attributions, while Grad-CAM offers quicker, spatially oriented explanations, with notable differences in emphasis across networks and datasets; the authors argue for using the two methods in a complementary fashion. This work informs how to choose, and potentially combine, XAI approaches for HAR in healthcare contexts, where trust and actionable insights are critical.
Abstract
Explaining machine learning (ML) models using eXplainable AI (XAI) techniques has become essential to make them more transparent and trustworthy. This is especially important in high-stakes domains like healthcare, where understanding model decisions is critical to ensure ethical, sound, and trustworthy outcome predictions. However, users are often confused about which explainability method to choose for their specific use case. We present a comparative analysis of two widely used explainability methods, Shapley Additive Explanations (SHAP) and Gradient-weighted Class Activation Mapping (Grad-CAM), within the domain of human activity recognition (HAR) utilizing graph convolutional networks (GCNs). By evaluating these methods on skeleton-based data from two real-world datasets, including a healthcare-critical cerebral palsy (CP) case, this study provides vital insights into both approaches' strengths, limitations, and differences, offering a roadmap for selecting the most appropriate explanation method based on specific models and applications. We qualitatively and quantitatively compare these methods, focusing on feature importance ranking, interpretability, and model sensitivity through perturbation experiments. While SHAP provides detailed input feature attribution, Grad-CAM delivers faster, spatially oriented explanations, making both methods complementary depending on the application's requirements. Given the importance of XAI in enhancing trust and transparency in ML models, particularly in sensitive environments like healthcare, our research demonstrates how SHAP and Grad-CAM could complement each other to provide more interpretable and actionable model explanations.
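To make the idea of a perturbation experiment concrete, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes a toy linear scorer in place of a GCN, a zero baseline for perturbed features, and hypothetical names (`perturbation_drop`, `weights`, `good_attr`, `bad_attr`). The most highly ranked features are replaced by the baseline, and the resulting change in model output indicates how faithful the ranking is.

```python
import numpy as np

def perturbation_drop(model, x, attributions, k, baseline=0.0):
    """Replace the k highest-ranked features with a baseline value and
    return the absolute change in the model output."""
    order = np.argsort(-np.abs(attributions))  # most important first
    x_pert = x.copy()
    x_pert[order[:k]] = baseline
    return abs(model(x) - model(x_pert))

# Hypothetical stand-in model: a fixed linear scorer over 5 input features.
weights = np.array([0.1, -2.0, 0.5, 3.0, 0.0])

def model(x):
    return float(weights @ x)

x = np.ones(5)

# For a linear model with a zero baseline, weight * input is an exact
# per-feature contribution, so it serves as a faithful attribution here.
good_attr = weights * x

# A deliberately misleading attribution that ranks weak features highest.
bad_attr = 1.0 / (1.0 + np.abs(good_attr))

drop_good = perturbation_drop(model, x, good_attr, k=2)
drop_bad = perturbation_drop(model, x, bad_attr, k=2)

print(f"output change, faithful ranking:   {drop_good:.2f}")
print(f"output change, misleading ranking: {drop_bad:.2f}")
```

A faithful ranking should produce the larger output change when its top features are perturbed. In a skeleton-based setting, the same comparison would presumably be run over body joints or input features with the rankings produced by SHAP and Grad-CAM.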
