Motion2Meaning: A Clinician-Centered Framework for Contestable LLM in Parkinson's Disease Gait Interpretation

Loc Phuc Truong Nguyen, Hung Thanh Do, Hung Truong Thanh Nguyen, Hung Cao

TL;DR

Motion2Meaning tackles the opacity of AI-driven gait interpretation in Parkinson's disease with a clinician-centered, contestable AI framework. It combines a Gait Data Visualization Interface (GDVI), a 1D-CNN that predicts Hoehn & Yahr stages from vGRF data, and a Contestable Interpretation Interface (CII) that pairs a Cross-Modal Explanation Discrepancy (XMED) safeguard with a contestable LLM-based interaction layer. The system achieves strong predictive performance (0.899 accuracy) while enabling clinicians to audit, challenge, and override AI decisions, and it is evaluated with human-centered metrics for readability, grounding, and self-correction. The work demonstrates the feasibility of auditable, dialogic AI in PD care and outlines avenues for multi-modal fusion, clinical validation, and evaluation metric development to advance trustworthy AI in healthcare.

Abstract

AI-assisted gait analysis holds promise for improving Parkinson's Disease (PD) care, but current clinical dashboards lack transparency and offer no meaningful way for clinicians to interrogate or contest AI decisions. To address this issue, we present Motion2Meaning, a clinician-centered framework that advances Contestable AI through a tightly integrated interface designed for interpretability, oversight, and procedural recourse. Our approach leverages vertical Ground Reaction Force (vGRF) time-series data from wearable sensors as an objective biomarker of PD motor states. The system comprises three key components: a Gait Data Visualization Interface (GDVI), a one-dimensional Convolutional Neural Network (1D-CNN) that predicts Hoehn & Yahr severity stages, and a Contestable Interpretation Interface (CII) that combines our novel Cross-Modal Explanation Discrepancy (XMED) safeguard with a contestable Large Language Model (LLM). Our 1D-CNN achieves 89.0% F1-score on the public PhysioNet gait dataset. XMED successfully identifies model unreliability by detecting a roughly five-fold (about 4.8x) increase in explanation discrepancies for incorrect predictions (7.45%) compared with correct ones (1.56%), while our LLM-powered interface enables clinicians to validate correct predictions and successfully contest a portion of the model's errors. A human-centered evaluation of this contestable interface reveals a crucial trade-off between the LLM's factual grounding and its readability and responsiveness to clinical feedback. This work demonstrates the feasibility of combining wearable sensor analysis with Explainable AI (XAI) and contestable LLMs to create a transparent, auditable system for PD gait interpretation that maintains clinical oversight while leveraging advanced AI capabilities. Our implementation is publicly available at: https://github.com/hungdothanh/motion2meaning.

Paper Structure

This paper contains 23 sections, 3 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Progression of human-centered XAI toward CAI.
  • Figure 2: Overview of the Motion2Meaning framework: (a) Gait Data Visualization Interface (GDVI), and (b) Contestable Interpretation Interface (CII).
  • Figure 3: Dashboard overview of the Gait Data Visualization Interface (GDVI).
  • Figure 4: Workflow overview of XMED. The process compares CAM-based (Grad-CAM) and backpropagation-based (LRP) explanations to quantify model uncertainty. The input undergoes a forward pass to extract activations from the target convolutional layer. Grad-CAM computes weighted feature maps via gradient-based pooling, while LRP propagates relevance scores backward through the network. Both maps are normalized and compared to identify regions of high discrepancy, which indicate divergent model explanations.
  • Figure 5: XMED visualization for (a) a correct and (b) an incorrect prediction.
  • ...and 2 more figures
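
The comparison step described in the Figure 4 caption (normalize both explanation maps, then flag regions where they disagree) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the min-max normalization, the absolute-difference measure, and the 0.5 threshold are all assumptions made for illustration, and the Grad-CAM and LRP maps are taken as given per-time-step relevance scores rather than computed from a model.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale a relevance map to [0, 1]; a constant map becomes all zeros."""
    if len(values) == 0:
        raise ValueError("relevance map must not be empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discrepancy_rate(
    cam_map: Sequence[float],
    lrp_map: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Tuple[List[bool], float]:
    """Flag time steps where two normalized explanation maps disagree.

    Returns the per-time-step flags and the percentage of flagged steps.
    """
    if len(cam_map) != len(lrp_map):
        raise ValueError("both maps must cover the same time steps")
    cam_norm = min_max_normalize(cam_map)
    lrp_norm = min_max_normalize(lrp_map)
    # A step counts as discrepant when the normalized scores differ by
    # more than the threshold (assumed definition of "high discrepancy").
    flags = [abs(c - l) > threshold for c, l in zip(cam_norm, lrp_norm)]
    return flags, 100.0 * sum(flags) / len(flags)
```

For example, with the toy maps `[0.0, 1.0, 2.0, 4.0]` and `[0.0, 0.5, 1.0, 0.0]`, only the last time step is flagged, giving a discrepancy rate of 25%. A percentage of this kind is one plausible reading of the 7.45% and 1.56% figures quoted in the abstract, but the paper's exact definition may differ.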