MR-STGN: Multi-Residual Spatio Temporal Graph Network Using Attention Fusion for Patient Action Assessment

Youssef Mourchid, Rim Slama

TL;DR

This work introduces MR-STGN, a multi-residual spatio-temporal graph network that fuses angular and positional 3D skeletons to assess patient rehabilitation actions in real time. The model employs dual spatial–temporal streams, graph convolutions with a ConvGRU-based dynamic adjacency, and an attention fusion mechanism to adaptively weight features from both skeleton representations. Ablation studies and evaluation on the UI-PRMD dataset show that integrating angular and positional cues with attention fusion yields superior action scoring (lower MAE) and provides actionable feedback on which joints to focus on. The approach offers real-time clinical utility by delivering precise, interpretable guidance and has potential for continuous monitoring and user-friendly feedback interfaces in rehabilitation settings.

Abstract

Accurate assessment of patient actions plays a crucial role in healthcare, as it contributes significantly to disease progression monitoring and treatment effectiveness. However, traditional approaches to assessing patient actions often rely on manual observation and scoring, which are subjective and time-consuming. In this paper, we propose an automated approach for patient action assessment using a Multi-Residual Spatio Temporal Graph Network (MR-STGN) that incorporates both angular and positional 3D skeletons. The MR-STGN is specifically designed to capture the spatio-temporal dynamics of patient actions. It achieves this by integrating information from multiple residual layers, with each layer extracting features at a distinct level of abstraction. Furthermore, we integrate an attention fusion mechanism into the network, which adaptively weights the features. This lets the model concentrate on the most pertinent aspects of the patient's movements and indicate which specific body parts or movements require attention. Ablation studies are conducted to analyze the impact of individual components within the proposed model. We evaluate our model on the UI-PRMD dataset, where it predicts patient action scores in real time and surpasses state-of-the-art methods.
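
The attention fusion idea mentioned above, adaptively weighting features from the positional and angular skeleton streams, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy feature values, and the `score_weights` vector (standing in for learned parameters) are all assumptions chosen for illustration. It only shows the general pattern of scoring each stream, normalizing the scores with a softmax, and taking the weighted sum.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(positional, angular, score_weights):
    """Fuse two equal-length feature vectors with softmax attention.

    `score_weights` is a hypothetical stand-in for learned parameters;
    each stream gets one scalar relevance score (a dot product), and the
    softmax of those scores gives the per-stream fusion weights.
    """
    streams = [positional, angular]
    scores = [
        sum(w * x for w, x in zip(score_weights, feats)) for feats in streams
    ]
    alphas = softmax(scores)
    # Weighted sum of the two streams, feature by feature.
    fused = [
        sum(a * feats[i] for a, feats in zip(alphas, streams))
        for i in range(len(positional))
    ]
    return fused, alphas


# Toy features for one joint (values are made up for illustration).
positional = [0.2, 0.9, 0.1]
angular = [0.6, 0.1, 0.4]
score_weights = [1.0, 0.5, -0.5]

fused, alphas = attention_fuse(positional, angular, score_weights)
print("stream weights:", alphas)
print("fused features:", fused)
```

Because the weights come from a softmax, they are positive and sum to one, so each fused feature is a convex combination of the two streams. In a trained network the scoring parameters would be learned and the weights would vary per input; here they are fixed only to keep the sketch self-contained.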

Paper Structure

This paper contains 10 sections, 10 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Physical rehabilitation exercises process overview.
  • Figure 2: Flowchart of the proposed approach.
  • Figure 3: Feedback visualization for different user profiles on the UI-PRMD dataset: expert, non-expert with a good score, and non-expert with a low score. The attention vector denotes the joint role vector (hot colors represent high values). Colored circles on the skeleton bodies visualize the attention vector and the role of body joints for different exercises.
  • Figure 4: Joints in the skeletal model of Kinect-recorded data from UI-PRMD dataset.