Rehabilitation Exercise Quality Assessment and Feedback Generation Using Large Language Models with Prompt Engineering
Jessica Tang, Ali Abedi, Tracey J. F. Colella, Shehroz S. Khan
TL;DR
This work addresses the challenge of delivering actionable rehabilitation feedback in home-based settings by leveraging pre-trained large language models (LLMs) guided through carefully designed prompts. It introduces a framework that fuses exercise-specific joint features with zero- to few-shot prompting, reasoning elicitation (Chain-of-Thought, certainty, probability), and role-play prompts to enable GPT-4o to assess movement quality and generate natural-language feedback. Experiments on UI-PRMD and REHAB24-6 show that feature-based prompts and three-shot configurations yield strong classification performance, with reasoning-enabled prompts further improving interpretability, though LLM overconfidence remains an issue. The study demonstrates the practical potential of LLM-driven feedback within virtual rehabilitation platforms, while acknowledging limitations such as the lack of ground-truth textual feedback datasets and reproducibility challenges, and outlining directions for data collection and potential model fine-tuning to enhance reliability and applicability.
Abstract
Exercise-based rehabilitation improves quality of life and reduces morbidity, mortality, and rehospitalization, but transportation constraints and staff shortages lead to high dropout rates from rehabilitation programs. Virtual platforms enable patients to complete prescribed exercises at home, while AI algorithms analyze performance, deliver feedback, and update clinicians. Although many studies have developed machine learning and deep learning models for exercise quality assessment, few have explored the use of large language models (LLMs) for feedback, and those that have are limited by the lack of rehabilitation datasets containing textual feedback. In this paper, we propose a new method in which exercise-specific features are extracted from the skeletal joints of patients performing rehabilitation exercises and fed into pre-trained LLMs. Using a range of prompting techniques, such as zero-shot, few-shot, chain-of-thought, and role-play prompting, LLMs are leveraged to evaluate exercise quality and provide feedback in natural language to help patients improve their movements. The method was evaluated through extensive experiments on two publicly available rehabilitation exercise assessment datasets (UI-PRMD and REHAB24-6) and showed promising results in exercise assessment, reasoning, and feedback generation. This approach can be integrated into virtual rehabilitation platforms to help patients perform exercises correctly, support recovery, and improve health outcomes.
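To make the pipeline described in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that one exercise-specific feature is a joint angle computed from three 3D skeletal joint positions, and that this value is placed into a text prompt combining role-play, few-shot examples, and a chain-of-thought instruction. The function names `joint_angle` and `build_prompt`, the prompt wording, and the example values are all hypothetical; the sketch stops at building the prompt string and does not call any LLM.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at joint b formed by the 3D points a-b-c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0:
        raise ValueError("joint positions must be distinct")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def build_prompt(exercise, feature_name, value, examples=()):
    """Assemble a prompt; an empty `examples` gives a zero-shot prompt.

    `examples` is a sequence of (feature_value, label) pairs (hypothetical format).
    """
    lines = [
        # Role-play instruction (assumed wording).
        "You are a physiotherapist reviewing a patient's rehabilitation exercise.",
        f"Exercise: {exercise}",
    ]
    # Few-shot demonstrations, one line per labelled example.
    for example_value, label in examples:
        lines.append(f"Example: {feature_name} = {example_value:.1f} degrees -> {label}")
    lines.append(f"Now assess: {feature_name} = {value:.1f} degrees.")
    # Chain-of-thought style instruction plus a request for feedback.
    lines.append(
        "Think step by step, then answer 'correct' or 'incorrect' "
        "and give one sentence of feedback for the patient."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical hip, knee, and ankle positions for a single frame.
    angle = joint_angle((0.0, 1.0, 0.0), (0.0, 0.5, 0.1), (0.0, 0.0, 0.0))
    shots = [(95.0, "correct"), (150.0, "incorrect"), (100.0, "correct")]
    print(build_prompt("deep squat", "knee angle", angle, shots))
```

Passing three labelled examples, as in the usage above, corresponds to a three-shot prompt, while omitting `examples` yields the zero-shot variant. In practice the resulting string would be sent to a pre-trained LLM, and the features, thresholds, and prompt wording would need to be chosen per exercise.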
