Clinician-Directed Large Language Model Software Generation for Therapeutic Interventions in Physical Rehabilitation

Edward Kim, Yuri Cho, Jose Eduardo E. Lima, Julie Muccini, Jenelle Jindal, Alison Scheid, Erik Nelson, Seong Hyun Park, Yuchen Zeng, Alton Sturgis, Caesar Li, Jackie Dai, Sun Min Kim, Yash Prakash, Liwen Sun, Isabella Hu, Hongxuan Wu, Daniel He, Wiktor Rajca, Cathra Halabi, Maarten Lansberg, Bjoern Hartmann, Sanjit A. Seshia

TL;DR

This study tackles the personalization bottleneck in home-based rehabilitation digital health interventions (DHIs) by letting clinicians write patient-specific exercise prescriptions and having an LLM generate executable software that implements them. In a prospective single-arm feasibility trial with 20 therapists and a standardized patient, the LLM translated all 40 prescriptions into runnable programs. The generated software achieved 99.78% instruction delivery accuracy and 88.4% monitoring accuracy; 90% of therapists deemed the approach safe and 75% were willing to adopt it. Deliverable personalization rose substantially (100% of prescriptions vs 55% with a representative template-based DHI) and therapists rated usability highly, though monitoring reliability and prompt refinement remain critical areas to address in real-patient trials and broader deployment. Overall, the work demonstrates the feasibility and acceptability of clinician-directed LLM software generation for physical rehabilitation and motivates larger trials to assess clinical effectiveness and safety in real-world populations.

Abstract

Digital health interventions increasingly deliver home exercise programs via sensor-equipped devices such as smartphones, enabling remote monitoring of adherence and performance. However, current software is usually authored before clinical encounters as libraries of modules for broad impairment categories. At the point of care, clinicians can only choose from these modules and adjust a few parameters (for example, duration or repetitions). As a result, individual limitations, goals, and environmental constraints are often not reflected, limiting personalization and benefit. We propose a paradigm in which large language models (LLMs) act as constrained translators that convert clinicians' exercise prescriptions into intervention software. Clinicians remain the decision makers: they design exercises during the encounter, tailored to each patient's impairments, goals, and environment, and the LLM generates matching software. We conducted a prospective single-arm feasibility study with 20 licensed physical and occupational therapists who created 40 individualized upper extremity programs for a standardized patient; 100% of prescriptions were translated into executable software, compared with 55% under a representative template-based digital health intervention (p < 0.01). LLM-generated software correctly delivered 99.7% of instructions and monitored performance with 88.4% accuracy (95% confidence interval, 84.3% to 91.5%). Overall, 90% of therapists judged the system safe for patient interaction and 75% expressed willingness to adopt it in practice. To our knowledge, this is the first prospective evaluation of clinician-directed intervention software generation with an LLM in health care, demonstrating feasibility and motivating larger trials in real patient populations.
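
The 100% vs 55% comparison can be sanity-checked with a few lines of arithmetic. The sketch below is an illustration only: the abstract does not say which statistical test produced p < 0.01, so the exact McNemar test is an assumed choice, picked because the same 40 prescriptions appear to be assessed under both approaches. The counts are derived from the reported percentages (55% of 40 = 22), and the function name `mcnemar_exact` is hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value for paired binary outcomes.

    b: pairs where only the first method succeeded
    c: pairs where only the second method succeeded
    Under the null hypothesis, each discordant pair is equally
    likely to favor either method (binomial with p = 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Assumed counts, derived from the abstract's percentages:
# the LLM handled 40 of 40 prescriptions, the template 22 of 40.
# If the comparison is paired, 18 prescriptions were expressible
# only via the LLM and none only via the template.
llm_only = 40 - 22
template_only = 0

if __name__ == "__main__":
    p_value = mcnemar_exact(llm_only, template_only)
    print(f"exact McNemar p-value: {p_value:.2e}")
```

Under these assumptions the p-value falls well below 0.01, which is consistent with the reported threshold but does not confirm that this is the test the authors used.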

Paper Structure

This paper contains 26 sections, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Overview of the clinician-directed intervention software generation via LLM and evaluation framework. Each therapist conducted a therapy session with a standardized patient and designed two tailored exercises, which were translated by the LLM into intervention software. The software was deployed to the standardized patient’s electronic device to instruct the exercises and monitor the patient’s movements, while the therapist observed remotely. Upon completion, the software reported on monitored outcomes back to the therapist. Therapists evaluated the LLM-based digital prescription paradigm across five dimensions: (i) usability of intervention software generation via LLM, (ii) flexibility in accommodating personalized exercise design, (iii) instruction and monitoring accuracy of generated software, (iv) perceived safety in patient use, and (v) clinician acceptance of LLM software generation.
  • Figure 2: Monitoring accuracy, sensitivity, and specificity of the LLM-generated intervention software, with 95% confidence intervals (CI) computed using a generalized linear mixed model (GLMM) and the Wilson score method.
  • Figure 3: Responses to the 5-point Likert scale questions from the user experience interview on usability, safety, and clinician acceptance of LLM software generation. Vertical red dotted lines mark the boundary between scores of 4 ("Agree") or higher and lower scores. In the survey statements, the term "system" referred to the LLM software generation for ease of understanding by therapists.
  • Figure 4: Web portal for clinicians to evaluate monitoring accuracy. On the left, the interface displays a pre-labeled table of completion outcomes, with therapists marking each instruction as success (complete) or failure (incomplete), highlighted in green. On the right, a video panel shows the standardized patient performing the exercises while following the software’s instructions.
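
The Wilson score approach named in the Figure 2 caption can be illustrated with a short sketch. The code below computes a Wilson score interval for a single proportion. The counts in the usage example are hypothetical and chosen only to reproduce an 88.4% point estimate; they are not the study's data. The function name `wilson_interval` is an assumption, and the GLMM part of the analysis is not shown.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion.

    z is the standard normal quantile; 1.959964 gives a 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n
    z2 = z * z
    denom = 1 + z2 / n
    center = (p_hat + z2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Hypothetical counts for illustration: 442 correct out of 500 checks.
    low, high = wilson_interval(442, 500)
    print(f"point estimate: {442 / 500:.3f}, 95% CI: ({low:.3f}, {high:.3f})")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and remains informative when the observed proportion is at or near 100%, which matters for accuracy figures as high as those reported here.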