Low-Burden LLM-Based Preference Learning: Personalizing Assistive Robots from Natural Language Feedback for Users with Paralysis

Keshav Shankar, Dan Ding, Wei Gao

Abstract

Physically Assistive Robots (PARs) require personalized behaviors to ensure user safety and comfort. However, traditional preference learning methods, like exhaustive pairwise comparisons, cause severe physical and cognitive fatigue for users with profound motor impairments. To solve this, we propose a low-burden, offline framework that translates unstructured natural language feedback directly into deterministic robotic control policies. To safely bridge the gap between ambiguous human speech and robotic code, our pipeline uses Large Language Models (LLMs) grounded in the Occupational Therapy Practice Framework (OTPF). This clinical reasoning decodes subjective user reactions into explicit physical and psychological needs, which are then mapped into transparent decision trees. Before deployment, an automated "LLM-as-a-Judge" verifies the code's structural safety. We validated this system in a simulated meal preparation study with 10 adults with paralysis. Results show our natural language approach significantly reduces user workload compared to traditional baselines. Additionally, independent clinical experts confirmed the generated policies are safe and accurately reflect user preferences.

Paper Structure

This paper contains 26 sections, 2 equations, 9 figures, and 1 table.

Figures (9)

  • Figure A1: High-burden pairwise comparisons (left) versus our proposed low-burden natural language (right) preferences.
  • Figure A2: Proposed pipeline for translating unstructured natural language feedback into verifiable robotic policies.
  • Figure B1: Overview of the LLM prompt structures, including roles, inputs, instructions, and outputs for the pipeline.
  • Figure D1: Simulated kitchen environment with target objects, user, and robot.
  • Figure D2: Distribution of NASA-TLX subscale scores across the three elicitation methods: questionnaire (M1), pairwise comparison (M2), and natural language feedback (M3). Solid bars represent mean scores, error bars denote the standard error of the mean, and black dots indicate individual participant scores. Lower scores indicate lower perceived user burden. Scoring for the Performance subscale is inverted.
  • ...and 4 more figures