A Human-Sensitive Controller: Adapting to Human Ergonomics and Physical Constraints via Reinforcement Learning
Vitor Martins, Sara M. Cerqueira, Mercedes Balcells, Elazer R Edelman, Cristina P. Santos
TL;DR
This paper tackles work-related musculoskeletal disorders (WRMSDs) by introducing a human-sensitive robotic controller that learns to minimize elbow pain and ergonomic risk while maintaining task efficiency in collaborative transport tasks. It compares Q-Learning and a Deep Q-Network (DQN) within a two-stage learning framework: pre-training in simulation with a scaled 2-DOF human model, followed by real-world fine-tuning using motion-capture feedback. The DQN, with a flexible action space and structured reward, achieves faster transport and lower pain risk, and adapts robustly across participants with different anthropometries. A sim-to-real gap remains, which fine-tuning helps to close. The work demonstrates the potential of reinforcement learning (RL)-driven cobots to enable safer, more inclusive workplaces and outlines future steps toward broader WRMSD applications and biomarker-based pain assessment.
Abstract
Work-related musculoskeletal disorders (WRMSDs) remain a major challenge in industrial environments, leading to reduced workforce participation, increased healthcare costs, and long-term disability. This study introduces a human-sensitive robotic system aimed at reintegrating individuals with a history of musculoskeletal disorders into standard job roles while improving ergonomic conditions for the broader workforce. It uses reinforcement learning (RL) to develop a human-aware control strategy for collaborative robots (cobots) that prevents pain and limits ergonomic risk during task execution. Two RL approaches, Q-Learning and Deep Q-Network (DQN), were implemented and tested to personalize control strategies based on individual user characteristics. Although experimental results revealed a simulation-to-real gap, a fine-tuning phase successfully adapted the policies to real-world conditions. DQN outperformed Q-Learning by completing tasks faster while maintaining zero pain risk and safe ergonomic levels. The structured testing protocol confirmed the system's adaptability to diverse human anthropometries, underscoring the potential of RL-driven cobots to enable safer, more inclusive workplaces.
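
To make the two-stage idea concrete, the following is a minimal sketch of tabular Q-Learning that is first pre-trained on one environment and then fine-tuned on another. Everything in it is an illustrative assumption and not the authors' setup: the environment is a hypothetical 1-D transport path with 8 positions, actions are "slow" (advance 1) and "fast" (advance 2), each step costs -1, and a fast move inside a set of "pain" positions costs an extra -10. The mismatch between the simulated and "real" pain positions stands in for the simulation-to-real gap; the names make_env, q_learning, and count_pain are hypothetical. A DQN would replace the table with a neural network but follow the same pre-train then fine-tune pattern.

```python
import random

def make_env(pain_states, n_states=8):
    """Hypothetical toy task: move from position 0 to the last position."""
    def step(s, a):
        s2 = min(s + (2 if a == 1 else 1), n_states - 1)  # a=1 fast, a=0 slow
        r = -1.0                                          # time cost per step
        if a == 1 and s in pain_states:
            r -= 10.0                                     # assumed pain penalty
        return s2, r, s2 == n_states - 1
    return step

def q_learning(step, n_states, n_actions, episodes, q=None,
               alpha=0.1, gamma=0.95, epsilon=0.2, max_steps=50, seed=0):
    """Tabular Q-learning; pass an existing table `q` to fine-tune it in place."""
    rng = random.Random(seed)
    if q is None:
        q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if rng.random() < epsilon:                    # explore
                a = rng.randrange(n_actions)
            else:                                         # exploit
                a = max(range(n_actions), key=lambda i: q[s][i])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])         # Q-learning update
            s = s2
            if done:
                break
    return q

def count_pain(q, step, pain_states, max_steps=50):
    """Follow the greedy policy and count fast moves made in pain positions."""
    s, pain = 0, 0
    for _ in range(max_steps):
        a = max(range(len(q[s])), key=lambda i: q[s][i])
        pain += int(a == 1 and s in pain_states)
        s, _reward, done = step(s, a)
        if done:
            break
    return pain

SIM_PAIN, REAL_PAIN = {2, 3}, {4, 5}                      # assumed mismatch
sim, real = make_env(SIM_PAIN), make_env(REAL_PAIN)

# Stage 1: pre-train in "simulation".
q_sim = q_learning(sim, 8, 2, episodes=2000, seed=0)
# Stage 2: fine-tune a copy of the pre-trained table on the "real" task.
q_real = q_learning(real, 8, 2, episodes=2000,
                    q=[row[:] for row in q_sim], seed=1)

pain_before = count_pain(q_sim, real, REAL_PAIN)   # pre-trained policy, real task
pain_after = count_pain(q_real, real, REAL_PAIN)   # fine-tuned policy, real task
print("pain events before fine-tuning:", pain_before)
print("pain events after fine-tuning:", pain_after)
```

Fine-tuning starts from a copy of the pre-trained table rather than from zeros, so the second stage only has to correct the values that the mismatch made wrong.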
