Force-Constrained Visual Policy: Safe Robot-Assisted Dressing via Multi-Modal Sensing

Zhanyi Sun; Yufei Wang; David Held; Zackory Erickson

TL;DR

This work addresses safe robot-assisted dressing by combining a vision-based policy trained in simulation with a force dynamics model learned from real-world data, which constrains actions so that interaction forces stay below a threshold $\tau$. The Force-Constrained Visual Policy (FCVP) framework solves a constrained optimization with random shooting to select actions that both progress the dressing task and limit the force applied to the person. In a real-world study with 10 participants, FCVP achieves higher dressing success and substantially lower force violations than the baselines. Because the force model is learned directly from a small amount of real-world data, the approach avoids sim2real transfer of force sensing for deformable garments, which simulators do not model accurately, and offers a data-efficient path to safe multi-modal dressing systems.
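The action-selection step summarized above can be illustrated with a minimal random-shooting sketch. This is not the authors' implementation: the function name `select_action`, the Gaussian sampling around the policy's proposed action, the distance-based task score, and the fallback to the lowest-force candidate are all assumptions made for illustration. Only the overall idea (sample candidate actions, predict their forces, keep those below $\tau$, pick the best remaining one) comes from the summary.

```python
import numpy as np


def select_action(policy_action, force_model, tau, n_samples=64,
                  noise_std=0.05, rng=None):
    """Random-shooting sketch of force-constrained action selection.

    policy_action: action proposed by the vision-based policy (1-D array).
    force_model:   callable mapping an action to a predicted force (scalar);
                   a stand-in for a learned force dynamics model.
    tau:           force threshold.
    """
    rng = np.random.default_rng() if rng is None else rng
    dim = policy_action.shape[0]

    # Sample candidate actions around the policy's proposal (assumed scheme).
    candidates = policy_action + noise_std * rng.standard_normal((n_samples, dim))

    # Predict the force each candidate would apply.
    forces = np.array([force_model(a) for a in candidates])

    # Assumed task score: stay as close as possible to the policy's action.
    scores = -np.linalg.norm(candidates - policy_action, axis=1)

    safe = forces <= tau
    if safe.any():
        # Best-scoring candidate among those predicted to be safe.
        safe_idx = np.flatnonzero(safe)
        best = safe_idx[np.argmax(scores[safe_idx])]
    else:
        # Assumed fallback: no candidate is safe, so take the lowest force.
        best = int(np.argmin(forces))
    return candidates[best], float(forces[best])


# Toy usage: predicted force grows with motion along the first axis.
toy_force_model = lambda a: 10.0 * a[0]
action, force = select_action(np.array([0.5, 0.0, 0.0]), toy_force_model,
                              tau=4.0, n_samples=256, noise_std=0.1,
                              rng=np.random.default_rng(0))
```

In the toy usage the policy's own proposal would exceed the threshold (predicted force 5.0 against $\tau = 4.0$), so the sketch returns a nearby candidate whose predicted force is within the limit.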

Abstract

Robot-assisted dressing could profoundly enhance the quality of life of adults with physical disabilities. To achieve this, a robot can benefit from both visual and force sensing. The former enables the robot to ascertain human body pose and garment deformations, while the latter helps maintain safety and comfort during the dressing process. In this paper, we introduce a new technique that leverages both vision and force modalities for this assistive task. Our approach first trains a vision-based dressing policy using reinforcement learning in simulation with varying body sizes, poses, and types of garments. We then learn a force dynamics model for action planning to ensure safety. Due to limitations of simulating accurate force data when deformable garments interact with the human body, we learn a force dynamics model directly from real-world data. Our proposed method combines the vision-based policy, trained in simulation, with the force dynamics model, learned in the real world, by solving a constrained optimization problem to infer actions that facilitate the dressing process without applying excessive force on the person. We evaluate our system in simulation and in a real-world human study with 10 participants across 240 dressing trials, showing it greatly outperforms prior baselines. Video demonstrations are available on our project website (https://sites.google.com/view/dressing-fcvp).
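The constrained optimization mentioned in the abstract can be read, as a sketch and not necessarily the paper's exact formulation, in the following form. The notation is assumed here: $o_t$ is the current visual observation, $J(o_t, a)$ is a task-progress score derived from the vision-based policy, $h_t$ is the recent force history, and $\hat{f}$ is the learned force dynamics model. Only the threshold $\tau$ appears in the summary above.

$$
a_t^{*} = \arg\max_{a} \; J(o_t, a) \quad \text{subject to} \quad \hat{f}(h_t, a) \le \tau
$$

Under this reading, the policy trained in simulation determines which actions are preferred, while the force model learned in the real world determines which actions are admissible.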

Paper Structure (12 sections, 1 equation, 5 figures, 1 table)

Introduction
Related Work
Problem Statement and Assumptions
Background - Vision-based Policy Learning in simulation
Method
Force dynamics model learning in the real world
Force-Constrained Vision Policy
Experiments
Sim2sim Transfer Experiments
Real-World Human Study
Limitations
Conclusion

Figures (5)

Figure 1: Our method learns a force dynamics model in the real world to constrain the vision-based policy trained in simulation (right), preventing high force from being applied to the person (left).
Figure 2: Our system combines a vision-based policy and a force dynamics model to achieve safe robot-assisted dressing. As most simulators provide sufficiently accurate simulation of point clouds yet not the force modality for sim2real transfer, the vision-based policy is trained with a large amount of data in simulation, and the force dynamics model is trained with a small amount of data in the real world. At test time, the vision-based policy proposes action samples that progress the dressing task. The force dynamics model predicts the future forces of these sampled actions, and the predictions are used to filter actions that are unsafe, i.e., those applying too much force to the human. The final chosen action is safe with low force and achieves the task.
Figure 3: (Left) Among all compared methods, FCVP achieves the best trade-off between the arm dressed ratio and the force violation amount. (Right) The detailed quantitative results for each method, as well as the number of training trajectories required for convergence in sim B.
Figure 4: Left: Human study setup. Right: Poses and Garments that we test in the human study.
Figure 5: Left and middle: Density plot and box plot of the force distributions on all participants in the human study. The dashed black line represents the force threshold. Our method greatly reduces the force violation compared to the baselines. Right: Likert item responses from all 10 participants. FCVP achieves statistically significant differences from both baselines with higher reported scores for all 3 Likert items.
