On the Dual-Use Dilemma in Physical Reasoning and Force
William Xie, Enora Rice, Nikolaus Correll
TL;DR
This work examines the dual-use dilemma that arises when vision-language models (VLMs) used for robot control are given the ability to reason about physics and force. Two case studies evaluate Asimovian safeguarding prompts on wrench planning and grasp-force control across multiple models and tasks, and reveal a persistent trade-off between safety and capability. Safeguarding can significantly reduce both harmful and helpful outputs. For example, in the wrench scenarios harmful elicitation falls from $53\%$ to $19\%$ and helpful output from $50\%$ to $38\%$; in the grasp scenarios harmful output falls from $67\%$ to $1.7\%$ and helpful output from $91\%$ to $45\%$. Results vary substantially across models, and the mean harm-detection rate is around $71\%$. The authors argue for human-centered evaluation and development that balances safety with practical capability in robot learning, and advocate approaches that mitigate dual-use risk without stifling progress in contact-rich manipulation, with implications for real-world applications such as elderly-care robotics.
Abstract
Humans learn how and when to apply forces in the world via a complex physiological and psychological learning process. Attempting to replicate this in vision-language models (VLMs) presents two challenges: VLMs can produce harmful behavior, which is particularly dangerous for VLM-controlled robots that interact with the world, but imposing behavioral safeguards can limit their functional and ethical scope. We conduct two case studies on safeguarding VLMs that generate forceful robotic motion, finding that safeguards reduce both harmful and helpful behavior involving contact-rich manipulation of human body parts. We then discuss the key implication of this result, namely that value alignment may impede desirable robot capabilities, for model evaluation and robot learning.
