Guessing human intentions to avoid dangerous situations in caregiving robots
Noé Zapata, Gerardo Pérez, Lucas Bonilla, Pedro Núñez, Pilar Bachiller, Pablo Bustos
TL;DR
The paper addresses how caregiving robots can infer human intentions in order to avoid dangerous situations. It proposes a simulation-based Artificial Theory of Mind (ATM) framework that implements a like-me policy within the CORTEX cognitive architecture, using an internal physics-based simulator to forecast outcomes and test interventions in real time. Two ATM agents, one for intention guessing and enactment and one for action selection, operate on a shared working memory $\mathcal{W}$ that fuses symbolic and numeric data. Across Webots simulations, human-in-the-loop tests, and a real-world trial with the Shadow robot, the approach achieves high recall (no missed dangerous intentions), an accuracy of about 79.64%, and sub-second reaction times, demonstrating real-time, safety-oriented intervention in social robotics contexts.
Abstract
For robots to interact socially, they must interpret human intentions and anticipate their potential outcomes accurately. This is particularly important for social robots designed for human care, which may encounter situations that are potentially dangerous for people, such as unseen obstacles in their way, and that should be avoided. This paper explores the Artificial Theory of Mind (ATM) approach to inferring and interpreting human intentions. We propose an algorithm that detects risky situations for humans and selects a robot action that removes the danger in real time. We use the simulation-based approach to ATM and adopt the 'like-me' policy to assign intentions and actions to people. Using this strategy, the robot can detect danger and act with a high rate of success in time-constrained situations. The algorithm has been implemented as part of an existing robotics cognitive architecture. Three experiments have been conducted to test the implementation's robustness, precision and real-time response: a simulated scenario, a human-in-the-loop hybrid configuration and a real-world scenario.
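The detect-then-intervene loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the internal physics-based simulator with a straight-line walk in 2D, and all names (`Obstacle`, `simulate_walk`, `guess_dangerous_intentions`, `select_action`, and the two example actions) are hypothetical. Under the like-me assumption, the robot imagines the person walking towards each candidate target as the robot itself would, flags the intentions whose simulated path hits an obstacle, and then re-runs the simulation under each candidate robot action until one removes the danger.

```python
from dataclasses import dataclass
from math import hypot
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Obstacle:
    """Hypothetical circular obstacle on the floor plane."""
    center: Point
    radius: float


def simulate_walk(start: Point, goal: Point, steps: int = 50) -> List[Point]:
    """Stand-in for the internal simulator: a straight-line walk
    from start to goal, sampled at steps + 1 points."""
    return [
        (start[0] + (goal[0] - start[0]) * i / steps,
         start[1] + (goal[1] - start[1]) * i / steps)
        for i in range(steps + 1)
    ]


def is_dangerous(path: List[Point], obstacles: List[Obstacle]) -> bool:
    """A path is dangerous if any sampled point lies inside an obstacle."""
    return any(
        hypot(x - o.center[0], y - o.center[1]) <= o.radius
        for x, y in path
        for o in obstacles
    )


def guess_dangerous_intentions(
    human: Point, targets: Dict[str, Point], obstacles: List[Obstacle]
) -> List[str]:
    """Enact every candidate intention (walk to a target) in simulation
    and return the names of those that end in a collision."""
    return [
        name for name, goal in targets.items()
        if is_dangerous(simulate_walk(human, goal), obstacles)
    ]


# A robot action is modelled as a function that transforms the obstacle set.
Action = Callable[[List[Obstacle]], List[Obstacle]]


def select_action(
    human: Point, goal: Point, obstacles: List[Obstacle],
    actions: Dict[str, Action],
) -> Optional[str]:
    """Return the first action whose simulated effect removes the danger,
    or None if no candidate action helps."""
    for name, effect in actions.items():
        if not is_dangerous(simulate_walk(human, goal), effect(obstacles)):
            return name
    return None


# Toy scene: an obstacle lies on the way to the door but not to the chair.
human = (0.0, 0.0)
targets = {"door": (10.0, 0.0), "chair": (0.0, 10.0)}
obstacles = [Obstacle(center=(5.0, 0.0), radius=0.5)]
actions = {
    "do_nothing": lambda obs: obs,
    "clear_obstacle": lambda obs: [],
}

risky = guess_dangerous_intentions(human, targets, obstacles)
print(risky)  # ['door']
print(select_action(human, targets["door"], obstacles, actions))  # clear_obstacle
```

Treating every candidate target as a possible intention favours recall over precision: a harmless intention may be flagged, but a dangerous one is not skipped as long as its target is among the candidates.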
