A Unified Framework for Robots that Influence Humans over Long-Term Interaction

Shahabedin Sagheb, Sagar Parekh, Ravi Pandya, Ye-Ji Mun, Katherine Driggs-Campbell, Andrea Bajcsy, Dylan P. Losey

TL;DR

The paper addresses the challenge that robots must influence humans not only in the short term but over repeated interactions, as humans adapt their behavior. It proposes a unifying control-theoretic framework that models the human partner as history-aware with short- and long-term latent dynamics, formulating the problem as a mixed-observability Markov decision process. The authors show that existing game-theoretic and latent-representation approaches are special cases of this framework, and demonstrate through simulations and user studies that the unified approach yields more reliable long-term influence and task performance. They also provide tractable approximations (e.g., One-Step) and validate the method in aerial-drone and driving scenarios, highlighting practical benefits for safe and efficient human-robot coordination. The work advances embodied AI by offering principled, scalable means to regulate robot influence in dynamic, long-term human-robot interactions.

Abstract

Robot actions influence the decisions of nearby humans. Here influence refers to intentional change: robots influence humans when they shift the human's behavior in a way that helps the robot complete its task. Imagine an autonomous car trying to merge; by proactively nudging into the human's lane, the robot causes human drivers to yield and provide space. Influence is often necessary for seamless interaction. However, if influence is left unregulated and uncontrolled, robots will negatively impact the humans around them. Prior works have begun to address this problem by creating a variety of control algorithms that seek to influence humans. Although these methods are effective in the short term, they fail to maintain influence over time as the human adapts to the robot's behaviors. In this paper we therefore present an optimization framework that enables robots to purposely regulate their influence over humans across both short-term and long-term interactions. Here the robot maintains its influence by reasoning over a dynamic human model which captures how the robot's current choices will impact the human's future behavior. Our resulting framework serves to unify current approaches: we demonstrate that state-of-the-art methods are simplifications of our underlying formalism. Our framework also provides a principled way to generate influential policies: in the best case the robot exactly solves our framework to find optimal, influential behavior. But when solving this optimization problem becomes impractical, designers can introduce their own simplifications to reach tractable approximations. We experimentally compare our unified framework to state-of-the-art baselines and ablations, and demonstrate across simulations and user studies that this framework is able to successfully influence humans over repeated interactions. See videos of our experiments here: https://youtu.be/nPekTUfUEbo
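The contrast the abstract draws between a static and a dynamic human model can be illustrated with a small toy problem. The sketch below is not the paper's algorithm: the reward, the human dynamics, the constants, and every function name are assumptions made up for illustration. The human's willingness to yield `z` is treated as fully observed and its dynamics as known, whereas the paper treats the human's state as latent. A robot that assumes the human never changes keeps nudging into the lane, while a robot that plans over the assumed long-term dynamics (here by finite-horizon dynamic programming) spaces out its nudges.

```python
from functools import lru_cache

# All quantities below are hypothetical, chosen only to illustrate the idea.
K = 10        # number of repeated interactions
Z_MAX = 10    # human's willingness to yield, in tenths: 0 (never) to 10 (always)
WAIT, NUDGE = 0, 1


def reward(z, a):
    """Robot reward for one interaction: nudging pays off only if the human tends to yield."""
    return 2 * z - Z_MAX if a == NUDGE else 0


def step(z, a):
    """Assumed long-term human dynamics: nudging erodes willingness to yield, waiting restores it."""
    return max(0, z - 2) if a == NUDGE else min(Z_MAX, z + 1)


def static_policy(z0):
    """Baseline: believes the human stays at z0 forever and repeats the best action for z0."""
    a = NUDGE if reward(z0, NUDGE) > reward(z0, WAIT) else WAIT
    return lambda z, k: a


@lru_cache(maxsize=None)
def value(z, k):
    """Best total reward obtainable from interaction k onward when the human is at z."""
    if k == K:
        return 0
    return max(reward(z, a) + value(step(z, a), k + 1) for a in (WAIT, NUDGE))


def dynamic_policy(z, k):
    """Planner: picks the action that maximizes reward now plus value of the human's next state."""
    return max((WAIT, NUDGE), key=lambda a: reward(z, a) + value(step(z, a), k + 1))


def rollout(policy, z0):
    """Simulate K interactions; return total reward and the (z, action) trace."""
    z, total, trace = z0, 0, []
    for k in range(K):
        a = policy(z, k)
        total += reward(z, a)
        trace.append((z, a))
        z = step(z, a)
    return total, trace


if __name__ == "__main__":
    for name, policy in [("static", static_policy(8)), ("dynamic", dynamic_policy)]:
        total, trace = rollout(policy, 8)
        print(name, total, trace)
```

In this toy setting the static robot drives the human's willingness to yield to zero and accumulates negative reward, which loosely mirrors the loss of long-term influence described above. The planner is only as good as its assumed human dynamics, which here are invented rather than learned.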

Paper Structure

This paper contains 18 sections, 15 equations, 9 figures.

Figures (9)

  • Figure 1: (Left) Human interacting with a drone, a robot arm, and an autonomous car. In each scenario, the robot uses state-of-the-art algorithms to influence the human and change their actions. (Right) Results from online and in-person user studies. The state-of-the-art approaches work in the short-term, i.e., human behavior is influenced by the robot in the first few interactions, but not in the long-term, i.e., over time the participants adapt to avoid or ignore the influential robot.
  • Figure 2: Exploring how the existing approaches from Section \ref{sec:E1} influence humans over short- and long-term interactions. Participants repeatedly interact with an autonomous car that uses Equation (\ref{eq:E2}) and the static human model from Equation (\ref{eq:E3}) to influence their behavior. The autonomous car selects actions $\mathbf{a}_{\mathcal{R}}$ by treating each interaction as an independent Stackelberg game; this is consistent with prior works \cite{sadigh2016planning, fisac2019hierarchical, tian2022safety, schwarting2019social}. The robot is rewarded for influencing the human to slow down, yield, and reduce lane progress. For both online (Top) and in-person (Bottom) participants, the robot's influence decreases over time (i.e., human yields less or makes more lane progress). In the last column (Right) we display the average behavior across all three environments. Shaded regions show standard error.
  • Figure 3: Our unified framework for influence. When interacting with humans, the robot models the human's short-term as well as long-term dynamics. During interaction $1$, the robot merges into the human's lane to influence them. Upon observing the human slow down, the robot models their short-term dynamics and infers that the human is a defensive driver $z^1$. However, over $k$ repeated interactions the human might change their response strategy and start driving aggressively according to their set of rules $\phi^k$. Modeling the human's long-term dynamics enables the robot to anticipate the change in the human's response strategy. This enables the robot to optimize for a policy which can influence the human over long-term interaction. During interaction $k + n$, the robot does not merge into the human's lane, anticipating how the human will change lanes to try and avoid the autonomous car.
  • Figure 4: Experiment from Section \ref{sec:sim1}. (Left) Simulated humans drive alongside robots using Unified and Stackelberg methods. The robot car takes actions to slow down the human without crashing. We show examples of the trajectories that the cars followed initially (i.e., first interactions) vs later on (i.e., final interactions). (Right) We plot average lane progress and total number of collisions per interaction. A higher lane progress indicates that the robot is unable to successfully influence the human to slow down. The shaded regions and error bars indicate standard error. We compare robot reward averaged across all interactions for each method. Fewer collisions and a higher reward are indicative of better robot performance. Asterisks $*$ denote statistical significance ($p < .05$).
  • Figure 5: Example interactions from Section \ref{sec:sim2} between the robot and simulated human in the Circle (Top), Driving (Middle), and Robot (Bottom) environments. The first row for each environment shows interactions with a robot using the Latent method, while the second row shows interactions with a robot using the Unified method. In Circle, Unified is able to successfully influence the human by trapping them at the top of the circle while Latent is not. In Driving, Latent leads to a higher number of collisions since it is not able to influence the human to stay out of the robot's lane. Finally, in the Robot environment, while Latent cycles through each of the three goals, Unified influences the human to frequently pick the robot's preferred goal (shown with a star).
  • ...and 4 more figures