Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies

Isaac Sheidlower; Emma Bethel; Douglas Lilly; Reuben M. Aronson; Elaine Schaertl Short

TL;DR

This paper addresses how users can collaborate with a robot trained with reinforcement learning (RL) when they control only part of its action space, a setting the authors formalize as Partitioned Control (PC). It introduces Imaginary Out-of-Distribution Actions (IODA), which uses an out-of-distribution (OOD) detector and state projection to imagine the current state as a familiar one from the user's experience, so that the policy acts in-distribution under PC. Formal problem definitions, simulation validation, and a real-robot user study (n=18) show that IODA improves task performance and aligns robot behavior with user expectations; the study also finds a strong correlation between expectation alignment and task success in PC. The work advances human-robot collaboration by making learned policies more predictable and by enabling more creative, reliable use of robot autonomy in novel tasks.

Abstract

It is crucial that users are empowered to take advantage of the functionality of a robot and use their understanding of that functionality to perform novel and creative tasks. Given a robot trained with Reinforcement Learning (RL), a user may wish to leverage that autonomy, along with their familiarity with how they expect the robot to behave, to collaborate with the robot. One technique is for the user to take control of some of the robot's action space through teleoperation, allowing the RL policy to simultaneously control the rest. We formalize this type of shared control as Partitioned Control (PC). However, this may not be possible using an out-of-the-box RL policy. For example, a user's control may bring the robot into a failure state from the policy's perspective, causing it to act unexpectedly and hindering the success of the user's desired task. In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions (IODA), an initial algorithm that empowers users to leverage their expectations of a robot's behavior to accomplish new tasks. We deploy IODA in a user study with a real robot and find that IODA leads to both better task performance and a higher degree of alignment between robot behavior and user expectation. We also show that in PC, there is a strong and significant correlation between task performance and the robot's ability to meet user expectations, highlighting the need for approaches like IODA. Code is available at https://github.com/AABL-Lab/ioda_roman_2024
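The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation (see their repository for that); it assumes a 2D state, a nearest-neighbor distance test as the OOD detector, and projection onto the closest previously seen state. The names `partitioned_action`, `NearestNeighborImaginer`, `toy_policy`, and the threshold value are hypothetical and chosen only for illustration.

```python
import numpy as np


def partitioned_action(policy_action, user_action, user_mask):
    """Partitioned Control: the user drives the masked action dimensions,
    the policy drives the rest."""
    return np.where(user_mask, user_action, policy_action)


class NearestNeighborImaginer:
    """Hypothetical stand-in for an OOD detector plus state projection.

    A state counts as out-of-distribution when its distance to the closest
    seen state exceeds `threshold`; it is then replaced ("imagined") by
    that closest seen state.
    """

    def __init__(self, seen_states, threshold):
        self.seen = np.asarray(seen_states, dtype=float)
        self.threshold = threshold

    def imagine(self, state):
        state = np.asarray(state, dtype=float)
        dists = np.linalg.norm(self.seen - state, axis=1)
        nearest = int(np.argmin(dists))
        if dists[nearest] > self.threshold:
            return self.seen[nearest].copy()  # project onto a familiar state
        return state  # already in-distribution: leave unchanged


def toy_policy(state, goal=np.array([0.5, 0.5])):
    """Assumed placeholder policy: a bounded step toward a fixed goal."""
    return np.clip(goal - state, -0.1, 0.1)


# States the user has seen the policy act in: a grid over the unit square.
axis = np.linspace(0.0, 1.0, 11)
seen = np.array([[x, y] for x in axis for y in axis])
imaginer = NearestNeighborImaginer(seen, threshold=0.1)

# The user has pushed the robot outside the original workspace along x.
true_state = np.array([1.5, 0.2])
imagined_state = imaginer.imagine(true_state)

# The policy acts on the imagined state; the user keeps control of x.
user_mask = np.array([True, False])
user_action = np.array([0.05, 0.0])
action = partitioned_action(toy_policy(imagined_state), user_action, user_mask)
```

In this sketch the policy never receives the out-of-workspace state directly, so its output on the dimensions it still controls stays within behavior the user has already observed. A learned OOD detector or a different projection rule could be substituted without changing the overall structure.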

Paper Structure (11 sections, 2 equations, 6 figures, 1 algorithm)

Introduction
Related Work
Problem Setting
Imaginary Out-of-Distribution Actions (IODA)
Simulation Example
Methodology: User Study
Conditions
Experimental procedure
Results
Discussion
Conclusion

Figures (6)

Figure 1: A depiction of the "flower watering" task setup used to study Partitioned Control and IODA with novice users.
Figure 2: In a 2D goal navigation task, a simulated user is trying to leverage an optimal policy to reach subgoals by controlling the x-axis of the robot whilst the policy controls the y-axis. These subgoals are outside the robot's original workspace (highlighted in gray). Our algorithm IODA allows the user to seamlessly reach the subgoals.
Figure 3: User-reported expectation alignment and degree of surprise for each condition.
Figure 4: Top: IODA performed the best in the watering task, with the least error. Bottom: Mean and standard deviation of time-on-task for each condition.
Figure 5: Trajectories of the cup for all 18 participants. The redder the line, the longer the cup was stopped at that point. The reddest point indicates that the cup was stopped for at least 7.5 seconds.
...and 1 more figures