Shared Control with Black Box Agents using Oracle Queries
Inbal Avraham, Reuth Mirsky
TL;DR
This work extends shared control by introducing an oracle-driven query channel between a cooperating agent and a learning controller, formalized as an MA-MDP with two state spaces and a configurable operation protocol. It proposes three querying heuristics (Entropy, Utility, and Reinforcement Learning) to decide when to consult the oracle, aiming to reduce learning cost while maintaining or improving policy performance. Empirical evaluation across automata-based tasks and a Lunar Lander domain demonstrates that selective querying can substantially decrease the number of queries and accelerate learning, with trade-offs in accuracy and a dependence on the oracle type. The findings suggest practical benefits for faster, more reliable shared control, while highlighting the importance of oracle quality and the potential of adaptive querying strategies for real-world deployment.
Abstract
Shared control problems involve a robot learning to collaborate with a human. When learning a shared control policy, brief communication between the agents can often significantly reduce running time and improve the system's accuracy. We extend the shared control problem to include the ability to directly query a cooperating agent. We consider two types of responders to a query, called oracles: one that can provide the learner with the best action it should take, even when that action might seem myopically wrong, and one with bounded knowledge limited to its part of the system. Given this additional information channel, this work further presents three heuristics for choosing when to query: reinforcement learning-based, utility-based, and entropy-based. These heuristics aim to reduce a system's overall learning cost. Empirical results on two environments show the benefits of querying to learn a better control policy and the tradeoffs between the proposed heuristics.
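To make the idea of an entropy-based querying heuristic concrete, the following is a minimal sketch and not the paper's definition. It assumes the learner holds action-value estimates for the current state, turns them into a softmax distribution, and queries the oracle only when that distribution's entropy exceeds a threshold. The names `softmax`, `entropy`, and `should_query`, as well as the `threshold` and `temperature` parameters, are hypothetical and chosen for illustration.

```python
import math


def softmax(q_values, temperature=1.0):
    """Convert action-value estimates into a probability distribution."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_query(q_values, threshold=0.5, temperature=1.0):
    """Assumed rule: query the oracle when the learner is uncertain,
    i.e. when the entropy of its action distribution exceeds the threshold."""
    return entropy(softmax(q_values, temperature)) > threshold


# The learner is undecided between four equally valued actions: query.
uncertain = should_query([1.0, 1.0, 1.0, 1.0])
# One action clearly dominates: act without querying.
confident = should_query([10.0, 0.0, 0.0, 0.0])
```

Under this assumed rule, a higher threshold means fewer queries at the risk of acting on poor estimates, which is one way to picture the trade-off between query cost and policy quality that the heuristics aim to balance.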
