Learning to Trust: Bayesian Adaptation to Varying Suggester Reliability in Sequential Decision Making
Dylan M. Asmar, Mykel J. Kochenderfer
TL;DR
The paper addresses robust integration of external guidance whose reliability varies over time in sequential decision-making under uncertainty. It develops a unified POMDP/MOMDP framework that treats suggester reliability as a hidden state represented by a finite type set $\mathcal{T}$ and inferred via Bayesian updates over $(s,\hat{\lambda})$, enabling adaptive trust calibration. It further introduces an explicit information-gathering action $a_{\text{ask}}$, bootstraps the suggestion model from $Q$-values through $p(\sigma \mid s,\hat{\lambda}) \propto \exp(\hat{\lambda} Q(s,\sigma))$, and constrains queries via a cost and budget. Experiments in Tag and RockSample domains show that agents with multi-type beliefs adapt to dynamic reliabilities and that proactive asking can improve decision quality under cost constraints, including when using heuristic suggesters. Overall, the work advances adaptive human–agent collaboration by enabling trust-aware planning and strategic information acquisition in uncertain environments.
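To make the type inference concrete, the following is a minimal sketch of a Bayesian update over a finite set of suggester types using the softmax suggestion model $p(\sigma \mid s,\hat{\lambda}) \propto \exp(\hat{\lambda} Q(s,\sigma))$. It is an illustration under simplifying assumptions, not the authors' implementation: the state is taken as known (the paper maintains a joint belief over $(s,\hat{\lambda})$), the type is held fixed between updates, and the function names, action labels, $Q$-values, and candidate $\hat{\lambda}$ values are all hypothetical.

```python
import math

def suggestion_likelihood(q_values, suggestion, lam):
    """Softmax likelihood p(sigma | s, lam), proportional to exp(lam * Q(s, sigma)).

    q_values maps each action to Q(s, action) for one fixed state s (assumed known here).
    """
    # Subtract the maximum exponent for numerical stability.
    m = max(lam * q for q in q_values.values())
    weights = {a: math.exp(lam * q - m) for a, q in q_values.items()}
    return weights[suggestion] / sum(weights.values())

def update_type_belief(belief, q_values, suggestion):
    """One Bayes step over suggester types: posterior is prior times likelihood, normalized."""
    unnormalized = {
        lam: prior * suggestion_likelihood(q_values, suggestion, lam)
        for lam, prior in belief.items()
    }
    z = sum(unnormalized.values())
    return {lam: w / z for lam, w in unnormalized.items()}

# Hypothetical example: three actions and three candidate types (lam = 0 is a random suggester).
q_values = {"left": 1.0, "right": 0.0, "stay": 0.5}
prior = {0.0: 1 / 3, 1.0: 1 / 3, 5.0: 1 / 3}

after_good = update_type_belief(prior, q_values, "left")   # suggestion matches the best action
after_bad = update_type_belief(prior, q_values, "right")   # suggestion is the worst action
```

In this sketch, a suggestion that agrees with the highest-valued action shifts belief toward larger $\hat{\lambda}$ (a more reliable suggester), while a suggestion of the lowest-valued action shifts belief toward $\hat{\lambda} = 0$, which is the sense in which trust is calibrated from observed suggestions.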
Abstract
Autonomous agents operating in sequential decision-making tasks under uncertainty can benefit from external action suggestions, which provide valuable guidance but inherently vary in reliability. Existing methods for incorporating such advice typically assume static and known suggester quality parameters, limiting practical deployment. We introduce a framework that dynamically learns and adapts to varying suggester reliability in partially observable environments. First, we integrate suggester quality directly into the agent's belief representation, enabling agents to infer and adjust their reliance on suggestions through Bayesian inference over suggester types. Second, we introduce an explicit "ask" action allowing agents to strategically request suggestions at critical moments, balancing informational gains against acquisition costs. Experimental evaluation demonstrates robust performance across varying suggester qualities, adaptation to changing reliability, and strategic management of suggestion requests. This work provides a foundation for adaptive human-agent collaboration by addressing suggestion uncertainty in uncertain environments.
