Intelligent Support for Human Oversight: Integrating Reinforcement Learning with Gaze Simulation to Personalize Highlighting
Thorsten Klößner, João Belo, Zekun Wu, Jörg Hoffmann, Anna Maria Feit
TL;DR
The paper tackles how to support human oversight under time pressure by learning adaptive highlighting policies through reinforcement learning guided by models of user gaze. It formulates the monitoring interface and user attention as an MDP and trains a PPO agent in a simulated environment that balances alert benefits against cognitive costs via a gaze-driven state transition. By integrating a temporal saliency model (based on a fine-tuned TASED-Net) to predict gaze, the approach enables offline policy learning for a multi-drone oversight scenario ($N=4$, $|Attr|=8$) without real-world deployment. Preliminary qualitative results indicate that RL-based highlighting can outperform static rule-based highlighting, but substantial challenges remain in gaze-model fidelity, reward design, and empirical validation with real users.
Abstract
Interfaces for human oversight must effectively support users' situation awareness under time-critical conditions. We explore reinforcement learning (RL)-based UI adaptation to personalize alerting strategies that balance the benefits of highlighting critical events against the cognitive costs of interruptions. To enable learning without real-world deployment, we integrate models of users' gaze behavior to simulate attentional dynamics during monitoring. Using a delivery-drone oversight scenario, we present initial results suggesting that RL-based highlighting can outperform static, rule-based approaches and discuss challenges of intelligent oversight support.
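To make the formulation summarized above more concrete, the following is a minimal toy sketch of a highlighting MDP with a simulated gaze step. It is not the authors' environment. The class `ToyOversightEnv`, all reward and probability values, and the random gaze stand-in (used here in place of a learned saliency model such as the fine-tuned TASED-Net) are illustrative assumptions. Only the number of drones ($N=4$) comes from the text, and per-drone attributes are collapsed into a single "critical" flag.

```python
import random
from dataclasses import dataclass, field

N_DRONES = 4              # number of drones, taken from the scenario in the text
NO_HIGHLIGHT = N_DRONES   # extra action index meaning "highlight nothing"


@dataclass
class ToyOversightEnv:
    """Hypothetical toy MDP; every default value below is an assumption."""
    highlight_cost: float = 0.1   # assumed cognitive cost of showing a highlight
    detect_reward: float = 1.0    # assumed benefit when gaze lands on a critical drone
    highlight_pull: float = 0.8   # assumed probability that gaze follows a highlight
    event_prob: float = 0.2       # assumed per-step chance a drone becomes critical
    horizon: int = 50             # assumed episode length
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def reset(self):
        self.t = 0
        self.critical = [False] * N_DRONES
        self.gaze = 0
        return self._obs()

    def _obs(self):
        return (tuple(self.critical), self.gaze)

    def _simulate_gaze(self, action):
        # Stand-in for a learned gaze model: gaze follows the highlight with
        # probability highlight_pull, otherwise it lands on a random drone.
        if action != NO_HIGHLIGHT and self.rng.random() < self.highlight_pull:
            return action
        return self.rng.randrange(N_DRONES)

    def step(self, action):
        if not 0 <= action <= NO_HIGHLIGHT:
            raise ValueError(f"invalid action: {action}")
        # Gaze-driven transition: the action influences where the user looks.
        self.gaze = self._simulate_gaze(action)
        reward = 0.0
        if action != NO_HIGHLIGHT:
            reward -= self.highlight_cost
        if self.critical[self.gaze]:
            reward += self.detect_reward
            self.critical[self.gaze] = False  # the attended event is resolved
        # New critical events may appear on drones that are currently fine.
        for i in range(N_DRONES):
            if not self.critical[i] and self.rng.random() < self.event_prob:
                self.critical[i] = True
        self.t += 1
        return self._obs(), reward, self.t >= self.horizon


def rule_based(obs):
    """Static baseline: highlight the first critical drone, if any."""
    critical, _gaze = obs
    for i, is_critical in enumerate(critical):
        if is_critical:
            return i
    return NO_HIGHLIGHT


def never_highlight(obs):
    """Baseline that never interrupts the user."""
    return NO_HIGHLIGHT


def rollout(env, policy):
    """Run one episode and return the undiscounted sum of rewards."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total
```

In this sketch, a policy is any function from an observation to an action index, so a learned policy (for example one trained with PPO) could be passed to `rollout` in place of `rule_based` and compared on the same simulated episodes. The toy returns say nothing about the results reported in the paper.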
