Task load dependent decision referrals for joint binary classification in human-automation teams
Kesav Kaza, Jerome Le Ny, Aditya Mahajan
TL;DR
This work addresses optimal task referrals in human-automation teams performing binary classification. It models the human operator's performance as a function of task load and derives a referral policy that uses the data observed by the automation. The authors introduce a referral index $R(p,w)$ and prove that, for a given workload $w$, referring the $w$ tasks with the largest index values minimizes the total expected cost; they then search over feasible loads to obtain the overall policy. The framework is validated through simulations with Gaussian observation models and an experimental study using a radar-like task, showing that the proposed optimal allocation (OA) policy outperforms a blind allocation (BA) baseline and performs on par with a static allocation (SA) when load variability is a concern. The results support the practical viability of load-aware task referrals: because human performance functions can be estimated from calibration experiments, the policy can be deployed in joint human-automation decision systems.
Abstract
We consider the problem of optimal decision referrals in human-automation teams performing binary classification tasks. The automation, which includes a pre-trained classifier, observes data for a batch of independent tasks, analyzes them, and may refer a subset of tasks to a human operator for fresh and final analysis. Our key modeling assumption is that human performance degrades with task load. We model the problem of choosing which tasks to refer as a stochastic optimization problem and show that, for a given task load, it is optimal to myopically refer the tasks that yield the largest reduction in expected cost, conditional on the observed data. This provides a ranking scheme and a policy to determine the optimal set of tasks for referral. We evaluate this policy against a baseline through an experimental study with human participants. Using a radar screen simulator, participants made binary target classification decisions under a time constraint. They were guided by a decision rule provided to them, but were still prone to errors under time pressure. An initial experiment estimated the parameters of the human performance model, while a second experiment compared two referral policies. Results show statistically significant gains for the proposed optimal referral policy over a blind policy that determines referrals using the automation and human-performance models but not the observed data.
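The ranking scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the posterior probabilities `p`, the unit error costs, the linear `human_error_rate` model, and the function names. For each candidate load $w$, it ranks tasks by the reduction in expected cost obtained by referring them, refers the top $w$, and keeps the load with the lowest total expected cost.

```python
import numpy as np


def automation_cost(p, c_fa=1.0, c_md=1.0):
    """Expected cost when the automation decides alone (Bayes-optimal decision).

    p is the posterior probability of the positive class given the observed data.
    c_fa and c_md are hypothetical false-alarm and missed-detection costs.
    """
    p = np.asarray(p, dtype=float)
    return np.minimum(c_fa * (1.0 - p), c_md * p)


def human_error_rate(w, base=0.05, slope=0.03):
    """Hypothetical human error probability that grows with task load w."""
    return min(0.5, base + slope * w)


def referral_index(p, w, c_fa=1.0, c_md=1.0):
    """Reduction in expected cost from referring a task at load w (assumed form)."""
    p = np.asarray(p, dtype=float)
    err = human_error_rate(w)
    human_cost = err * (c_fa * (1.0 - p) + c_md * p)
    return automation_cost(p, c_fa, c_md) - human_cost


def best_referral_set(p, max_load=None):
    """Search over loads; for each load refer the tasks with the largest index."""
    p = np.asarray(p, dtype=float)
    n = len(p)
    max_load = n if max_load is None else min(max_load, n)
    base = automation_cost(p).sum()  # cost when nothing is referred
    best_cost, best_set = base, np.array([], dtype=int)
    for w in range(1, max_load + 1):
        idx = referral_index(p, w)
        top = np.argsort(-idx)[:w]  # the w tasks with the largest index
        cost = base - idx[top].sum()
        if cost < best_cost:
            best_cost, best_set = cost, np.sort(top)
    return best_set, best_cost


if __name__ == "__main__":
    posteriors = [0.5, 0.95, 0.45, 0.02, 0.6]
    referred, cost = best_referral_set(posteriors)
    print("referred tasks:", referred.tolist(), "expected cost:", round(cost, 3))
```

In this toy setting, tasks whose posteriors are close to 0.5 are the ones the automation is least sure about, so they have the largest index, while the load-dependent error rate limits how many of them are worth referring.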
