Confounding-Robust Policy Improvement with Human-AI Teams
Ruijiang Gao, Mingzhang Yin
TL;DR
The paper tackles unobserved confounding in offline policy learning for human-AI teams by introducing ConfHAI, a confounding-robust deferral framework that combines a marginal sensitivity model with a self-normalized Hájek estimator to optimize routing and treatment under worst-case bias. It supports personalization across multiple human decision-makers (ConfHAIPerson) and provides theoretical improvement guarantees, along with a scalable optimization procedure that uses a line search for the inner maximization and gradient-based updates for the outer problem. The approach is validated on synthetic experiments, on real-world datasets (HELOC lending, IST stroke), and on a dataset of real human responses (FOCUS), demonstrating robust policy improvements and effective deferral between humans and AI under varying confounding strengths $\Gamma$. The results highlight the practical value of explicitly modeling unobserved information differences in human-AI collaboration to achieve more reliable and improved outcomes.
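To make the worst-case step concrete, the following is a minimal sketch of how a self-normalized (Hájek) estimate can be maximized over the weight intervals implied by a marginal sensitivity model with parameter $\Gamma$. It is not the authors' implementation: the function names `msm_weight_bounds` and `worst_case_hajek` are hypothetical, the inputs (per-sample losses and estimated nominal propensities) are assumed to be given, and the routing and treatment policy terms of ConfHAI are left out. The sketch assumes the standard form of the model, in which the true inverse propensity lies between $1 + \Gamma^{-1}(1/\hat{e} - 1)$ and $1 + \Gamma(1/\hat{e} - 1)$, and it relies on the fact that a ratio of this kind is maximized by a threshold rule: samples with large loss take the upper weight and samples with small loss take the lower weight, so scanning the thresholds finds the maximum.

```python
def msm_weight_bounds(e_hat, gamma):
    """Interval for the true inverse propensity under an assumed
    marginal sensitivity model with parameter gamma >= 1."""
    lower = [1.0 + (1.0 / e - 1.0) / gamma for e in e_hat]
    upper = [1.0 + (1.0 / e - 1.0) * gamma for e in e_hat]
    return lower, upper


def worst_case_hajek(loss, e_hat, gamma):
    """Largest value of sum(w * loss) / sum(w) over weights w with
    lower[i] <= w[i] <= upper[i]. Inputs must be non-empty."""
    lower, upper = msm_weight_bounds(e_hat, gamma)
    rows = sorted(zip(loss, lower, upper))  # ascending loss

    # Threshold k = 0: every weight sits at its upper bound.
    num = sum(u * y for y, _, u in rows)
    den = sum(u for _, _, u in rows)
    best = num / den

    # Move the smallest losses to their lower bound one at a time,
    # updating the ratio incrementally and keeping the best threshold.
    for y, l, u in rows:
        num += (l - u) * y
        den += l - u
        best = max(best, num / den)
    return best


if __name__ == "__main__":
    # Illustrative, made-up numbers only.
    loss = [0.0, 1.0, 2.0, 3.0]
    e_hat = [0.5, 0.25, 0.8, 0.4]
    for gamma in (1.0, 1.5, 2.0):
        print(gamma, worst_case_hajek(loss, e_hat, gamma))
```

With $\Gamma = 1$ the intervals collapse to the nominal inverse propensities and the function returns the ordinary Hájek estimate; larger $\Gamma$ widens the intervals, so the worst-case value can only grow. In a full pipeline this inner value would be recomputed at each gradient step of the outer policy update.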
Abstract
Human-AI collaboration has the potential to transform various domains by leveraging the complementary strengths of human experts and Artificial Intelligence (AI) systems. However, unobserved confounding can undermine the effectiveness of this collaboration, leading to biased and unreliable outcomes. In this paper, we propose a novel solution to address unobserved confounding in human-AI collaboration by employing sensitivity analysis from causal inference. Our approach combines domain expertise with AI-driven statistical modeling to account for potentially hidden confounders. We present a deferral collaboration framework for incorporating the sensitivity model into offline policy learning, enabling the system to control for the influence of unobserved confounding factors. In addition, we propose a personalized deferral collaboration system to leverage the diverse expertise of different human decision-makers. By adjusting for potential biases, our proposed solution enhances the robustness and reliability of collaborative outcomes. The empirical and theoretical analyses demonstrate the efficacy of our approach in mitigating unobserved confounding and improving the overall performance of human-AI collaborations.
