Coverage-Constrained Human-AI Cooperation with Multiple Experts
Zheng Zhang, Cuong Nguyen, Kevin Wells, Thanh-Toan Do, David Rosewarne, Gustavo Carneiro
TL;DR
The paper addresses high-stakes decision making with CL2DC, a coverage-constrained framework that unifies learning-to-defer (L2D) and learning-to-complement (L2C) with specific experts in settings where multiple experts supply noisy labels. A gating mechanism and a complementary module are trained with a penalty-based constraint that enforces a target AI-only coverage, using pseudo-clean labels produced by CrowdLab because no clean labels are available. Across synthetic and real-world datasets, CL2DC consistently outperforms state-of-the-art human-AI cooperative classification (HAI-CC) methods in accuracy at matched coverage, which supports its use for workload management and expert-AI cooperation. The work advances MEHAI-CC by enabling control over AI reliance, expert-specific collaboration, and evaluation under a specified coverage budget; future directions include sequential expert strategies and handling imbalanced datasets.
Abstract
Human-AI cooperative classification (HAI-CC) approaches aim to develop hybrid intelligent systems that enhance decision-making in various high-stakes real-world scenarios by leveraging both human expertise and AI capabilities. Current HAI-CC methods primarily focus on learning-to-defer (L2D), where decisions are deferred to human experts, and learning-to-complement (L2C), where AI and human experts make predictions cooperatively. However, a notable research gap remains in effectively exploring both L2D and L2C under diverse expert knowledge to improve decision-making, particularly when constrained by the cooperation cost required to achieve a target probability for AI-only selection (i.e., coverage). In this paper, we address this research gap by proposing the Coverage-constrained Learning to Defer and Complement with Specific Experts (CL2DC) method. CL2DC makes final decisions through either AI prediction alone or by deferring to or complementing a specific expert, depending on the input data. Furthermore, we propose a coverage-constrained optimisation to control the cooperation cost, ensuring it approximates a target probability for AI-only selection. This approach enables an effective assessment of system performance within a specified budget. CL2DC is also designed to address scenarios where training sets contain multiple noisy-label annotations without any clean-label references. Comprehensive evaluations on both synthetic and real-world datasets demonstrate that CL2DC achieves superior performance compared to state-of-the-art HAI-CC methods.
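
The routing and coverage ideas described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The layout of the gate outputs (index 0 for AI alone, then one "defer" slot and one "complement" slot per expert), the squared-gap form of the penalty, and the names `route`, `coverage_penalty`, and `total_loss` are all assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def route(gate_logits, num_experts):
    """Map each sample's gate output to a decision mode.

    Assumed (hypothetical) layout of the 1 + 2 * num_experts gate outputs:
      0                          -> AI alone
      1 .. num_experts           -> defer to expert (index - 1)
      num_experts + 1 .. 2 * M   -> complement with expert (index - M - 1)
    """
    decisions = []
    for c in gate_logits.argmax(axis=1):
        c = int(c)
        if c == 0:
            decisions.append(("ai", None))
        elif c <= num_experts:
            decisions.append(("defer", c - 1))
        else:
            decisions.append(("complement", c - num_experts - 1))
    return decisions


def coverage_penalty(gate_logits, target_coverage):
    """Squared gap between soft AI-only coverage and the target.

    The squared form is an assumption; the source only states that a
    penalty-based constraint keeps coverage close to a target value.
    """
    ai_only_prob = softmax(gate_logits)[:, 0]
    return float((ai_only_prob.mean() - target_coverage) ** 2)


def total_loss(classification_loss, gate_logits, target_coverage, weight=1.0):
    """Classification loss plus the weighted coverage penalty (assumed form)."""
    return classification_loss + weight * coverage_penalty(gate_logits, target_coverage)


# Toy example: 4 samples, 2 experts, so 5 gate outputs per sample.
example_logits = np.array([
    [5.0, 0.0, 0.0, 0.0, 0.0],  # AI alone
    [0.0, 5.0, 0.0, 0.0, 0.0],  # defer to expert 0
    [0.0, 0.0, 0.0, 0.0, 5.0],  # complement with expert 1
    [5.0, 0.0, 0.0, 0.0, 0.0],  # AI alone
])
example_decisions = route(example_logits, num_experts=2)
example_penalty = coverage_penalty(example_logits, target_coverage=0.5)
```

In this toy batch, half of the samples are routed to the AI alone, so the penalty is small for a target coverage of 0.5 and grows as the target moves away from it. In training, the penalty weight would trade off accuracy against how tightly the coverage target is met.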
