Learning to Collaborate: A Capability Vectors-based Architecture for Adaptive Human-AI Decision Making
Renlong Jie
TL;DR
This work tackles adaptive human-AI decision making by introducing learnable capability vectors that encode the decision-making proficiencies of humans and AI models in a single shared representation. A transformer-based weight generator computes instance-specific aggregation weights, so that outputs from heterogeneous agents are fused end to end into a final score vector $s_f = w S$, where $w$ holds the aggregation weights and $S$ stacks the agents' outputs. The authors also present a learning-free global baseline for case studies. Experiments on image classification and hate speech detection, covering real human labels (CIFAR-10H, GalaxyZoo) and synthetic expert scenarios, show that the approach outperforms existing methods. The authors further report that it is robust and scalable, with practical potential for crowdsourcing, expert selection, and large-scale multi-task settings. Overall, capability vectors offer a unified, extensible framework for multi-agent collaboration that improves decision accuracy and applies to real-world decision-making pipelines.
Abstract
Effective human-AI collaboration hinges on the ability to dynamically integrate the complementary strengths of human experts and AI models across diverse decision contexts. Context-aware weighted combination of human and AI outputs is a promising technique, in which the combination weights are optimized according to the capabilities of the decision agents on a given task. However, existing approaches treat humans and AI as isolated entities and lack a unified representation for modeling the heterogeneous capabilities of multiple decision agents. To address this gap, we propose a novel capability-aware architecture that models both human and AI decision-makers using learnable capability vectors. These vectors encode task-relevant competencies in a shared latent space and are used by a transformer-based weight generation module to produce instance-specific aggregation weights. Our framework supports flexible integration of confidence scores or one-hot decisions from a variable number of agents. We further introduce a learning-free baseline using optimized global weights for human-AI collaboration. Extensive experiments on image classification and hate speech detection tasks demonstrate that our approach outperforms state-of-the-art methods under various collaboration settings with both simulated and real human labels. The results highlight the robustness, scalability, and superior accuracy of our method, underscoring its potential for real-world applications.
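To make the aggregation step concrete, the following is a minimal sketch of the weighted fusion $s_f = w S$ for a single instance. It is not the authors' implementation: the function names `softmax` and `fuse`, the example scores, and the use of a softmax to normalize the weights are all assumptions made for illustration. In the proposed architecture the weights would come from the transformer-based weight generator conditioned on the capability vectors; here a hand-picked logit vector stands in for that module.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def fuse(scores, logits):
    """Fuse agent outputs into one score vector.

    scores: (n_agents, n_classes) matrix S, one row per agent.
    logits: (n_agents,) unnormalized weights (hypothetical stand-in
            for the output of a learned weight generator).
    """
    w = softmax(logits)   # assumption: weights are normalized to sum to 1
    s_f = w @ scores      # final score vector s_f = w S
    return w, s_f

# Hypothetical instance: two AI models report confidence scores,
# one human reports a one-hot decision for class 1.
S = np.array([
    [0.6, 0.3, 0.1],   # AI model A (confidence scores)
    [0.2, 0.5, 0.3],   # AI model B (confidence scores)
    [0.0, 1.0, 0.0],   # human annotator (one-hot decision)
])
logits = np.array([0.0, 0.0, 1.0])   # illustrative values only

w, s_f = fuse(S, logits)
prediction = int(np.argmax(s_f))     # class with the highest fused score
```

Because each row of `S` sums to one and the weights are normalized, the fused vector `s_f` is itself a distribution over classes. Fixing `logits` for all instances would correspond to a global weighting in the spirit of the learning-free baseline, whereas recomputing them per instance corresponds to the instance-specific weights described above.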
