Quantifying Divergence for Human-AI Collaboration and Cognitive Trust
Müge Kural, Ali Gebeşçe, Tilek Chubakov, Gözde Gül Şahin
TL;DR
This study investigates how human-AI collaboration and cognitive trust relate to decision-making similarity between humans and AI models. By measuring divergence-based distances between human soft-label distributions and model outputs on an SNLI entailment task, the authors show that users tend to collaborate with the most similar model (as captured by Jensen-Shannon Distance, JSD), but cognitive trust does not consistently track this similarity. The work introduces a four-stage user study and analyzes forward and inverse KL divergences as well as JSD to characterize different alignment regimes. It finds that low inverse KL drives collaboration, while trust may require avoiding overconfidence (captured by alpha KL and JSD). These findings provide a framework for pre-deployment evaluation of AI partners and can guide future optimization of models for collaboration and trust.
Abstract
Predicting the likelihood of collaboration with AI systems and measuring cognitive trust in them are more important than ever. Previous research has mostly focused on model features (e.g., accuracy, confidence) and ignored the human factor. To address this gap, we propose several decision-making similarity measures based on divergence metrics (e.g., KL, JSD) calculated over labels acquired from humans and a wide range of models. We conduct a user study on a textual entailment task in which users are shown soft labels from various models and asked to pick the option closest to their own judgment. Users are then shown the similarities and differences between themselves and their most similar model, and are surveyed on their likelihood of collaboration with, and cognitive trust in, the selected system. Finally, we qualitatively and quantitatively analyze the relation between the proposed decision-making similarity measures and the survey results. We find that people tend to collaborate with their most similar models, as measured via JSD, yet this collaboration does not necessarily imply a similar level of cognitive trust. We release all resources related to the user study (e.g., design, outputs), models, and metrics at our repo.
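As a rough illustration of the divergence-based similarity measures mentioned above, the following sketch computes both KL directions and the Jensen-Shannon distance between one human soft label and two model outputs over the three entailment classes. It is not the authors' released code. The distributions, the model names `model_a` and `model_b`, and the function names are hypothetical. Which direction counts as "forward" versus "inverse" KL is also an assumption here: the sketch treats KL(human || model) as forward and KL(model || human) as inverse.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same labels.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps to avoid
    division by zero (an illustrative smoothing choice, not from the paper).
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence.

    The divergence is converted to bits (divided by ln 2), so the
    distance lies in [0, 1].
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    js_div = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(js_div, 0.0) / math.log(2))

# Hypothetical soft labels over (entailment, neutral, contradiction).
human = [0.60, 0.30, 0.10]
models = {
    "model_a": [0.90, 0.08, 0.02],  # more confident than the human
    "model_b": [0.55, 0.35, 0.10],  # close to the human distribution
}

# Per-model scores: both KL directions (assumed naming) and the JS distance.
scores = {
    name: {
        "forward_kl": kl_divergence(human, dist),  # assumed: KL(human || model)
        "inverse_kl": kl_divergence(dist, human),  # assumed: KL(model || human)
        "js_distance": js_distance(human, dist),
    }
    for name, dist in models.items()
}

# The "most similar" model is the one with the smallest JS distance to the human.
most_similar = min(scores, key=lambda name: scores[name]["js_distance"])

for name, s in scores.items():
    print(name, {k: round(v, 4) for k, v in s.items()})
print("most similar:", most_similar)
```

The two KL directions generally differ for the same pair of distributions, which is why they can describe different alignment regimes, whereas the JS distance is symmetric and bounded.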
