Attribution Gradients: Incrementally Unfolding Citations for Critical Examination of Attributed AI Answers
Hita Kambhamettu, Alyssa Hwang, Philippe Laban, Andrew Head
TL;DR
This work addresses the challenge of verifying attributed AI answers by introducing attribution gradients, a design that incrementally unfolds context from generated claims to contextualized evidence across sources. The authors implement a research prototype built on OpenSciLM and a citation graph pipeline, enabling decomposition of sentences into atomic claims, extraction of supporting and contradicting excerpts (including second-degree references), and in-situ access to source passages with contextual explanations. A within-subject usability study (n=20) shows attribution gradients increase engagement with sources and yield higher-quality revisions, with more facts added and more corrections made, albeit with some misclassifications that do not propagate to final outputs. The results suggest attribution gradients can improve sensemaking and critical examination of AI-generated science answers, offering a practical path toward more transparent, verifiable AI-assisted inquiry in scholarly contexts.
Abstract
AI question answering systems increasingly generate responses with attributions to sources. However, verifying the actual content of these attributions is impractical in most cases. In this paper, we present attribution gradients as a solution. Attribution gradients provide integrated, incremental affordances for diving into an attributed passage. A user can decompose a sentence of an answer into its claims. For each claim, the user can view supporting and contradictory excerpts mined from sources. Those excerpts serve as clickable conduits into the source (in our application, scientific papers). When the evidence itself contains further citations, the UI unpacks the evidence into excerpts from the cited sources. Together, these features keep the answer, claim, excerpt, and context connected and visible at the same time. In a usability study in which participants revised an attributed AI answer with both attribution gradients and a baseline, we observed greater engagement with sources and richer revisions when participants used attribution gradients.
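The layering described above (answer sentence, claims, excerpts, excerpts from cited sources) can be pictured as a nested data structure that is revealed one level at a time. The following is a minimal illustrative sketch, not the authors' implementation: the class names, field names, and the `unfold` helper are all hypothetical, and the sample strings are placeholders rather than data from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Excerpt:
    """A passage mined from a source (hypothetical structure)."""
    source_id: str
    text: str
    stance: str  # assumed labels: "supports" or "contradicts"
    # Excerpts drawn from sources that this excerpt itself cites
    # (the second-degree references mentioned in the abstract).
    cited_excerpts: List["Excerpt"] = field(default_factory=list)


@dataclass
class Claim:
    """One claim decomposed from an answer sentence."""
    text: str
    excerpts: List[Excerpt] = field(default_factory=list)


@dataclass
class AnswerSentence:
    """A sentence of the attributed AI answer."""
    text: str
    claims: List[Claim] = field(default_factory=list)


def unfold(sentence: AnswerSentence, depth: int) -> List[str]:
    """Return the lines visible at a given depth.

    depth 0: sentence only; 1: plus claims; 2: plus excerpts;
    3: plus excerpts from sources cited by those excerpts.
    """
    lines = [sentence.text]
    if depth < 1:
        return lines
    for claim in sentence.claims:
        lines.append(f"  claim: {claim.text}")
        if depth < 2:
            continue
        for ex in claim.excerpts:
            lines.append(f"    [{ex.stance}] {ex.source_id}: {ex.text}")
            if depth < 3:
                continue
            for cited in ex.cited_excerpts:
                lines.append(
                    f"      [{cited.stance}] {cited.source_id}: {cited.text}"
                )
    return lines


# Placeholder example; none of these strings come from the paper.
example = AnswerSentence(
    text="Placeholder answer sentence.",
    claims=[
        Claim(
            text="Placeholder claim.",
            excerpts=[
                Excerpt(
                    source_id="paper-A",
                    text="Placeholder excerpt.",
                    stance="supports",
                    cited_excerpts=[
                        Excerpt("paper-B", "Placeholder cited excerpt.", "contradicts")
                    ],
                )
            ],
        )
    ],
)

for line in unfold(example, depth=3):
    print(line)
```

Calling `unfold` with a smaller `depth` returns fewer levels, which mirrors the incremental unfolding idea: the reader sees only as much context as they ask for.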
