Towards User-Focused Research in Training Data Attribution for Human-Centered Explainable AI

Elisa Nguyen, Johannes Bertram, Evgenii Kortukov, Jean Y. Song, Seong Joon Oh

TL;DR

The paper critiques the technocentric focus of Explainable AI and argues for a user-centered reorientation of training data attribution (TDA). Through a two-stage study with ML developers, it identifies a three-dimensional design space (action, metric, number of samples) for TDA explanations and reveals a strong preference for group attribution alongside diverse individual needs. The work demonstrates that existing TDA methods partly address explicit needs but largely miss latent, actionable, and reliability-related requirements, and it proposes a framework for needs-based research to align method development with user workflows. This approach aims to improve the practical relevance and human impact of TDA within human-centered XAI (HCXAI) and invites broader adoption of user-centered design in XAI research and practice.

Abstract

Explainable AI (XAI) aims to make AI systems more transparent, yet many practices emphasise mathematical rigour over practical user needs. We propose an alternative to this model-centric approach by following a design thinking process for the emerging XAI field of training data attribution (TDA), which risks repeating solutionist patterns seen in other subfields. However, because TDA is in its early stages, there is a valuable opportunity to shape its direction through user-centred practices. We engage directly with machine learning developers via a needfinding interview study (N=6) and a scenario-based interactive user study (N=31) to ground explanations in real workflows. Our exploration of the TDA design space reveals novel tasks for data-centric explanations useful to developers, such as grouping training samples behind specific model behaviours or identifying undersampled data. We invite the TDA, XAI, and HCI communities to engage with these tasks to strengthen their research's practical relevance and human impact.

Paper Structure

This paper contains 68 sections, 7 equations, 9 figures, and 6 tables.

Figures (9)

  • Figure 1: Comparison of bottom-up and proposed top-down development of research with selected examples from XAI (simonyan2013deep, adebayo2018sanity, ehsan2020hcxai).
  • Figure 2: Example of TDA used as an explanation in a fictional bird classification model development scenario. After training the classifier on the training data, the developer tests the model during the inference stage and inspects model errors (left). TDA explanations identify training data relevant to the misclassification, enabling the developer to build and refine hypotheses about the reasons for the model error (right).
  • Figure 3: Codes and themes describing developers' current and anticipated practices with training data.
  • Figure 4: Three-dimensional design space of TDA explanations representing the type of data-centric information useful in model development. Each axis is derived from the themes and codes found in the interviews.
  • Figure 5: Histograms of answer distributions for the preliminary demographic questions of the scenario-based interactive study (N=31). DL = Deep learning.
  • ...and 4 more figures