The Impact of Concept Explanations and Interventions on Human-Machine Collaboration

Jack Furby, Dan Cunnington, Dave Braines, Alun Preece

TL;DR

The paper investigates Concept Bottleneck Models (CBMs) in human-machine collaboration, addressing the gap that prior CBM work lacked real human-in-the-loop evaluation. It runs two studies (expert dermatology and lay Blackjack) to assess whether test-time interventions, concept explanations, and saliency maps improve task accuracy, interpretability, and trust. Results show CBMs enhance interpretability and human-machine alignment, and interventions can raise trust, but they do not consistently improve task accuracy and can even reduce performance in some conditions. The findings highlight a misalignment between human concepts and model concept representations, stressing the need for better alignment and more user-friendly interfaces to realize practical benefits of CBMs in collaborative settings.

Abstract

Deep Neural Networks (DNNs) are often considered black boxes due to their opaque decision-making processes. To reduce their opacity, Concept Models (CMs), such as Concept Bottleneck Models (CBMs), were introduced to predict human-defined concepts as an intermediate step before predicting task labels. This enhances the interpretability of DNNs. In a human-machine setting, greater interpretability enables humans to improve their understanding of, and build trust in, a DNN. In the work that introduced CBMs, the models demonstrated increased task accuracy when incorrect concept predictions were replaced with their ground-truth values, a process known as intervening on the concept predictions. In a collaborative setting, if the model task accuracy improves from interventions, trust in the model and the human-machine task accuracy may increase. However, the result showing an increase in model task accuracy was produced without human evaluation, and thus it remains unknown whether the findings apply in a collaborative setting. In this paper, we ran the first human studies using CBMs to evaluate how humans interact with them in collaborative task settings. Our findings show that CBMs improve interpretability compared to standard DNNs, leading to increased human-machine alignment. However, this increased alignment did not translate to a significant increase in task accuracy. Understanding the model's decision-making process required multiple interactions, and misalignment between the model's and the human's decision-making processes could undermine interpretability and model effectiveness.
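
The intervention mechanism described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear label predictor, the function names (predict_label, intervene), and all numeric values are hypothetical and chosen only to show how overwriting one wrongly predicted concept with its ground-truth value can change the downstream task prediction.

```python
from typing import Dict, List


def predict_label(concepts: List[float], weights: List[float], bias: float) -> int:
    """Toy label predictor: a thresholded linear function of the concept values.

    A real CBM would use a trained network here; this stand-in is an assumption.
    """
    score = sum(w * c for w, c in zip(weights, concepts)) + bias
    return 1 if score > 0 else 0


def intervene(predicted: List[float], corrections: Dict[int, float]) -> List[float]:
    """Return a copy of the concept vector with selected entries overwritten.

    `corrections` maps a concept index to the value supplied by the human.
    """
    updated = list(predicted)
    for index, true_value in corrections.items():
        updated[index] = true_value
    return updated


# Hypothetical concept predictions: concept 1 is wrongly predicted as absent.
predicted_concepts = [0.9, 0.2, 0.1]
weights = [1.0, 2.0, -1.0]  # hypothetical label-predictor weights
bias = -1.5

# Task prediction from the model's own concept predictions.
before = predict_label(predicted_concepts, weights, bias)

# The human sets concept 1 to its ground-truth value, then the label is recomputed.
corrected_concepts = intervene(predicted_concepts, {1: 1.0})
after = predict_label(corrected_concepts, weights, bias)

print(before, after)
```

In this toy case the correction flips the task prediction. As the abstract notes, whether such corrections help in practice depends on how well the human's concepts align with the model's concept representations.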

Paper Structure

This paper contains 20 sections, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Model output variations.
  • Figure 2: Study interfaces with key components labeled.
  • Figure 3: Example samples from the datasets.
  • Figure 4: The number of interventions performed declined over time, except for incorrectly predicted concepts in the lay-person study, where it remained constant.
  • Figure 5: Interventions decrease model task accuracy in the expert study and for the accurate model in the lay-person study, while increasing model task accuracy for the inaccurate model in the lay-person study. Concept precision and recall increase with interventions in most cases.