DeCoDe: Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models

Chengbo He, Bochao Zou, Junliang Xing, Jiansheng Chen, Yuanchun Shi, Huimin Ma

TL;DR

DeCoDe addresses the challenge of deciding when to rely on AI, defer to humans, or combine both in high-stakes decision-making. It introduces a decoupled concept bottleneck framework in which explicit concept representations $c_{exp}$ and latent representations $c_{imp}$ form the basis for both prediction and strategy selection. A gating network $g_\phi$ maps these representations to a three-way distribution over {AI-only, AI+Human, Defer-to-Human}. A surrogate loss $\mathcal{L}_{DeCoDe}$ balances predictive accuracy against the cost of human involvement, enabling instance-specific adaptation even with noisy or inconsistent expert input. Empirical results on CUB-200-2011, Derm7pt, and CelebA show that DeCoDe outperforms AI-only, human-only, and conventional deferral baselines while preserving interpretability and robustness. It also supports test-time concept interventions for user-driven correction.
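
As a rough illustration of the gating step, the sketch below concatenates $c_{exp}$ and $c_{imp}$, applies a single linear layer followed by a softmax to obtain the three-way strategy distribution, and scores it with an expected-cost objective. The linear gate, the names `gate` and `expected_cost_loss`, the per-strategy losses, and the cost vector are all assumptions made for illustration; the summary above does not specify the architecture of $g_\phi$ or the exact form of $\mathcal{L}_{DeCoDe}$.

```python
import numpy as np

STRATEGIES = ("AI-only", "AI+Human", "Defer-to-Human")


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def gate(c_exp, c_imp, W, b):
    """Hypothetical linear gate: concepts -> distribution over 3 strategies."""
    c = np.concatenate([c_exp, c_imp], axis=-1)
    return softmax(c @ W + b)


def expected_cost_loss(gate_probs, strategy_losses, human_costs):
    """Illustrative surrogate (an assumption, not the paper's loss):
    expected per-strategy prediction loss plus a human-effort cost."""
    total = strategy_losses + human_costs  # (N, 3) + (3,) broadcasts per row
    return float((gate_probs * total).sum(axis=-1).mean())


rng = np.random.default_rng(0)
n, d_exp, d_imp = 4, 6, 3
c_exp = rng.uniform(size=(n, d_exp))   # explicit concept activations
c_imp = rng.normal(size=(n, d_imp))    # latent concept representation
W = rng.normal(scale=0.1, size=(d_exp + d_imp, 3))
b = np.zeros(3)

probs = gate(c_exp, c_imp, W, b)
strategy_losses = rng.uniform(size=(n, 3))   # placeholder losses per strategy
human_costs = np.array([0.0, 0.1, 0.2])      # placeholder cost of human effort
loss = expected_cost_loss(probs, strategy_losses, human_costs)
chosen = [STRATEGIES[i] for i in probs.argmax(axis=-1)]
```

Raising the entries of `human_costs` makes the strategies that involve a human more expensive in expectation, which is one simple way to express the accuracy versus effort trade-off.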

Abstract

In human-AI collaboration, a central challenge is deciding whether a task should be handled by the AI, deferred to a human expert, or addressed through collaborative effort. Existing Learning to Defer approaches typically make binary choices between AI and humans, neglecting their complementary strengths. They also lack interpretability, a critical property in high-stakes scenarios where users must understand and, if necessary, correct the model's reasoning. To overcome these limitations, we propose Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models (DeCoDe), a concept-driven framework for human-AI collaboration. DeCoDe makes strategy decisions based on human-interpretable concept representations, enhancing transparency throughout the decision process. It supports three flexible modes: autonomous AI prediction, deferral to humans, and human-AI collaborative complementarity. The mode is selected by a gating network that takes concept-level inputs and is trained using a novel surrogate loss that balances accuracy and human effort. This approach enables instance-specific, interpretable, and adaptive human-AI collaboration. Experiments on real-world datasets demonstrate that DeCoDe significantly outperforms AI-only, human-only, and traditional deferral baselines, while maintaining strong robustness and interpretability even under noisy expert annotations.
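
To make the idea of correcting the model's reasoning concrete, here is a minimal sketch of a test-time concept intervention in a generic concept bottleneck model: a user overwrites selected predicted concept values and the label is recomputed from the corrected concepts. The linear label head, its weights, and the helper names `predict_label` and `intervene` are hypothetical and are not taken from the paper.

```python
import numpy as np


def predict_label(concepts, W, b):
    """Hypothetical linear label head on top of the concept layer."""
    return int(np.argmax(concepts @ W + b))


def intervene(concepts, corrections):
    """Return a copy of the concept vector with user-supplied values
    written over the predicted ones (index -> corrected value)."""
    fixed = concepts.copy()
    for idx, value in corrections.items():
        fixed[idx] = value
    return fixed


# Toy head: 3 concepts, 2 classes (weights chosen by hand for illustration).
W = np.array([[2.0, -2.0],
              [-1.0, 1.0],
              [0.5, 0.5]])
b = np.zeros(2)

predicted_concepts = np.array([0.2, 0.9, 0.5])
before = predict_label(predicted_concepts, W, b)

# The user judges concept 0 to be present and concept 1 to be absent.
corrected_concepts = intervene(predicted_concepts, {0: 1.0, 1: 0.0})
after = predict_label(corrected_concepts, W, b)
```

With these toy weights the predicted class changes from 1 to 0 after the correction, because the label depends on the input only through the concept vector.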

Paper Structure

This paper contains 12 sections, 9 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Comparison between traditional Learning to Defer (left) and our proposed DeCoDe framework (right). While conventional deferral methods make binary AI–human decisions without providing explanations or enabling collaboration, DeCoDe leverages interpretable concept representations to support flexible strategy selection—autonomous AI, human-only, and human-AI complementarity—guided by a concept-driven gating network.
  • Figure 2: Overall architecture of the DeCoDe framework. The model is built upon a concept bottleneck structure, where explicit concepts are extracted from input images to form intermediate representations. These representations are used for downstream task prediction and strategy selection, enabling the model to adaptively choose among three decision modes: autonomous AI prediction, deferral to humans, and AI-human collaboration. The dashed box at the bottom illustrates the interactive mechanism of the system: users can intervene on the concept layer to correct the prediction or adjust the concept representation by rectifying the model’s output, thereby enhancing interpretability and controllability.
  • Figure 3: System accuracy as a function of human participation ratio for DeCoDe and baseline methods under different noise levels. Each row corresponds to a dataset (CUB, Derm7pt, CelebA), and each column represents a simulated human noise rate (0.1, 0.3, 0.5). The x-axis denotes the proportion of test instances receiving human input, from 0 (AI-only) to 1 (Human-only). Dashed and dash-dotted lines indicate the standalone performance of human and AI agents, respectively.
  • Figure 4: Effect of concept-level intervention on downstream predictions. The figure illustrates how manually modifying intermediate concepts can alter the model's final prediction. The left shows the input image, the middle shows predicted and intervened concept values, and the right shows label predictions before and after intervention.
  • Figure 5: Analysis of concept-based strategy selection in DeCoDe. (a) Coverage–accuracy curves on the CUB dataset under simulated 70% human accuracy. Concept-based gating (red) achieves better stability and overall accuracy as collaboration increases, while image-based gating (blue) performs slightly better at low coverage. (b) Heatmap of soft-aggregated concept activations across strategies, revealing distinct semantic preferences for AI-only, AI+Human, and Human-only paths. These results illustrate DeCoDe’s semantic adaptability in selecting collaboration modes.
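
The kind of curve described for Figure 3 can be sketched as follows. The example assumes a simple confidence-based routing rule (send the least confident instances to the human first) and interprets the noise rate as the probability that the simulated human answer is wrong. Both are assumptions for illustration, not DeCoDe's learned gating or the paper's exact simulation protocol.

```python
import numpy as np


def coverage_accuracy_curve(ai_correct, human_correct, defer_score, ratios):
    """System accuracy when the top-r fraction of instances, ranked by
    defer_score (higher = more in need of a human), goes to the human."""
    n = len(ai_correct)
    order = np.argsort(-defer_score)
    accs = []
    for r in ratios:
        k = int(round(r * n))
        use_human = np.zeros(n, dtype=bool)
        use_human[order[:k]] = True
        system_correct = np.where(use_human, human_correct, ai_correct)
        accs.append(system_correct.mean())
    return np.array(accs)


rng = np.random.default_rng(0)
n = 1000
noise_rate = 0.3                                   # assumed human error probability
ai_conf = rng.uniform(size=n)                      # simulated AI confidence
ai_correct = rng.uniform(size=n) < ai_conf         # AI is right more often when confident
human_correct = rng.uniform(size=n) > noise_rate   # human is right w.p. 1 - noise_rate

ratios = np.linspace(0.0, 1.0, 11)                 # 0 = AI-only, 1 = human-only
curve = coverage_accuracy_curve(ai_correct, human_correct, 1.0 - ai_conf, ratios)
```

The two endpoints of `curve` reproduce the standalone AI and human accuracies, which correspond to the reference lines drawn in the figure.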