DeCoDe: Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models
Chengbo He, Bochao Zou, Junliang Xing, Jiansheng Chen, Yuanchun Shi, Huimin Ma
TL;DR
DeCoDe addresses the question of when to rely on AI, defer to a human, or combine both in high-stakes decision-making. It introduces a decoupled concept bottleneck framework in which explicit concept representations $c_{exp}$ and latent representations $c_{imp}$ form the basis for both prediction and strategy selection. A gating network $g_\phi$ maps these concepts to a three-way distribution over {AI-only, AI+Human, Defer-to-Human}. A surrogate loss $\mathcal{L}_{DeCoDe}$ balances predictive accuracy against the cost of human involvement, enabling instance-specific adaptation even with noisy or inconsistent expert input. Empirical results on CUB-200-2011, Derm7pt, and CelebA show that DeCoDe outperforms AI-only, human-only, and conventional deferral baselines while preserving interpretability and robustness. It also supports test-time concept interventions for user-driven correction.
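To make the strategy-selection step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the gate $g_\phi$ is a single linear layer followed by a softmax over the concatenation of $c_{exp}$ and $c_{imp}$. It also assumes an illustrative cost-sensitive objective: the expected error plus a penalty `lam` times a per-strategy human cost. The actual form of $\mathcal{L}_{DeCoDe}$ is not given in this summary, and all names, weights, and cost values below are hypothetical.

```python
import math

STRATEGIES = ("AI-only", "AI+Human", "Defer-to-Human")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate(c_exp, c_imp, weights, bias):
    """Hypothetical linear gate: concatenated concepts -> 3 strategy probabilities."""
    features = list(c_exp) + list(c_imp)
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


def expected_cost(probs, errors, human_costs, lam):
    """Illustrative objective (an assumption, not the paper's loss):
    sum over strategies of p_k * (error_k + lam * human_cost_k)."""
    return sum(p * (e + lam * c) for p, e, c in zip(probs, errors, human_costs))


# Toy instance: three explicit concepts, two latent dimensions (made-up values).
c_exp = [0.9, 0.1, 0.8]
c_imp = [0.2, -0.4]
weights = [
    [1.0, 0.0, 0.5, 0.0, 0.0],  # AI-only
    [0.0, 1.0, 0.0, 0.5, 0.0],  # AI+Human
    [0.0, 0.0, 0.0, 0.0, 1.0],  # Defer-to-Human
]
bias = [0.0, 0.0, 0.0]

probs = gate(c_exp, c_imp, weights, bias)
choice = STRATEGIES[probs.index(max(probs))]

# Suppose the AI alone would be wrong here, while the other two modes are right.
errors = [1.0, 0.0, 0.0]
human_costs = [0.0, 0.5, 1.0]  # assumed relative human effort per strategy
cost = expected_cost(probs, errors, human_costs, lam=0.2)

print(choice, [round(p, 3) for p in probs], round(cost, 3))
```

Under this objective, a gate that puts all its mass on AI+Human for this instance would incur only the collaboration penalty, which is the kind of trade-off between accuracy and human effort that the surrogate loss is described as balancing.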
Abstract
In human-AI collaboration, a central challenge is deciding whether a task should be handled by the AI, deferred to a human expert, or addressed through collaborative effort. Existing Learning to Defer approaches typically make binary choices between AI and humans, neglecting their complementary strengths. They also lack interpretability, a critical property in high-stakes scenarios where users must understand and, if necessary, correct the model's reasoning. To overcome these limitations, we propose Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models (DeCoDe), a concept-driven framework for human-AI collaboration. DeCoDe makes strategy decisions based on human-interpretable concept representations, enhancing transparency throughout the decision process. It supports three flexible modes: autonomous AI prediction, deferral to humans, and human-AI collaborative complementarity. The mode is selected by a gating network that takes concept-level inputs and is trained with a novel surrogate loss that balances accuracy and human effort. This approach enables instance-specific, interpretable, and adaptive human-AI collaboration. Experiments on real-world datasets demonstrate that DeCoDe significantly outperforms AI-only, human-only, and traditional deferral baselines, while maintaining strong robustness and interpretability even under noisy expert annotations.
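The ability to correct the model's reasoning can be illustrated with a test-time concept intervention. This is a minimal sketch under simplifying assumptions: a user overwrites selected predicted concept values, and a linear label head (hypothetical here) is re-evaluated on the corrected concepts. The concept names, class names, and weights are invented for illustration, and the latent concepts are omitted for brevity.

```python
def intervene(c_exp_pred, corrections):
    """Return a copy of the predicted concepts with user-corrected entries overwritten.

    corrections maps a concept index to the value supplied by the user.
    """
    c_exp = list(c_exp_pred)
    for idx, value in corrections.items():
        c_exp[idx] = value
    return c_exp


def predict_label(c_exp, label_weights, labels):
    """Hypothetical linear label head: pick the class with the highest score."""
    scores = [sum(w * c for w, c in zip(row, c_exp)) for row in label_weights]
    return labels[scores.index(max(scores))]


# Made-up concepts: index 0 = "has_red_wing", index 1 = "has_long_beak".
labels = ["class_A", "class_B"]
label_weights = [
    [1.0, 0.0],  # class_A relies on concept 0
    [0.0, 1.0],  # class_B relies on concept 1
]

c_exp_pred = [0.2, 0.9]  # model's concept predictions
before = predict_label(c_exp_pred, label_weights, labels)

# The user asserts that concept 0 is present.
c_exp_fixed = intervene(c_exp_pred, {0: 1.0})
after = predict_label(c_exp_fixed, label_weights, labels)

print(before, "->", after)
```

Because the prediction is computed from the concepts, correcting one concept value is enough to change the output, and the original concept predictions are left unmodified for inspection.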
