Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer

Anqi Mao; Mehryar Mohri; Yutao Zhong

Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer

Anqi Mao, Mehryar Mohri, Yutao Zhong

TL;DR

A broad family of surrogate losses, parameterized by a non-increasing function $\Psi$, are introduced, and it is shown that these losses admit $H-consistency bounds when the hypothesis set is symmetric and complete, a property satisfied by common neural network and linear function hypothesis sets.

Abstract

We present a comprehensive study of surrogate loss functions for learning to defer. We introduce a broad family of surrogate losses, parameterized by a non-increasing function $Ψ$, and establish their realizable $H$-consistency under mild conditions. For cost functions based on classification error, we further show that these losses admit $H$-consistency bounds when the hypothesis set is symmetric and complete, a property satisfied by common neural network and linear function hypothesis sets. Our results also resolve an open question raised in previous work (Mozannar et al., 2023) by proving the realizable $H$-consistency and Bayes-consistency of a specific surrogate loss. Furthermore, we identify choices of $Ψ$ that lead to $H$-consistent surrogate losses for any general cost function, thus achieving Bayes-consistency, realizable $H$-consistency, and $H$-consistency bounds simultaneously. We also investigate the relationship between $H$-consistency bounds and realizable $H$-consistency in learning to defer, highlighting key differences from standard classification. Finally, we empirically evaluate our proposed surrogate losses and compare them with existing baselines.

Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer

TL;DR

A broad family of surrogate losses, parameterized by a non-increasing function

, are introduced, and it is shown that these losses admit $H-consistency bounds when the hypothesis set is symmetric and complete, a property satisfied by common neural network and linear function hypothesis sets.

Abstract

We present a comprehensive study of surrogate loss functions for learning to defer. We introduce a broad family of surrogate losses, parameterized by a non-increasing function

, and establish their realizable

-consistency under mild conditions. For cost functions based on classification error, we further show that these losses admit

-consistency bounds when the hypothesis set is symmetric and complete, a property satisfied by common neural network and linear function hypothesis sets. Our results also resolve an open question raised in previous work (Mozannar et al., 2023) by proving the realizable

-consistency and Bayes-consistency of a specific surrogate loss. Furthermore, we identify choices of

that lead to

-consistent surrogate losses for any general cost function, thus achieving Bayes-consistency, realizable

-consistency, and

-consistency bounds simultaneously. We also investigate the relationship between

-consistency bounds and realizable

-consistency in learning to defer, highlighting key differences from standard classification. Finally, we empirically evaluate our proposed surrogate losses and compare them with existing baselines.

Paper Structure