Know When to Abstain: Optimal Selective Classification with Likelihood Ratios
Alvin Heng, Harold Soh
TL;DR
This work tackles selective classification under covariate shift by recasting abstention as a Neyman–Pearson hypothesis test, showing that the optimal selector is a monotone transformation of the likelihood ratio $p_c({\mathbf{x}})/p_w({\mathbf{x}})$ and unifying prior post-hoc scores under this framework. It introduces two distance-based NP-optimal selectors, $\Delta$-MDS and $\Delta$-KNN, along with a simple linear combination strategy that blends distance- and logit-based scores (e.g., $\Delta$-MDS-RLog, $\Delta$-KNN-RLog). The paper provides theoretical justification for these scores and demonstrates strong empirical gains on vision and language benchmarks under covariate shift, including with vision-language models like CLIP and EVA. The results highlight the practical impact of likelihood-ratio-based selective classification for robust deployment in real-world, distribution-shifted settings, with public code available for replication.
Abstract
Selective classification enhances the reliability of predictive models by allowing them to abstain from making uncertain predictions. In this work, we revisit the design of optimal selection functions through the lens of the Neyman–Pearson lemma, a classical result in statistics that characterizes the optimal rejection rule as a likelihood ratio test. We show that this perspective not only unifies the behavior of several post-hoc selection baselines, but also motivates the new approaches to selective classification that we propose here. A central focus of our work is the setting of covariate shift, where the input distribution at test time differs from that at training time. This realistic and challenging scenario remains relatively underexplored in the context of selective classification. We evaluate our proposed methods across a range of vision and language tasks, including both supervised learning and vision-language models. Our experiments demonstrate that our Neyman–Pearson-informed methods consistently outperform existing baselines, indicating that likelihood-ratio-based selection offers a robust mechanism for improving selective classification under covariate shift. Our code is publicly available at https://github.com/clear-nus/sc-likelihood-ratios.
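As a rough illustration of the likelihood-ratio view of abstention, the Python sketch below scores an input by an estimate of $\log p_c(\mathbf{x}) - \log p_w(\mathbf{x})$ and accepts the prediction when the score clears a threshold. It is not the authors' implementation of $\Delta$-MDS or $\Delta$-KNN. It assumes, purely for illustration, that the features of correctly and incorrectly classified held-out samples are each modeled by a single Gaussian, and every function name, the synthetic data, and the threshold are hypothetical.

```python
import numpy as np


def fit_gaussian(feats):
    """Estimate the mean and a lightly regularized covariance of an (n, d) feature matrix."""
    mu = feats.mean(axis=0)
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])
    return mu, cov


def gaussian_logpdf(x, mu, cov):
    """Log-density of each row of x under a multivariate Gaussian N(mu, cov)."""
    d = mu.shape[0]
    diff = x - mu
    # Squared Mahalanobis distance of every row to the mean.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2.0 * np.pi))


def log_likelihood_ratio(x, correct_feats, wrong_feats):
    """Estimate log p_c(x) - log p_w(x) under the Gaussian assumption (illustrative only)."""
    mu_c, cov_c = fit_gaussian(correct_feats)
    mu_w, cov_w = fit_gaussian(wrong_feats)
    return gaussian_logpdf(x, mu_c, cov_c) - gaussian_logpdf(x, mu_w, cov_w)


def select(scores, threshold):
    """Accept (True) when the score meets the threshold; abstain (False) otherwise."""
    return scores >= threshold


# Synthetic stand-ins for features of correctly / incorrectly classified samples.
rng = np.random.default_rng(0)
correct_feats = rng.normal(loc=2.0, scale=1.0, size=(500, 2))
wrong_feats = rng.normal(loc=-2.0, scale=1.0, size=(500, 2))

x_test = np.array([[2.0, 2.0], [-2.0, -2.0]])
scores = log_likelihood_ratio(x_test, correct_feats, wrong_feats)
accept = select(scores, threshold=0.0)  # threshold chosen arbitrarily for the demo
```

In this toy setup the first test point lies near the "correct" cluster and is accepted, while the second lies near the "wrong" cluster and triggers abstention. In practice the threshold would be tuned to a target coverage or error rate, and because any monotone transformation of the ratio induces the same ranking of inputs, the same accept/abstain decisions can be recovered by transforming the threshold accordingly.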
