Confidence-aware Contrastive Learning for Selective Classification
Yu-Chang Wu, Shen-Huan Lyu, Haopu Shang, Xiangyu Wang, Chao Qian
TL;DR
This paper tackles selective classification by deriving a generalization bound that links predictive confidence to feature representation. It introduces CCL-SC, a confidence-aware contrastive learning framework that shapes the feature space by pulling together the features of correctly classified samples and pushing them away from those of misclassified samples, with the strength of each term reweighted by the model's predictive confidence. The method uses a MoCo-style dual-queue setup to construct positive and negative samples and defines a CSC loss that integrates the softmax response (SR) confidence score into the contrastive objective. Empirically, CCL-SC achieves lower selective risk than state-of-the-art methods on CIFAR-10, CIFAR-100, CelebA, and ImageNet at most coverage levels, and it can be combined with existing selective-classification techniques for further gains. Overall, the study demonstrates the practical value of feature-level optimization for selective classification and offers a principled way to improve reliability in high-stakes applications.
Abstract
Selective classification lets a model make predictions only when it is sufficiently confident, which improves safety and reliability in high-stakes scenarios. Previous methods mainly use deep neural networks and focus on modifying the architecture of the classification layers so that the model can estimate the confidence of its predictions. This work provides a generalization bound for selective classification, showing that optimizing the feature layers helps improve selective classification performance. Inspired by this theory, we propose, for the first time, to explicitly improve the selective classification model at the feature level. The result is a novel Confidence-aware Contrastive Learning method for Selective Classification, CCL-SC, which pulls together the features of homogeneous instances and pushes apart the features of heterogeneous instances, with the strength controlled by the model's confidence. Experimental results on typical datasets, i.e., CIFAR-10, CIFAR-100, CelebA, and ImageNet, show that CCL-SC achieves significantly lower selective risk than state-of-the-art methods across almost all coverage levels. Moreover, it can be combined with existing methods to bring further improvement.
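To make the evaluation metric concrete, the following minimal sketch computes selective risk at a given coverage level, assuming the maximum softmax probability is used as the confidence score. The function names (`softmax`, `selective_risk`) and the toy logits are hypothetical and chosen for illustration; this is not the paper's evaluation code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selective_risk(logits_batch, labels, coverage):
    """Error rate on the `coverage` fraction of samples with highest confidence.

    Assumption: confidence is the maximum softmax probability.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    scored = []
    for logits, label in zip(logits_batch, labels):
        probs = softmax(logits)
        confidence = max(probs)
        prediction = probs.index(confidence)
        scored.append((confidence, prediction != label))
    # Keep the most confident samples; the model abstains on the rest.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    n_keep = max(1, math.ceil(coverage * len(scored)))
    kept = scored[:n_keep]
    return sum(is_error for _, is_error in kept) / n_keep


# Toy data (hypothetical): four two-class samples, one low-confidence mistake.
logits = [[4.0, 0.0], [0.0, 3.0], [0.6, 0.4], [0.2, 0.3]]
labels = [0, 1, 1, 1]

for c in (1.0, 0.75, 0.5):
    print(f"coverage={c:.2f}  selective risk={selective_risk(logits, labels, c):.3f}")
```

In this toy case the only error falls on a low-confidence sample, so reducing coverage to 0.5 removes it from the accepted set. A method with lower selective risk at the same coverage ranks its mistakes lower in confidence, which is the quantity the comparisons above refer to.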
