Hierarchical Selective Classification
Shani Goren, Ido Galil, Ran El-Yaniv
TL;DR
Hierarchical selective classification (HSC) extends selective classification to class hierarchies, so that a model can predict at varying levels of specificity depending on its uncertainty. The approach formalizes hierarchical risk and coverage, introduces hierarchical risk-coverage curves, and develops hierarchical inference rules (notably Climbing) paired with an optimal-threshold algorithm that uses a calibration set to guarantee a user-specified accuracy with high probability. Empirical results on over 1,100 ImageNet models and on iNat21 models show substantial improvements in hAURC (the area under the hierarchical risk-coverage curve) when leveraging hierarchy-aware predictions, with CLIP-based regimes and large-scale pretraining delivering the largest gains; hierarchical calibration also improves. The work positions HSC as a practical, post-hoc method that improves risk control and interpretability in hierarchical classification tasks, with future directions exploring alternative confidence scores and selective hierarchical training.
Abstract
Deploying deep neural networks for risk-sensitive tasks necessitates an uncertainty estimation mechanism. This paper introduces hierarchical selective classification, extending selective classification to a hierarchical setting. Our approach leverages the inherent structure of class relationships, enabling models to reduce the specificity of their predictions when faced with uncertainty. We first formalize hierarchical risk and coverage, and introduce hierarchical risk-coverage curves. Next, we develop algorithms for hierarchical selective classification (which we refer to as "inference rules"), and propose an efficient algorithm that guarantees a target accuracy constraint with high probability. Lastly, we conduct extensive empirical studies on over a thousand ImageNet classifiers, revealing that training regimes such as CLIP, pretraining on ImageNet21k, and knowledge distillation boost hierarchical selective performance.
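To make the idea of reducing specificity under uncertainty concrete, the following is a minimal sketch of a climbing-style inference rule. It rests on assumptions not spelled out in the text above: a node's confidence is taken to be the sum of the leaf probabilities beneath it, the rule starts at the most likely leaf and moves toward the root until a threshold is met, and the toy hierarchy, the function names (`node_confidence`, `climb`), and the threshold values are all hypothetical. It is not necessarily the exact procedure used in the paper, and it does not cover how the threshold is calibrated.

```python
from typing import Dict, Optional, Tuple

# Hypothetical toy hierarchy: each node maps to its parent (root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "animal": None,
    "dog": "animal",
    "cat": "animal",
    "labrador": "dog",
    "poodle": "dog",
    "siamese": "cat",
}


def node_confidence(node: str,
                    parent: Dict[str, Optional[str]],
                    leaf_probs: Dict[str, float]) -> float:
    """Assumed confidence score: total probability of leaves under `node`."""
    total = 0.0
    for leaf, p in leaf_probs.items():
        cur: Optional[str] = leaf
        # Walk from the leaf to the root; count the leaf if we pass `node`.
        while cur is not None:
            if cur == node:
                total += p
                break
            cur = parent.get(cur)
    return total


def climb(leaf_probs: Dict[str, float],
          parent: Dict[str, Optional[str]],
          threshold: float) -> Tuple[str, float]:
    """Start at the most likely leaf and move up until confidence >= threshold.

    Falls back to the root if no node on the path meets the threshold.
    """
    node = max(leaf_probs, key=leaf_probs.get)
    while True:
        conf = node_confidence(node, parent, leaf_probs)
        up = parent.get(node)
        if conf >= threshold or up is None:
            return node, conf
        node = up


if __name__ == "__main__":
    # Hypothetical softmax output over the three leaves.
    probs = {"labrador": 0.5, "poodle": 0.35, "siamese": 0.15}
    for theta in (0.4, 0.8, 0.99):
        print(theta, climb(probs, PARENT, theta))
```

With these toy probabilities, a low threshold keeps the leaf prediction, a moderate one backs off to the parent class, and a very strict one returns the root. Raising the threshold therefore trades specificity (hierarchical coverage) for a lower chance of error, which is the trade-off the risk-coverage curves describe.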
