Prior2Posterior: Model Prior Correction for Long-Tailed Learning
S Divakar Bhat, Amit More, Mudit Soni, Surbhi Agrawal
TL;DR
The paper tackles long-tailed recognition by addressing the mismatch between the imbalanced training prior and the balanced test distribution. It introduces Prior2Posterior (P2P), a post-hoc correction that estimates the model’s effective prior from its own predictions and adjusts the posterior probabilities to align with the test prior, with a theoretical analysis showing optimality for models trained with plain cross-entropy (CE) and logit-adjusted losses. Empirically, P2P achieves state-of-the-art results among logit-adjustment methods on CIFAR-LT, ImageNet-LT, and iNaturalist18, and it can improve existing methods without retraining while also enabling post-hoc inspection of their bias. The approach is compatible with two-stage decoupled training and can be applied to pre-trained models, making it a practical and broadly applicable tool for removing residual bias in long-tailed learning.
Abstract
Learning-based solutions for long-tailed recognition have difficulty generalizing to balanced test datasets. Because the data prior is imbalanced, the learned a posteriori distribution is biased toward the most frequent (head) classes, leading to inferior performance on the least frequent (tail) classes. In general, performance can be improved by removing this bias, that is, by eliminating the effect of the imbalanced prior modeled using the number of samples per class (class frequencies). We first observe that the effective prior on the classes, learned by the model at the end of training, can differ from the empirical prior obtained from class frequencies. We therefore propose a novel approach that accurately models the effective prior of a trained model using its a posteriori probabilities. We then correct the imbalanced prior by adjusting the predicted a posteriori probabilities with the calculated prior (Prior2Posterior: P2P) in a post-hoc manner after training, and show that this can improve model performance. We present a theoretical analysis showing the optimality of our approach for models trained with naive cross-entropy loss as well as logit-adjusted loss. Our experiments show that the proposed approach achieves a new state of the art (SOTA) on several benchmark datasets from the long-tail literature in the category of logit adjustment methods. Further, the proposed approach can be used to inspect any existing method, capture its effective prior, and remove any residual bias, improving its performance post-hoc without model retraining. We also show that the proposed post-hoc approach further improves the performance of many existing methods.
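The post-hoc correction described above can be illustrated with a minimal sketch. The abstract does not state the exact estimator, so this sketch makes two assumptions: the effective prior is approximated by averaging the model's softmax outputs over a set of samples, and the test prior is uniform, so the correction reduces to subtracting the log of the estimated prior from the logits. The function names (`estimate_effective_prior`, `correct_posterior`) and the toy logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_effective_prior(logits_batch):
    """Assumed estimator: mean of the predicted posteriors over a sample set."""
    num_classes = len(logits_batch[0])
    sums = [0.0] * num_classes
    for logits in logits_batch:
        for k, p in enumerate(softmax(logits)):
            sums[k] += p
    return [s / len(logits_batch) for s in sums]


def correct_posterior(logits, prior, eps=1e-12):
    """Divide the posterior by the estimated prior and renormalize.

    Done in log space: subtract log(prior) from the logits, then apply
    softmax. A uniform test prior is assumed, so no target-prior term appears.
    """
    adjusted = [z - math.log(max(p, eps)) for z, p in zip(logits, prior)]
    return softmax(adjusted)


# Toy logits from a model that favours class 0 (illustrative values only).
sample_logits = [[2.0, 0.5, 0.1], [1.8, 0.9, 0.2], [1.5, 0.4, 1.0]]
prior = estimate_effective_prior(sample_logits)

test_logits = [1.0, 0.9, 0.8]
raw = softmax(test_logits)
corrected = correct_posterior(test_logits, prior)
```

In this toy case the uncorrected prediction favours the head class, while the corrected posterior shifts probability mass toward the other classes. With a uniform estimated prior the correction leaves the posterior unchanged, which is the behaviour one would expect from a model that carries no residual bias.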
