Beyond Logit Adjustment: A Residual Decomposition Framework for Long-Tailed Reranking

Zhanliang Wang, Hongzhuo Chen, Quan Minh Nguyen, Mian Umair Ahsan, Kai Wang

Abstract

Long-tailed classification, where a small number of frequent classes dominate many rare ones, remains challenging because models systematically favor frequent classes at inference time. Existing post-hoc methods such as logit adjustment address this by adding a fixed classwise offset to the base-model logits. However, the correction required to restore the relative ranking of two classes need not be constant across inputs, and a fixed offset cannot adapt to such variation. We study this problem through Bayes-optimal reranking on a base-model top-k shortlist. The gap between the optimal score and the base score, the residual correction, decomposes into a classwise component that is constant within each class and a pairwise component that depends on the input and competing labels. When the residual is purely classwise, a fixed offset suffices to recover the Bayes-optimal ordering. We further show that when the same label pair induces incompatible ordering constraints across contexts, no fixed offset can achieve this recovery. This decomposition leads to testable predictions about when pairwise correction can improve performance and when it cannot. We develop REPAIR (Reranking via Pairwise residual correction), a lightweight post-hoc reranker that combines a shrinkage-stabilized classwise term with a linear pairwise term driven by competition features on the shortlist. Experiments on five benchmarks spanning image classification, species recognition, scene recognition, and rare disease diagnosis confirm that the decomposition explains where pairwise correction helps and where classwise correction alone suffices.
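
To make the abstract's scoring rule concrete: a REPAIR-style score on a shortlist $S$ plausibly takes the form $r_y(x,S) = g_y(x) + a_y + w^\top \phi(x,y,S)$, combining the base logit, the classwise offset, and the linear pairwise term. The sketch below is a minimal illustration under that assumption; the function names and the single margin feature are ours, not the paper's.

```python
import numpy as np

def repair_score(logits, offsets, w, phi):
    """Illustrative REPAIR-style rescoring of a top-k shortlist (our sketch).

    logits  : (k,) base-model logits g_y(x) for the shortlisted classes
    offsets : (k,) shrinkage-stabilized classwise corrections a_y
    w       : (d,) weights of the linear pairwise term, fit on calibration data
    phi     : (k, d) competition features of each shortlisted class vs. its rivals
    """
    return logits + offsets + phi @ w

# Hypothetical usage: rerank a shortlist of k = 3 classes.
g = np.array([2.1, 1.9, 0.4])          # base logits; frequent class leads
a = np.array([0.0, 0.3, 0.8])          # classwise offsets favor rarer classes
phi = np.array([[0.1], [0.6], [0.2]])  # e.g., a margin-to-hardest-rival feature (assumed)
w = np.array([0.5])
ranking = np.argsort(-repair_score(g, a, w, phi))  # indices in descending score order
```

Unlike a fixed offset, the $\phi$-term lets the correction magnitude vary per input, which is exactly the flexibility the motivating example in Figure 1 calls for.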

Paper Structure

This paper contains 79 sections, 5 theorems, 31 equations, 8 figures, 9 tables, and 1 algorithm.

Key Result

Proposition 4.2

If the setting is class-separable, then the fixed-offset score $r_y(x,S) = g_y(x) + a_y$ induces the Bayes-optimal ordering on every covered example.
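
A familiar instance of such a fixed offset is post-hoc logit adjustment, which sets $a_y = -\tau \log \pi_y$ for training prior $\pi_y$, so rarer classes receive a larger constant boost. The snippet below is an illustrative sketch of that special case, not the paper's implementation:

```python
import numpy as np

def fixed_offset_rerank(logits, class_priors, tau=1.0):
    """Fixed classwise offset a_y = -tau * log(pi_y), as in post-hoc
    logit adjustment: rare classes (small pi_y) get a larger boost.
    The offset is identical for every input, which Proposition 4.2
    says suffices only in the class-separable regime."""
    a = -tau * np.log(class_priors)
    return logits + a

priors = np.array([0.90, 0.09, 0.01])  # toy long-tailed class priors
g = np.array([1.2, 1.1, 0.9])          # base logits on one shortlist
print(np.argmax(fixed_offset_rerank(g, priors)))  # the rare class now wins
```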

Figures (8)

  • Figure 1: Motivating example (shortlist size $2$ for clarity). Each row is a different input; bars show scores and percentages indicate softmax probabilities. A fixed small offset corrects the easy case (row 1) but under-corrects the hard case (row 2); a fixed large offset corrects both but flips the already-correct case (row 3). REPAIR adapts the correction magnitude per input, producing the correct ranking in all three cases.
  • Figure 2: Synthetic validation ($K{=}100$, $k{=}10$). (a) Class-separable regime: classwise correction suffices ($\rho_k = 0.17$ for both), confirming Proposition 4.2. (b) Non-class-separable regime with contradictory pairs: REPAIR closes $2.3\times$ as much of the recoverable gap ($\rho_k = 0.42$ vs. $0.18$), consistent with Theorem 4.3.
  • Figure 3: $\Delta$Hit@1 of REPAIR over Classwise by mean $D_y$ quintile (synthetic, $K{=}100$, $k{=}10$, rare classes). (a) Class-separable: almost flat across quintiles. (b) Non-class-separable: monotone increasing from Q1 to Q5.
  • Figure 4: Training-set class frequency distributions. Classes are sorted by frequency (descending); the vertical dashed line marks the 80/20 rare/frequent boundary.
  • Figure 5: Shrinkage diagnostic for the classwise correction. Left axis: mean offset variance for the raw MLE and the shrunk estimate. Right axis: $\Delta$Hit@1. Shrinkage is most helpful when the number of covered calibration examples per class is very small (see the sketch after this list).
  • ...and 3 more figures
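
The shrinkage-stabilized classwise term diagnosed in Figure 5 can be read as an empirical-Bayes estimate: per-class offset MLEs are pulled toward a pooled mean, more strongly when a class has few covered calibration examples. A minimal sketch, assuming a count-weighted convex combination; the specific weighting rule and the names below are our assumption, not the paper's:

```python
import numpy as np

def shrink_offsets(raw_offsets, counts, lam=10.0):
    """Shrink per-class offset MLEs toward their pooled mean.

    Classes with few covered calibration examples (small counts) are
    pulled strongly toward the pooled mean; well-covered classes keep
    their raw estimate. lam controls the shrinkage strength.
    """
    pooled = np.average(raw_offsets, weights=counts)
    w = counts / (counts + lam)          # in [0, 1): more data, less shrinkage
    return w * raw_offsets + (1.0 - w) * pooled

raw = np.array([0.9, -0.4, 2.5])   # noisy MLEs; last class has 2 examples
n = np.array([500.0, 120.0, 2.0])
print(shrink_offsets(raw, n))      # rare-class offset pulled toward the pool
```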

Theorems & Definitions (13)

  • Definition 4.1: Class-separable and non-class-separable regimes
  • Proposition 4.2: Classwise correction as an exact special case
  • Theorem 4.3: Necessity of pairwise correction
  • Definition 6.1: Hardest-rival flip rate
  • Proposition C.1: Bayes rule on a fixed shortlist
  • Proof of Proposition C.1
  • Proposition C.2: Plug-in control
  • Lemma C.3: Covered excess-risk identity
  • Proof of Lemma C.3
  • Proof of Proposition C.2 (Plug-in control)
  • ...and 3 more