Robust Label Shift Quantification

Alexandre Lecestre

TL;DR

This work addresses the problem of robust label shift quantification under distribution shift by formulating two practical settings and showing that robust estimators align with Maximum Likelihood estimators via $\rho$-estimation grounded in the Hellinger distance. It provides finite-sample deviation bounds and convergence rates (notably $n^{-1/2}\log^{1/2} n$ in well-specified cases) and demonstrates robustness to misspecification, contamination, and outliers. The paper also develops predictor-based and unconditional emission-density frameworks, connects to calibration concepts, and links to existing methods such as MLLS, BBSE, and KMM through a unified theoretical treatment. Practically, these results justify the robustness of MLLS and offer principled guidance for reliable label-shift quantification in high-dimensional and noisy settings.

Abstract

In this paper, we investigate the label shift quantification problem. We propose robust estimators of the label distribution which turn out to coincide with the Maximum Likelihood Estimator. We analyze the theoretical aspects and derive deviation bounds for the proposed method, providing optimal guarantees in the well-specified case, along with notable robustness properties against outliers and contamination. Our results provide theoretical validation for empirical observations on the robustness of Maximum Likelihood Label Shift.

Paper Structure

This paper contains 29 sections, 15 theorems, 94 equations.

Key Result

Proposition 3.1

When it exists, the Maximum Likelihood Estimator $\hat{\beta}_{MLE}$ is a $\rho$-estimator with respect to $\mathcal{M}_{mix}(q_1,\dots,q_k)$.
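
To make the object in Proposition 3.1 concrete, here is a minimal sketch of a maximum likelihood estimate of mixture weights over a model of the form $\mathcal{M}_{mix}(q_1,\dots,q_k)$, assuming the component densities $q_j$ are known exactly and can be evaluated at every target sample. It uses the standard EM fixed-point iteration for mixture proportions, which is one common way to compute such an estimate and is not claimed to be the paper's own procedure. The function name `mixture_weights_em`, the matrix `Q`, and the two-Gaussian toy data are all illustrative assumptions.

```python
import numpy as np


def mixture_weights_em(Q, n_iter=500, tol=1e-10):
    """Estimate mixture weights by EM when the component densities are known.

    Q[i, j] is assumed to hold q_j(x_i), the density of component j evaluated
    at the i-th unlabeled target sample. The returned vector lies on the
    simplex and is a fixed point of the EM update for the log-likelihood
    sum_i log(sum_j beta_j * Q[i, j]).
    """
    n, k = Q.shape
    beta = np.full(k, 1.0 / k)  # start from uniform weights
    for _ in range(n_iter):
        weighted = Q * beta  # shape (n, k): beta_j * q_j(x_i)
        # E-step: posterior probability of each component for each sample
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new weights are the average responsibilities
        new_beta = resp.mean(axis=0)
        converged = np.abs(new_beta - beta).max() < tol
        beta = new_beta
        if converged:
            break
    return beta


# Toy example (hypothetical data): two unit-variance Gaussian classes.
rng = np.random.default_rng(0)
means = np.array([-2.0, 2.0])
true_beta = np.array([0.3, 0.7])
n = 5000

labels = rng.choice(2, size=n, p=true_beta)
x = rng.normal(means[labels], 1.0)

# Evaluate each known class-conditional density at each sample.
Q = np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2) / np.sqrt(2.0 * np.pi)

beta_hat = mixture_weights_em(Q)
print(beta_hat)
```

In this toy setting the model is well specified, so the estimate should land close to the true proportions. Replacing some rows of `Q` with values computed from outlying points is a simple way to probe the robustness behaviour the paper analyzes.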

Theorems & Definitions (18)

  • Example
  • Proposition 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Theorem 3.4
  • Corollary 3.5
  • Proposition 4.2
  • Theorem 4.3
  • Definition 4.4
  • Proposition 4.6
  • ...and 8 more