Comments on Friedman's Method for Class Distribution Estimation

Dirk Tasche

TL;DR

This work reframes class distribution estimation under prior probability shift as a problem of designing linear equation systems and analyzes Friedman's method within a covariance-based framework. It establishes a fundamental limit: the full covariance matrix $\Sigma_P$ of the training posterior probabilities is singular, so an $\ell\times\ell$ system has no unique solution, whereas a system of $\ell-1$ equations with an invertible covariance matrix yields unique estimates. It also shows that DeBias and PAC coincide at the population level as binary instances of a covariance-based multivariate method, and situates Friedman's method as a robust, easy-to-implement alternative that can match or outperform other methods depending on the test-prior regime. In a semi-asymptotic binary setting, maximum likelihood remains the most efficient estimator, while Friedman's method performs more uniformly across values of $q_1$, which highlights a practical trade-off between low variance and insensitivity to the test prior.
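The singularity of $\Sigma_P$ can be seen from a one-line argument, sketched here under assumed notation: $\eta_y(X)$ is used as shorthand for the training posterior $P[Y=y \mid X]$, which may differ from the symbols used in the paper. Because the posteriors sum to one, every row of their covariance matrix sums to zero, so the all-ones vector lies in the null space.

```latex
% Assumed notation (not taken from the paper): \eta_y(X) = P[Y = y \mid X], y = 1, \dots, \ell
\sum_{j=1}^{\ell} \eta_j(X) = 1
\;\Longrightarrow\;
\sum_{j=1}^{\ell} \operatorname{cov}_P\bigl[\eta_i(X), \eta_j(X)\bigr]
  = \operatorname{cov}_P\bigl[\eta_i(X), 1\bigr] = 0
\quad \text{for all } i,
\qquad \text{i.e.} \qquad \Sigma_P \mathbf{1} = 0 .
```

Dropping one class removes this exact linear dependence, which is consistent with the $\ell-1$ equation approach described above.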

Abstract

The purpose of class distribution estimation (also known as quantification) is to determine the values of the prior class probabilities in a test dataset without class label observations. A variety of methods to achieve this have been proposed in the literature, most of them based on the assumption that the distributions of the training and test data are related through prior probability shift (also known as label shift). Among these methods, Friedman's method has recently been found to perform relatively well both for binary and multi-class quantification. We discuss the properties of Friedman's method and another approach mentioned by Friedman (called DeBias method in the literature) in the context of a general framework for designing linear equation systems for class distribution estimation.
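As an illustration of the linear-equation view, the binary case reduces to a single equation. Under prior probability shift the class-conditional distributions of the features are the same in training and test data, so for any score function $g$ the test mean satisfies $E_Q[g(X)] = q_1\,E_P[g(X) \mid Y=1] + (1-q_1)\,E_P[g(X) \mid Y=0]$. The sketch below solves this equation with sample means. The function name `estimate_prior_binary` and its arguments are hypothetical, and the choice of $g$ is left open: this is a generic instance of the approach, not the paper's code or Friedman's specific construction.

```python
def estimate_prior_binary(g_train, y_train, g_test):
    """Estimate the test prior q1 = Q[Y=1] from one linear equation.

    g_train: scores g(x) on labelled training data (hypothetical input)
    y_train: training labels, each 0 or 1
    g_test:  scores g(x) on unlabelled test data
    """
    pos = [g for g, y in zip(g_train, y_train) if y == 1]
    neg = [g for g, y in zip(g_train, y_train) if y == 0]
    if not pos or not neg:
        raise ValueError("training data must contain both classes")

    mu1 = sum(pos) / len(pos)  # sample estimate of E_P[g | Y=1]
    mu0 = sum(neg) / len(neg)  # sample estimate of E_P[g | Y=0]
    if mu1 == mu0:
        # The equation has no unique solution if g does not separate the classes.
        raise ValueError("class-conditional means coincide; q1 is not identified")

    mean_test = sum(g_test) / len(g_test)  # sample estimate of E_Q[g]
    q1 = (mean_test - mu0) / (mu1 - mu0)
    # Sampling noise can push the raw solution outside [0, 1]; clip it.
    return min(1.0, max(0.0, q1))


if __name__ == "__main__":
    # Toy data: g is a 0/1 classifier output.
    g_train = [1, 1, 1, 0, 0, 0, 0, 1]
    y_train = [1, 1, 1, 1, 0, 0, 0, 0]
    g_test = [1] * 6 + [0] * 4
    print(round(estimate_prior_binary(g_train, y_train, g_test), 3))  # 0.7
```

With more than two classes the same idea gives a system of several such equations, and the design question discussed in the paper is which functions $g$ to use and how many equations are needed.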

Paper Structure (8 sections, 4 theorems, 27 equations, 1 figure)

Introduction
Setting
Linear equations for class distribution estimation
Friedman's method
Uniqueness of solutions and covariance matrix-based approaches
How many equations are needed?
Invertible covariance matrices
Comparing asymptotic variances

Key Result

Theorem

Let $p_y = P[Y=y]$ and $q_y = Q[Y=y]$ for $y \in \mathcal{Y}$. Suppose that $P$ and $Q$ are related through prior probability shift, in the sense of the paper's definition of prior probability shift, and that the random variable $Z$ is integrable under both $P$ and $Q$. Then it holds that … For sets $S$, define the indicator function … If $Z$ is $X$-measurable, i.e. if there is a function $f:\mathcal{X}\to \mathbb{R}$ such that $Z = …

Figures (1)

Figure 1: Asymptotic variances of the maximum likelihood, DeBias and Friedman estimators in a binormal model. See the paper's binormal example for the specification of the underlying model.

Theorems & Definitions (12)

Definition
Theorem
Proof
Remark
Proposition
Proof
Remark
Corollary
Corollary
Remark (DeBias method)
...and 2 more
