On Semi-supervised Estimation of Discrete Distributions under f-divergences

Hasan Sabri Melihcan Erol, Lizhong Zheng

TL;DR

The paper studies semi-supervised estimation of a joint distribution $p_{XY}$ from a mix of labeled and unlabeled samples in the minimax framework. It shows that composing univariate minimax estimators attains the minimax risk with the optimal first-order constant under $l^p_p$ losses for $1 \le p \le 2$, and it extends this result to a family of $f$-divergences that includes KL, $\chi^2$, squared Hellinger, and Le Cam. The authors derive explicit rates and constants, such as $R^p_{m,n} = |\mathcal X|^{1-p/2} C_p m^{-p/2}$ and $R^f_{m,n} = |\mathcal X| C_f / m$, and prove that the composition estimators are minimax optimal in the semi-supervised setting. These results give rigorous guarantees for discrete pmf estimation when unlabeled data are abundant and labeling is costly.
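
For readers less familiar with the loss functions named above, the following is a minimal sketch of how they can be evaluated for two discrete pmfs. It assumes the common convention $D_f(p \| q) = \sum_x q(x) f(p(x)/q(x))$ with $q(x) > 0$; normalizations of squared Hellinger and Le Cam differ between authors, so the constants here are an assumption and may not match the paper's. The names `f_divergence`, `GENERATORS`, and `lp_p_loss` are illustrative and do not come from the paper.

```python
import numpy as np


def f_divergence(p, q, f):
    """D_f(p || q) = sum_x q(x) * f(p(x) / q(x)); assumes q(x) > 0 for all x."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * f(p / q)))


def _kl_generator(t):
    # f(t) = t * log(t), with the usual convention 0 * log(0) = 0.
    safe_t = np.where(t > 0, t, 1.0)
    return np.where(t > 0, t * np.log(safe_t), 0.0)


# Convex generators with f(1) = 0 (normalizations are one common choice).
GENERATORS = {
    "kl": _kl_generator,                                     # sum p log(p/q)
    "chi_square": lambda t: (t - 1.0) ** 2,                  # sum (p-q)^2 / q
    "squared_hellinger": lambda t: (np.sqrt(t) - 1.0) ** 2,  # sum (sqrt p - sqrt q)^2
    "le_cam": lambda t: (1.0 - t) ** 2 / (2.0 * t + 2.0),    # sum (p-q)^2 / (2(p+q))
}


def lp_p_loss(p, q, power):
    """l^p_p loss: sum_x |p(x) - q(x)|^power."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.abs(p - q) ** power))


if __name__ == "__main__":
    p_true = [0.5, 0.5]
    q_est = [0.25, 0.75]
    for name, gen in GENERATORS.items():
        print(name, f_divergence(p_true, q_est, gen))
    print("l1", lp_p_loss(p_true, q_est, 1))
```

Each divergence is zero when the two pmfs coincide and positive otherwise, which is the property the risk definitions rely on.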

Abstract

We study the problem of estimating the joint probability mass function (pmf) over two random variables. In particular, the estimation is based on the observation of $m$ samples containing both variables and $n$ samples missing one fixed variable. We adopt the minimax framework with $l^p_p$ loss functions. Recent work established that combinations of univariate minimax estimators achieve the minimax risk with the optimal first-order constant for $p \ge 2$ in the regime $m = o(n)$, but questions remained for $p \le 2$ and for various $f$-divergences. We show that these composite estimators are indeed minimax optimal for $l^p_p$ loss functions in the range $1 \le p \le 2$, including the critical $l_1$ loss. We also establish their optimality for a suite of $f$-divergences, such as KL, $\chi^2$, squared Hellinger, and Le Cam divergences.
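
The idea of combining univariate estimators can be sketched as follows: estimate the marginal of $X$ from the $X$-values, estimate each conditional pmf of $Y$ given $X = x$ from the labeled pairs, and multiply the two. The sketch below is an illustration under stated assumptions, not the authors' construction. It uses the classical add-$\sqrt{N}/k$ estimator, which is minimax for squared $l_2$ loss on a $k$-symbol alphabet, as a stand-in for the univariate estimator; it pools the $X$-values of both labeled and unlabeled samples for the marginal; and it falls back to a uniform conditional when no labeled sample has a given $x$. The function names are hypothetical.

```python
import numpy as np


def sqrt_smoothed_pmf(counts):
    """Add-sqrt(N)/k estimator: (N_i + sqrt(N)/k) / (N + sqrt(N)).

    Stand-in univariate estimator (minimax for squared l2 loss).
    Returns the uniform pmf when there are no observations (an assumption).
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    k = counts.size
    if total == 0:
        return np.full(k, 1.0 / k)
    root = np.sqrt(total)
    return (counts + root / k) / (total + root)


def composition_estimate(labeled, unlabeled_x, kx, ky):
    """Estimate the joint pmf as p_X(x) * p_{Y|X}(y | x).

    labeled:     sequence of (x, y) pairs, the m complete samples
    unlabeled_x: sequence of x values, the n samples missing y
    kx, ky:      alphabet sizes; symbols are 0..kx-1 and 0..ky-1
    """
    labeled = np.asarray(labeled, dtype=int).reshape(-1, 2)
    unlabeled_x = np.asarray(unlabeled_x, dtype=int)

    # Marginal of X from all m + n observed x values (assumption of this sketch).
    all_x = np.concatenate([labeled[:, 0], unlabeled_x])
    p_x = sqrt_smoothed_pmf(np.bincount(all_x, minlength=kx))

    # Conditional of Y given X = x from the labeled pairs only.
    joint_counts = np.zeros((kx, ky))
    np.add.at(joint_counts, (labeled[:, 0], labeled[:, 1]), 1.0)
    p_y_given_x = np.vstack([sqrt_smoothed_pmf(row) for row in joint_counts])

    return p_x[:, None] * p_y_given_x


if __name__ == "__main__":
    pairs = [(0, 0), (0, 1), (1, 1), (1, 1)]
    xs_only = [0, 0, 1, 0, 0]
    print(composition_estimate(pairs, xs_only, kx=2, ky=2))
```

Because each factor is itself a valid pmf, the product is a valid joint pmf without any renormalization step.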

Paper Structure

This paper contains 17 sections, 13 theorems, 26 equations.

Key Result

Theorem 1

Let $\hat{q}^*_{n}$ be a minimax optimal estimator for $r^p_n$. Then the conditional composition $\hat{q}^{*,m}_{Y\mid X}$ based on $\hat{q}^{*}_n$ is minimax optimal for $\bar{R}^p_m$:

Theorems & Definitions (13)

  • Theorem 1: Theorem 1 of onsemsup
  • Theorem 2
  • Theorem 3: Theorem 3 of onsemsup
  • Theorem 4
  • Theorem 5
  • Corollary 1
  • Theorem 6
  • Theorem 7
  • Lemma 1
  • Lemma 2
  • ...and 3 more