Distribution Estimation under the Infinity Norm

Aryeh Kontorovich, Amichai Painsky

TL;DR

Novel bounds for estimating discrete probability distributions under the $\ell_\infty$ norm are presented, and the accompanying data-dependent convergence guarantees for the maximum likelihood estimator significantly improve upon the currently known results.

Abstract

We present novel bounds for estimating discrete probability distributions under the $\ell_\infty$ norm. These are nearly optimal in various precise senses, including a kind of instance-optimality. Our data-dependent convergence guarantees for the maximum likelihood estimator significantly improve upon the currently known results. A variety of techniques are utilized and innovated upon, including Chernoff-type inequalities and empirical Bernstein bounds. We illustrate our results in synthetic and real-world experiments. Finally, we apply our proposed framework to a basic selective inference problem, where we estimate the most frequent probabilities in a sample.
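To make the estimation setting concrete, the following is a minimal sketch, not taken from the paper, of the two quantities the abstract refers to: the maximum likelihood estimator of a discrete distribution (the empirical frequencies) and its $\ell_\infty$ distance from the true distribution. The distribution `p`, the sample size, the seed, and the helper names `empirical_distribution` and `sup_norm_error` are all assumptions chosen for illustration; the sketch does not implement any of the paper's bounds.

```python
import random
from collections import Counter


def empirical_distribution(sample):
    """MLE of a discrete distribution: the relative frequency of each symbol."""
    n = len(sample)
    counts = Counter(sample)
    return {symbol: count / n for symbol, count in counts.items()}


def sup_norm_error(p_hat, p):
    """ell_infinity distance: the largest absolute gap over all symbols."""
    support = set(p_hat) | set(p)
    return max(abs(p_hat.get(s, 0.0) - p.get(s, 0.0)) for s in support)


# Hypothetical true distribution over four symbols (an assumption for illustration).
p = {0: 0.5, 1: 0.25, 2: 0.15, 3: 0.1}

# Draw n = 1000 independent observations from p with a fixed seed.
rng = random.Random(0)
sample = rng.choices(list(p), weights=list(p.values()), k=1000)

p_hat = empirical_distribution(sample)
error = sup_norm_error(p_hat, p)

# The most frequent symbol in the sample, as in the selective inference setting.
most_frequent = max(p_hat, key=p_hat.get)
print(f"sup-norm error: {error:.4f}, most frequent symbol: {most_frequent}")
```

In practice the true `p` is unknown, which is why data-dependent guarantees on this error, computed from the sample alone, are of interest.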

Paper Structure (23 sections, 20 theorems, 117 equations, 3 figures)

Key Result

Theorem 1

Let $p=(p_i)_{i\in \mathbb{N}}$ be a distribution over $\mathbb{N}$. Let $X^n$ be a sample of $n$ independent observations from $p$. Let $\hat{p}(X^n)$ be the MLE of $p$. Then, with probability $1-\delta$, for every even $m>0$.

Figures (3)

  • Figure 1: The proposed bounds compared to the benchmark and to an Oracle, as $n$ grows and $\delta=0.05$
  • Figure 2: The proposed bounds compared to the benchmark and to an Oracle, as $n$ grows and $\delta=1/n^2$
  • Figure 3: The proposed bounds compared to the benchmark and to an Oracle, as $n$ grows. The lower bound for the most frequent symbol corresponds to Theorem \ref{selective_inference_LB}

Theorems & Definitions (35)

  • Theorem 1
  • Theorem 2
  • Corollary 2.1
  • Theorem 3
  • Remark 4.1
  • Theorem 4
  • Remark 4.2
  • Proposition 5
  • Proposition 6
  • Remark 4.3
  • ...and 25 more