The Harmonic Entropy Estimator: Minimax Optimality and Semiparametric Efficiency for Infinite Alphabets

Octavio César Mesner

TL;DR

This work tackles Shannon entropy estimation for discrete distributions with countably infinite support by introducing the harmonic entropy estimator, built on exact identities that connect harmonic-transformed binomial counts to log-probabilities. It proves a sharp $L_2$-minimax rate of order $1/n$ for tails that decay at least quadratically ($p_j \lesssim j^{-2}$), extending finite-support results to infinite alphabets, and shows semiparametric efficiency under the stronger tail condition $p_j = o(j^{-2})$, with $\sqrt{n}(\hat H - H) \Rightarrow N(0, \mathrm{Var}[\log p(X)])$. The result is a simple, one-step estimator with precise bias and variance characterizations that attains the statistical limits for entropy estimation over these tail classes. The results unify finite-variance and certain monotone tail distributions, and provide a basis for practical inference and for possible extensions to continuous or mixed settings.

Abstract

This paper considers the estimation of Shannon entropy for discrete distributions with countably infinite support. While minimax rates for finite-support distributions are established, infinite-support distributions present distinct challenges regarding bias control as probabilities vanish. We address this by introducing the harmonic entropy estimator, a statistic derived from an exact algebraic identity relating the expectation of harmonic-transformed binomial counts to the logarithm of underlying success probabilities. We establish two main results characterizing the statistical limits of this problem. First, for the class of distributions with at least quadratically decaying tails ($p_j \lesssim j^{-2}$), we prove that the estimator achieves the parametric $L_2$-minimax convergence rate of order $1/n$. Second, under the stronger condition $p_j = o(j^{-2})$, we demonstrate that the estimator is semiparametrically efficient, converging to a normal distribution with variance matching the asymptotic efficiency bound $\mathrm{Var}[\log p(X)]$. These results unify entropy estimation theory for finite-variance distributions, and provide a simple, one-step estimator with sharp theoretical guarantees.
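The abstract does not reproduce the estimator's definition, so the following is only an illustrative sketch of one classical identity of the kind it describes, and it may differ from the paper's construction. For $X \sim \mathrm{Binomial}(n, p)$ and harmonic numbers $H_m = \sum_{k=1}^{m} 1/k$, one has exactly $\mathbb{E}\big[\tfrac{X}{n}(H_{n-1} - H_{X-1})\big] = p \sum_{k=1}^{n-1} (1-p)^k / k$, the truncated series for $-p \log p$. The function names below (`harmonic_entropy_sketch`, `expected_term`, `truncated_series`) are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def _harmonic_table(n: int) -> list:
    """Return [H_0, H_1, ..., H_{n-1}] with H_0 = 0 and H_m = 1 + 1/2 + ... + 1/m."""
    table = [0.0] * n
    for m in range(1, n):
        table[m] = table[m - 1] + 1.0 / m
    return table


def harmonic_entropy_sketch(sample) -> float:
    """Entropy estimate in nats: sum over observed symbols of (X_j/n) * (H_{n-1} - H_{X_j-1}).

    Assumption: this harmonic-number form is an illustration, not necessarily
    the estimator defined in the paper.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    H = _harmonic_table(n)
    counts = Counter(sample)
    return sum((x / n) * (H[n - 1] - H[x - 1]) for x in counts.values())


def expected_term(n: int, p: float) -> float:
    """E[(X/n) * (H_{n-1} - H_{X-1})] for X ~ Binomial(n, p), by direct summation."""
    H = _harmonic_table(n)
    return sum(
        math.comb(n, x) * p**x * (1 - p) ** (n - x) * (x / n) * (H[n - 1] - H[x - 1])
        for x in range(1, n + 1)
    )


def truncated_series(n: int, p: float) -> float:
    """p * sum_{k=1}^{n-1} (1-p)^k / k, the first n-1 terms of the series for -p*log(p)."""
    return p * sum((1 - p) ** k / k for k in range(1, n))
```

Under these assumptions, `expected_term(n, p)` and `truncated_series(n, p)` agree up to floating-point error for any `n >= 1` and `0 < p < 1`, and the gap between `truncated_series(n, p)` and `-p * math.log(p)` is the per-symbol bias, which shrinks as `n` grows.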

Paper Structure

This paper contains 14 sections, 20 theorems, 110 equations, 4 figures.

Key Result

Theorem 1.1

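Let $\mathcal{H}$ be the set of entropy estimators and $\mathcal{P} := \{p: p_j \lesssim j^{-2}\}$ be the class of distributions with at least quadratically decaying tails. Then, as $n\rightarrow\infty$, the $L_2$-minimax risk over $\mathcal{P}$, $\inf_{\hat H \in \mathcal{H}} \sup_{p \in \mathcal{P}} \mathbb{E}[(\hat H - H)^2]$, is of order $1/n$.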

Figures (4)

  • Figure 1: Multinomial distribution with probability vector $(0.5, 0.2, 0.15, 0.1, 0.05)$. Left: mean squared error (MSE) by sample size. Right: corresponding violin plots.
  • Figure 2: Uniform distribution with $S=500$. Left: mean squared error (MSE) by sample size. Right: corresponding violin plots.
  • Figure 3: Geometric distribution with parameter $p=0.1$. Left: mean squared error (MSE) by sample size. Right: corresponding violin plots.
  • Figure 4: Zeta distribution with parameter $\gamma=2$. Left: mean squared error (MSE) by sample size. Right: corresponding violin plots.

Theorems & Definitions (44)

  • Theorem 1.1: $L_2$-Minimax Risk
  • Theorem 1.2: Asymptotic Normality
  • Theorem 2.1: $L_2$-Risk Upper Bound
  • Proposition 2.2
  • Proof: Proof Sketch
  • Proposition 2.3
  • Proof: Proof Sketch
  • Theorem 2.4
  • Proof: Proof Sketch
  • Proposition 3.1: $L_2$-Risk Lower Bound
  • ...and 34 more