On the admissibility of bounds on the mean of discrete, scalar probability distributions from an iid sample

Erik Learned-Miller

TL;DR

This work studies lower bounds on the mean of a discrete distribution with known finite support from an IID sample by formulating valid bounds with controlled error $\alpha$. It introduces a central optimization framework that, conditioned on a fixed sample ordering, yields order-conditioned optimal bounds via the minimization over distributions that place sufficient mass on upper sets; the approach relies on the multinomial likelihood and the closed/open simplex structure. The paper proves a complete characterization of admissible bounds, showing that admissible bounds arise from conditionally-optimal, order-specific bounds, with injective bounds always admissible while non-injective bounds may be inadmissible if they exhibit breakable ties and admissible if ties are unbreakable. It further demonstrates that, for sample spaces with at least two outcomes and sample sizes at least two, there is no globally optimal bound that uniformly dominates all others, implying a landscape of multiple admissible but non-dominating bounds (at most $(N-1)!$ such bounds). These results unify existing specific bounds (e.g., trinomial bounds) within a broader admissibility framework and lay groundwork for practical computation and approximation of admissible bounds in finite-support settings.

Abstract

We address the problem of producing a lower bound for the mean of a discrete probability distribution, with known support over a finite set of real numbers, from an iid sample of that distribution. Up to a constant, this is equivalent to bounding the mean of a multinomial distribution (with known support) from a sample of that distribution. Our main contribution is to characterize the complete set of admissible bound functions for any sample space, and to show that certain previously published bounds are admissible. We prove that the solution to each one of a set of simple-to-state optimization problems yields such an admissible bound. Single examples of such bounds, such as the trinomial bound by Miratrix and Stark [2009], have been previously published, but without an analysis of admissibility, and without a discussion of the full set of alternative admissible bounds. In addition to a variety of results about admissible bounds, we prove the non-existence of optimal bounds for sample spaces with supports of size greater than 1 and sample sizes greater than 1.

Paper Structure

This paper contains 28 sections, 19 theorems, 39 equations, 5 figures.

Key Result

Lemma 1.3

Let $\Omega$ be a sample space with $N$ elements and $1-\alpha$ a confidence level. Let $B$ be an injective bound over $\Omega$. Then for all distributions $\mathcal{F}$ over $\Omega$, there are exactly $N+1$ possible error sets for $B$.
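The count in Lemma 1.3 can be illustrated with a minimal sketch. It assumes that the error set of a bound $B$ for a distribution with mean $\mu$ is the set of samples with $B(\mathbf{x}) > \mu$, which is one reading of the definitions listed further down, and it ignores whether every threshold is attainable as a mean. The function name `possible_error_sets` and the bound values are hypothetical and chosen only for illustration.

```python
def possible_error_sets(bound):
    """Enumerate the distinct error sets {x : bound[x] > mu} as mu varies.

    `bound` maps each sample to its lower-bound value. Assumption: a sample
    is an error when the bound exceeds the true mean mu.
    """
    values = sorted(set(bound.values()))
    # One probe below every bound value, then one probe at each bound value.
    probes = [values[0] - 1.0] + values
    return {frozenset(x for x, b in bound.items() if b > mu) for mu in probes}


# Hypothetical injective bound on a sample space with N = 6 samples.
injective = {
    (0, 0): 0.00, (0, 1): 0.05, (1, 1): 0.20,
    (0, 3): 0.30, (1, 3): 0.60, (3, 3): 1.10,
}

# The same bound with one tie introduced, so it is no longer injective.
tied = dict(injective)
tied[(0, 1)] = 0.00

print(len(possible_error_sets(injective)))  # N + 1 = 7
print(len(possible_error_sets(tied)))       # 6, one fewer because of the tie
```

Under this reading, the error sets of an injective bound are the nested upper sets of the ordering it induces on the samples, which is why there are $N+1$ of them; a tie merges two of those sets.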

Figures (5)

  • Figure 1: Left. The multinomial likelihood function $L(\mathbf{p}|\mathbf{x})=\mathrm{Prob}(\mathbf{x}|\mathbf{p})$ for $\mathbf{x}=(0,0,1,3)$. Each point in the simplex gives the probability of obtaining that sample as an iid sample of size 4 from the corresponding probability distribution. The scale of the probabilities is shown on the right. Starting from the top of the triangle, and going clockwise, the three distributions at the corners of the triangle represent the distributions with all of their mass on the outcomes of 3, 1, and 0 respectively. Right. The set of likelihood functions for each of the 15 samples in the discrete simplex for the sample space $\Omega$ of Example \ref{ex:omega}.
  • Figure 2: Top. Various subsets, indicated by the yellow circles, of the full sample space. Bottom. The multinomial likelihoods of the subsets on the top. The black lines illustrate a particular isocontour of the probability function, which is relevant to the central optimization problem discussed below. Notice that each subset likelihood is a sum of some subset of the sample likelihood functions shown on the right side of Figure 1.
  • Figure 3: The top row of this figure shows three subsets (as indicated by the yellow dots) of a sample space $\Omega$ over a support set of size $3$ and a sample size of $n=3$. Below each subset, the multinomial set likelihood is shown. In each case, the red contour (if it is present) shows the isocontour for the multinomial likelihood where the probability of the subset in the top row is approximately equal to $\alpha=0.33$. The example on the left shows how the set $\mathcal{G}_k$ of distributions with likelihoods greater than $\alpha$ forms an open set in this case. In the middle, since all distributions have probability greater than $\alpha$, the set $\mathcal{G}_k$ represents the entire simplex, and is hence closed. On the right, the set $\mathcal{G}_k$ is semi-open, having an open boundary on the bottom and closed boundaries on the top.
  • Figure 4: Left. The left figure plots a polynomial function over the simplex corresponding to the probability of the subset $\{(1,1,3),(1,3,3),(3,3,3)\}$ of a sample space over the support $\{0,1,3\}$. The red line shows the set of distributions for which the likelihood of the set is $\alpha$ (here, $\alpha = 0.35$), and the white dotted line shows a set of distributions with the same mean (an isomean contour). The red dot shows the distribution in the simplex with minimum mean, subject to the constraint that the probability of the subset is greater than or equal to $\alpha$. Right. The right figure is similar but shows the same plot for a slightly larger subset of samples: $\{(0,1,3),(1,1,3),(1,3,3),(3,3,3)\}$. Note that the point in the left figure under which the probability was $\alpha$ now yields a substantially higher probability, and no longer represents an optimum. The optimal mean has moved to the mean of the yellow point's distribution.
  • Figure 5: Left. This figure illustrates several ideas. Each plot shows a multinomial likelihood for a different sample space subset over the same simplex (the set of probability distributions over a fixed support set). In each plot, the red curve shows the isocontour with probability equal to $\alpha$. The white dotted lines show isocontours of the mean for the minimum mean value satisfying the "probability-equals-$\alpha$" constraint. The plot on the left illustrates the central optimization problem for an error set that is the union of a set of samples $X$ augmented by another single sample $\mathbf{x}_A$. The distribution that achieves the minimum mean is shown with the red dot. Note that this optimum distribution is on the boundary of the simplex, and hence is not in the open simplex. The rightmost figure shows the optimization problem with both $\mathbf{x}_A$ and $\mathbf{x}_B$ added to $X$. Unlike cases in which an optimum with a smaller subset occurs in the open simplex, we see that the optimal distribution is still in the same place (red dot again). Hence, for this support set and this ordering of samples ($B(\mathbf{x}_A) \leq B(\mathbf{x}_B)$), the bound has ties and hence is not injective. Note that if we had instead chosen a total order such that $B(\mathbf{x}_B) \leq B(\mathbf{x}_A)$, the bound for $\mathbf{x}_B$ would be strictly better, as shown by the yellow dot in the central figure, and the bound for $\mathbf{x}_A$ would be left unchanged (red dot). Hence, this ordering would yield a strictly better bound (for the samples $\mathbf{x}_A$ and $\mathbf{x}_B$). This also implies that the bound with the original ordering is inadmissible.
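
The multinomial likelihood plotted in Figure 1 can be reproduced numerically. The following is a minimal sketch, assuming the support $\{0,1,3\}$ implied by the Figure 1 caption and treating a sample as an unordered multiset, so that the multinomial coefficient is included. The function name `multinomial_likelihood` and the distribution `p` are illustrative choices, not taken from the paper.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, prod


def multinomial_likelihood(sample, support, p):
    """Probability of drawing the unordered `sample` iid from `p` over `support`."""
    counts = Counter(sample)
    coef = factorial(len(sample))
    for v in support:
        coef //= factorial(counts[v])  # divide out permutations of repeated outcomes
    return coef * prod(p[i] ** counts[v] for i, v in enumerate(support))


support = (0, 1, 3)
p = (0.5, 0.25, 0.25)  # hypothetical distribution: one point of the simplex

# All unordered samples of size 4: the "discrete simplex" of Figure 1.
samples = list(combinations_with_replacement(support, 4))
print(len(samples))  # 15

likelihood = multinomial_likelihood((0, 0, 1, 3), support, p)
print(likelihood)  # 12 * 0.5**2 * 0.25 * 0.25 = 0.1875

# The likelihoods of all 15 samples sum to 1 for any fixed p.
total = sum(multinomial_likelihood(s, support, p) for s in samples)
print(abs(total - 1.0) < 1e-12)  # True
```

Summing `multinomial_likelihood` over a subset of `samples` gives the set likelihood shown in Figures 2 through 5, evaluated at a single distribution `p`.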

Theorems & Definitions (67)

  • Definition 1.1: sample space
  • Example 1.1
  • Example 1.2
  • Definition 1.2: multinomial likelihood function
  • Example 1.3
  • Definition 1.3: multinomial likelihood function for sets
  • Definition 1.4: lower bound with specified support
  • Definition 1.5: bound correctness for a particular sample and distribution
  • Definition 1.6: error set and valid set of a bound for a distribution and sample size
  • Definition 1.7: validity of a bound for a distribution, sample size, and confidence level
  • ...and 57 more