Concentration inequalities for the sum in sampling without replacement: an approach via majorization

Jianhang Ai, Ondřej Kuželka, Christos Pelekis

TL;DR

This paper derives non-asymptotic tail bounds for the sum $X_P$ of a $k$-subset drawn without replacement from a zero-sum population $P=(x_1,\dots,x_n)$. It develops a majorization-based framework that uses Schur-convexity to reduce the analysis to populations that are extremal in the majorization order, and it expresses the key quantities in terms of hypergeometric random variables. The authors obtain a universal lower bound on $\mathbb{P}(X_P>0)$ of order $\frac{k}{n}\sqrt{\frac{n-k}{nk}}$, and an upper bound on $\mathbb{P}(X_P\ge t)$ that depends on the population's absolute deviation $\alpha=\tfrac{1}{2}\sum_i |x_i|$, which complements the classical Hoeffding bound. The results clarify the role of majorization in concentration phenomena for sampling without replacement and connect to questions of Manickam–Miklós–Singhi (MMS) type in a non-asymptotic regime, with practical relevance when only coarse information about the population is available.
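For readers unfamiliar with the majorization order that the summary refers to, the following is a minimal sketch of the standard definition: $x$ majorizes $y$ when the two vectors have the same total and every partial sum of the decreasingly sorted $x$ is at least the corresponding partial sum of $y$. The function name `majorizes` and the tolerance parameter `tol` are illustrative choices, not taken from the paper.

```python
from itertools import accumulate


def majorizes(x, y, tol=1e-12):
    """Return True if x majorizes y under the standard definition.

    Hypothetical helper for illustration; `tol` absorbs floating-point noise.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # Partial sums of the entries sorted in decreasing order.
    px = list(accumulate(sorted(x, reverse=True)))
    py = list(accumulate(sorted(y, reverse=True)))
    # The totals must agree for the two vectors to be comparable.
    if abs(px[-1] - py[-1]) > tol:
        return False
    # Every partial sum of x must dominate the matching partial sum of y.
    return all(a >= b - tol for a, b in zip(px, py))


# Two zero-sum populations of size 4: the first is more "spread out".
spread = [3, -1, -1, -1]
balanced = [1, 1, -1, -1]
print(majorizes(spread, balanced))
print(majorizes(balanced, spread))
```

A Schur-convex function is one that does not decrease when its argument moves up in this order, which is why extremal populations in the order are natural candidates for worst cases.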

Abstract

Let $P=(x_1,\ldots,x_n)$ be a population consisting of $n\ge 2$ real numbers whose sum is zero, and let $k <n$ be a positive integer. We sample $k$ elements from $P$ without replacement and denote by $X_P$ the sum of the elements in our sample. In this article, using ideas from the theory of majorization, we deduce non-asymptotic lower and upper bounds on the probability that $X_P$ exceeds its expected value.
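The setup in the abstract can be made concrete on a small population by enumerating all $k$-subsets. The sketch below computes $\mathbb{P}(X_P>t)$ or $\mathbb{P}(X_P\ge t)$ exactly and, for comparison, evaluates the classical Hoeffding bound $\exp\!\big(-2t^2/(k(b-a)^2)\big)$ with $a=\min_i x_i$ and $b=\max_i x_i$. The function names `exact_tail` and `hoeffding_upper` are hypothetical, the population is assumed to have integer or `Fraction` entries so that the zero-sum check is exact, and the Hoeffding expression is the textbook form, which may differ in presentation from the statement used in the paper.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, exp


def exact_tail(population, k, t=0, strict=True):
    """Exact P(X_P > t) (or P(X_P >= t) if strict=False) by enumeration.

    Illustrative helper; only practical for small n because it visits
    all C(n, k) subsets. Assumes exact (int or Fraction) entries.
    """
    n = len(population)
    if sum(population) != 0:
        raise ValueError("population must sum to zero")
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    hits = 0
    # combinations() works on positions, so repeated values are counted
    # as distinct population elements, as sampling requires.
    for subset in combinations(population, k):
        s = sum(subset)
        if (s > t) if strict else (s >= t):
            hits += 1
    return Fraction(hits, comb(n, k))


def hoeffding_upper(population, k, t):
    """Textbook Hoeffding bound on P(X_P >= t) for t > 0 (assumed form)."""
    a, b = min(population), max(population)
    return exp(-2 * t**2 / (k * (b - a) ** 2))


pop = (3, -1, -1, -1)
print(exact_tail(pop, 1))                      # P(X_P > 0) for k = 1
print(exact_tail(pop, 2))                      # P(X_P > 0) for k = 2
print(float(exact_tail(pop, 2, t=2, strict=False)), hoeffding_upper(pop, 2, 2))
```

Enumeration is used here instead of Monte Carlo sampling so that the probabilities are exact rationals; for larger $n$ a simulation with `random.sample` would be the practical substitute.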

Paper Structure

This paper contains 8 sections, 17 theorems, 75 equations, 1 figure.

Key Result

Theorem 1

Let $P=(x_1,\ldots,x_n)$ be a population of size $n\ge 2$ such that $\sum_{i=1}^{n} x_i=0$. Let $X_P$ denote the sum of $k\in [n-1]$ elements that are sampled without replacement from $P$. If $a = \min_{i \in [n]} x_i$ and $b = \max_{i\in [n]} x_i$, then

Figures (1)

  • Figure 1: Comparison of the bounds for different values of $\varepsilon$

Theorems & Definitions (29)

  • Theorem 1: Hoeffding
  • Theorem 2: Pokrovskiy
  • Theorem 3
  • Corollary 1
  • Theorem 4
  • Lemma 1: Folklore
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • ...and 19 more