Concentration inequalities for the sum in sampling without replacement: an approach via majorization
Jianhang Ai, Ondřej Kuželka, Christos Pelekis
TL;DR
This paper derives non-asymptotic tail bounds for the sum $X_P$ of $k$ elements drawn without replacement from a zero-sum population $P=(x_1,\dots,x_n)$. It develops a majorization-based framework that uses Schur-convexity to reduce the analysis to configurations that are extremal with respect to majorization, and it expresses the key quantities in terms of hypergeometric random variables. The authors obtain a universal lower bound on $\mathbb{P}(X_P>0)$ of order $\frac{k}{n}\sqrt{\frac{n-k}{nk}}$, and an upper bound on $\mathbb{P}(X_P\ge t)$ that depends on the population’s absolute deviation $\alpha=\tfrac{1}{2}\sum_i |x_i|$, which offers an alternative to classical Hoeffding bounds. The results illuminate the role of majorization in concentration phenomena for sampling without replacement and connect to MMS-type questions in a non-asymptotic regime; they are of practical use when only coarse information about the population is available.
Abstract
Let $P=(x_1,\ldots,x_n)$ be a population consisting of $n\ge 2$ real numbers whose sum is zero, and let $k<n$ be a positive integer. We sample $k$ elements from $P$ without replacement and denote by $X_P$ the sum of the elements in our sample. In this article, using ideas from the theory of majorization, we deduce non-asymptotic lower and upper bounds on the probability that $X_P$ exceeds its expected value.
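To make the setting concrete, the following is a minimal sketch (not taken from the paper) that computes $\mathbb{P}(X_P>t)$ exactly for a small population by enumerating all $k$-subsets. Since each element is included in the sample with probability $k/n$, the expected value of $X_P$ is $\frac{k}{n}\sum_i x_i = 0$, so $t=0$ corresponds to the event studied in the abstract. The function name `tail_probability` and the example population are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def tail_probability(population, k, t=0, strict=True):
    """Exact P(X_P > t), or P(X_P >= t) when strict=False.

    Brute-force enumeration of all k-subsets (by index, so repeated
    values are treated as distinct elements). Only practical for small n;
    this is an illustration, not the method used in the paper.
    """
    n = len(population)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    hits = 0
    total = 0
    for idx in combinations(range(n), k):
        s = sum(population[i] for i in idx)
        total += 1
        if s > t or (not strict and s == t):
            hits += 1
    return Fraction(hits, total)


# Hypothetical zero-sum population: one large value, the rest equal.
population = [3, -1, -1, -1]
for k in (1, 2, 3):
    print(k, tail_probability(population, k))
```

For this population the sample sum is positive exactly when the sample contains the element 3, which happens with probability $k/n$; the enumeration returns 1/4, 1/2 and 3/4 for $k=1,2,3$.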
