Efficient Rejection Sampling in the Entropy-Optimal Range

Thomas L. Draper; Feras A. Saad

TL;DR

This paper addresses exact sampling from finite discrete distributions using unbiased coin flips. It introduces the Amplified Loaded Dice Roller (ALDR), a family of rejection samplers that combines Knuth–Yao entropy-optimal ideas with rejection sampling. For depth parameter $K \ge 2k$, the expected entropy cost of ALDR falls in the entropy-optimal range, $H(P) \le \mathbb{E}[\mathscr{C}(\mathrm{ALDR})] < H(P) + 2$, while the sampler keeps linearithmic space $O(n\log m\log n)$ and practical preprocessing. The paper also analyzes ALDR's toll in detail, gives tight bounds for $K=2k$, and compares ALDR with FLDR and the alias method, supported by implementable integer-arithmetic algorithms and numerical results showing runtime and entropy improvements. Overall, the results give a scalable, exact sampler whose entropy efficiency suits hardware-constrained or entropy-sensitive environments and that improves on the alias method in the reported experiments. The paper also identifies limitations, such as cases where ALDR is not entropy-optimal and open questions about optimal amplification strategies.
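To make the setting concrete, here is a minimal sketch of plain rejection sampling from fair coin flips. It is a naive baseline, not ALDR or FLDR. It assumes the target distribution is given by nonnegative integer weights $a_1, \dots, a_n$ summing to $m$, and that $k$ is the smallest integer with $2^k \ge m$; this reading of the notation is an assumption, since the summary does not define it. The names `rejection_sample` and `flip` are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def rejection_sample(weights, flip=lambda: random.getrandbits(1)):
    """Draw index i with probability weights[i] / sum(weights).

    weights: nonnegative integers with a positive sum (assumed input format).
    flip:    callable returning one unbiased bit (0 or 1).
    Returns (index, number_of_flips_used).
    """
    m = sum(weights)
    k = max(1, (m - 1).bit_length())  # smallest k >= 1 with 2**k >= m
    cumulative = list(accumulate(weights))
    flips = 0
    while True:
        # Build a uniform integer u in [0, 2**k) from k coin flips.
        u = 0
        for _ in range(k):
            u = (u << 1) | flip()
            flips += 1
        # Accept if u lands in [0, m); otherwise discard all k flips and retry.
        if u < m:
            # Map u to the outcome whose cumulative-weight interval contains it.
            return bisect_right(cumulative, u), flips
```

For weights $(1, 2, 3)$ we have $m = 6$ and $k = 3$, so each trial is accepted with probability $6/8$ and the expected cost is $3 \cdot 8/6 = 4$ flips. This baseline discards every flip of a rejected trial, which is the kind of waste that more careful rejection samplers aim to reduce.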

Abstract

The problem of generating a random variate $X$ from a finite discrete probability distribution $P$ using an entropy source of independent unbiased coin flips is considered. The Knuth and Yao complexity theory of nonuniform random number generation furnishes a family of "entropy-optimal" sampling algorithms that consume between $H(P)$ and $H(P)+2$ coin flips per generated output, where $H$ is the Shannon entropy function. However, the space complexity of entropy-optimal samplers scales exponentially with the number of bits required to encode $P$. This article introduces a family of efficient rejection samplers and characterizes their entropy, space, and time complexity. Within this family is a distinguished sampling algorithm that requires linearithmic space and preprocessing time, and whose expected entropy cost always falls in the entropy-optimal range $[H(P), H(P)+2)$. No previous sampler for discrete probability distributions is known to achieve these characteristics. Numerical experiments demonstrate performance improvements in runtime and entropy of the proposed algorithm compared to the celebrated alias method.
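As a small illustration of the entropy-optimal range mentioned above, the following sketch computes $H(P)$ in bits and the interval $[H(P), H(P)+2)$ for a distribution given by integer weights. The integer-weight input format and the function names `shannon_entropy` and `entropy_optimal_range` are assumptions made for this example, not part of the paper.

```python
from math import log2


def shannon_entropy(weights):
    """Shannon entropy in bits of P = (a_1/m, ..., a_n/m), where m = sum(weights).

    Zero weights contribute nothing, following the convention 0 * log(0) = 0.
    """
    m = sum(weights)
    return -sum((a / m) * log2(a / m) for a in weights if a > 0)


def entropy_optimal_range(weights):
    """Return (H(P), H(P) + 2), the endpoints of the half-open range [H(P), H(P) + 2)."""
    h = shannon_entropy(weights)
    return h, h + 2


if __name__ == "__main__":
    low, high = entropy_optimal_range([1, 2, 3])
    print(f"expected flips of an entropy-optimal sampler lie in [{low:.3f}, {high:.3f})")
```

For weights $(1, 2, 3)$, $H(P) \approx 1.459$ bits, so a sampler whose expected cost is in the entropy-optimal range uses fewer than about $3.459$ coin flips per output on average.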
