Algorithms for Generating Small Random Samples

Vincent A. Cicirello

TL;DR

This paper addresses the problem of efficiently generating small random samples without replacement, focusing on the special cases $k=2$ and $k=3$. It introduces constant-time algorithms, RandomPair$(n)$ and RandomTriple$(n)$, that draw two and three random numbers, respectively, and resolve duplicates so that the result remains uniformly distributed, with potential extensions to larger fixed $k$ via sampling networks. The author benchmarks these methods against standard general-purpose algorithms (reservoir, pool, and insertion sampling) using the $\rho\mu$ Java library and JMH, demonstrating substantial speedups for small $k$ and offering insights into hardware-friendly implementations. The work provides practical, open-source implementations and reproducible experiments, highlighting the impact on fast sampling tasks in areas like evolutionary algorithms and permutation-related computations.

Abstract

We present algorithms for generating small random samples without replacement. We consider two cases. We present an algorithm for sampling a pair of distinct integers, and an algorithm for sampling a triple of distinct integers. The worst-case runtime of both algorithms is constant, while the worst-case runtimes of common algorithms for the general case of sampling $k$ elements from a set of $n$ increase with $n$. Java implementations of both algorithms are included in the open source library $\rho\mu$.
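To make the constant-time idea concrete, the following is a minimal Java sketch of one way to sample a pair or a triple of distinct integers from $\{0, \ldots, n-1\}$ with a fixed number of random draws. It is an illustration under stated assumptions, not necessarily the paper's RandomPair$(n)$ and RandomTriple$(n)$ and not the $\rho\mu$ API: the class name `SmallSamples` and all method names are hypothetical, and uniformity relies on `java.util.Random.nextInt(bound)` being uniform over $[0, \text{bound})$. Each draw uses a bound one smaller than the previous one, and a collision is redirected to a value the draw could not otherwise have produced, so every ordered tuple of distinct values corresponds to exactly one combination of raw draws.

```java
import java.util.Random;

/** Illustrative sketch only; names are hypothetical, not the rho-mu API. */
final class SmallSamples {

    /** Maps raw draws i in [0, n) and j in [0, n-1) to an ordered pair of distinct values. */
    static int[] resolvePair(int n, int i, int j) {
        // j can never be n-1 on its own, so n-1 is a safe substitute on collision.
        if (j == i) {
            j = n - 1;
        }
        return new int[] {i, j};
    }

    /** Maps raw draws i in [0, n), j in [0, n-1), k in [0, n-2) to a distinct ordered triple. */
    static int[] resolveTriple(int n, int i, int j, int k) {
        if (j == i) {
            j = n - 1;
        }
        // k can never be n-2 or n-1 on its own; on collision, pick whichever
        // of those two values is not already taken by i or j.
        if (k == i) {
            k = (j == n - 2) ? n - 1 : n - 2;
        } else if (k == j) {
            k = (i == n - 1) ? n - 2 : n - 1;
        }
        return new int[] {i, j, k};
    }

    /** Samples an ordered pair of distinct integers from [0, n) using two random draws. */
    static int[] randomPair(int n, Random rng) {
        if (n < 2) {
            throw new IllegalArgumentException("n must be at least 2");
        }
        return resolvePair(n, rng.nextInt(n), rng.nextInt(n - 1));
    }

    /** Samples an ordered triple of distinct integers from [0, n) using three random draws. */
    static int[] randomTriple(int n, Random rng) {
        if (n < 3) {
            throw new IllegalArgumentException("n must be at least 3");
        }
        return resolveTriple(n, rng.nextInt(n), rng.nextInt(n - 1), rng.nextInt(n - 2));
    }
}
```

A call such as `SmallSamples.randomTriple(100, new Random())` returns three distinct indices after exactly three draws and a handful of comparisons, independent of $n$. Separating the duplicate-resolution step from the random draws is a design choice made here so that the mapping can be checked exhaustively for small $n$.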

Paper Structure (9 sections, 6 tables, 7 algorithms)