Batched Ranged Random Integer Generation

Nevin Brackett-Rozinsky; Daniel Lemire

TL;DR

The paper addresses the efficient conversion of $L$-bit random words into multiple unbiased bounded integers, extending Lemire's nearly division-free method to batched outputs via mixed-radix arithmetic. The key idea is to multiply the random word by each bound $b_i$ in turn: the high $L$ bits of each product give one output digit, and the low $L$ bits carry over to the next multiplication. If the final low $L$ bits pass a threshold test, the digits are independent and uniform; read together as a mixed-radix number, they form a single uniform value below the product of the bounds. This yields several bounded random integers per word with zero divisions in the common case and fewer random-number generator calls, which translates into substantial speedups for batched shuffling tasks such as Fisher-Yates. Empirically, the method yields 1.5–4.5x speedups depending on hardware and RNG cost, and instruction counts confirm that fewer instructions are retired per element, owing to fewer RNG calls and divisions. The work suggests broad applicability to sampling and simulations, and provides C implementations and a methodology for hardware-aware optimization.
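The multiply-and-carry idea can be illustrated for a batch of two values with $L = 64$. The following is a minimal sketch, not the paper's own code: the names `demo_rng64` and `batched_pair` are hypothetical, the generator is a splitmix64 stand-in, and it assumes that both bounds are nonzero, that their product fits in 64 bits, and that the compiler provides `__uint128_t` (GCC or Clang).

```c
#include <stdint.h>

/* Hypothetical stand-in generator (splitmix64); any uniform 64-bit
   source could be substituted. */
static uint64_t demo_state = 0x9E3779B97F4A7C15ULL;
static uint64_t demo_rng64(void) {
    uint64_t z = (demo_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Draw r1 in [0, b1) and r2 in [0, b2) from one 64-bit word.
   Assumption: b1 > 0, b2 > 0, and b1 * b2 does not overflow 64 bits. */
static void batched_pair(uint64_t b1, uint64_t b2,
                         uint64_t *r1, uint64_t *r2) {
    uint64_t product = b1 * b2;
    __uint128_t m;
    uint64_t low;

    /* High 64 bits give a digit; low 64 bits carry to the next step. */
    m = (__uint128_t)demo_rng64() * b1;
    *r1 = (uint64_t)(m >> 64);
    low = (uint64_t)m;
    m = (__uint128_t)low * b2;
    *r2 = (uint64_t)(m >> 64);
    low = (uint64_t)m;

    if (low < product) {                      /* rare slow path */
        /* 2^64 mod product, computed in 64-bit arithmetic */
        uint64_t threshold = (0 - product) % product;
        while (low < threshold) {             /* reject and redraw both */
            m = (__uint128_t)demo_rng64() * b1;
            *r1 = (uint64_t)(m >> 64);
            low = (uint64_t)m;
            m = (__uint128_t)low * b2;
            *r2 = (uint64_t)(m >> 64);
            low = (uint64_t)m;
        }
    }
}
```

When the product of the bounds is small relative to $2^{64}$, the slow path is almost never taken, so the modulo is rarely evaluated and each output costs one multiplication.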

Abstract

Pseudorandom values are often generated as 64-bit binary words. These random words need to be converted into ranged values without statistical bias. We present an efficient algorithm to generate multiple independent uniformly-random bounded integers from a single uniformly-random binary word, without any bias. In the common case, our method uses one multiplication and no division operations per value produced. In practice, our algorithm can more than double the speed of unbiased random shuffling for small to moderately large arrays.

Paper Structure (14 sections, 3 theorems, 3 equations, 5 figures, 4 tables, 4 algorithms)

Introduction
Mathematical notation
Existing algorithms
Batched dice rolls
Mixed-radix numbers
Main result
Worked example
Implementation
Shuffling arrays
Batch sizes
Experiments
Instruction counts
Conclusion
Code samples

Key Result

lemma 1

Lemire's method produces uniformly random integers in the range $[0, b)$.
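For reference, the single-value method that the lemma refers to can be sketched in C as follows. This is an illustrative version for 64-bit words, not code taken from the paper: `sketch_next64` is a hypothetical splitmix64 stand-in generator, `lemire_bounded` is a hypothetical name, and the sketch assumes $b > 0$ and a compiler with `__uint128_t` (GCC or Clang).

```c
#include <stdint.h>

/* Hypothetical stand-in generator (splitmix64). */
static uint64_t sketch_state = 0x0123456789ABCDEFULL;
static uint64_t sketch_next64(void) {
    uint64_t z = (sketch_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Return a value in [0, b) without bias. Assumption: b > 0. */
static uint64_t lemire_bounded(uint64_t b) {
    __uint128_t m = (__uint128_t)sketch_next64() * b;
    uint64_t low = (uint64_t)m;          /* low 64 bits of the product */
    if (low < b) {                       /* only then might we reject */
        uint64_t threshold = (0 - b) % b;   /* 2^64 mod b */
        while (low < threshold) {
            m = (__uint128_t)sketch_next64() * b;
            low = (uint64_t)m;
        }
    }
    return (uint64_t)(m >> 64);          /* high 64 bits are the result */
}
```

The outer `low < b` test is what keeps the common case free of division: the modulo is computed only when the low half of the product is small enough that rejection is possible.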

Figures (5)

Figure 1: Shuffle timings with Lehmer random number generator
Figure 2: Shuffle timings with PCG random number generator
Figure 3: Shuffle timings with ChaCha random number generator
Figure 4: Speed ratios between shuffle_6 and a conventional unbatched shuffle
Figure 5: Instructions retired per element for arrays of 16384 64-bit elements

Theorems & Definitions (6)

lemma 1
proof
lemma 2
proof
theorem 1
proof
