Realizing Common Random Numbers: Event-Keyed Hashing for Causally Valid Stochastic Models

Vince Buffalo, Carl A. B. Pearson, Daniel Klein

Abstract

Agent-based models (ABMs) are widely used to estimate causal treatment effects via paired counterfactual simulation. A standard variance reduction technique is common random numbers (CRNs), which couples replicates across intervention scenarios by sharing the same random inputs. In practice, CRNs are typically implemented by reusing the same base seed, but this relies on a critical assumption: that the same draw index corresponds to the same modeled event across scenarios. Stateful pseudorandom number generators (PRNGs) violate this assumption whenever interventions alter the simulation's execution path, because any change in control flow shifts the draw index used for all downstream events. We argue that this execution-path-dependent draw indexing is not merely a variance-reduction nuisance but a fundamental mismatch between the scientific causal structure ABMs are intended to encode and the program-level causal structure induced by stateful PRNG implementations. Formalizing this through the lens of structural causal models (SCMs), we show that standard PRNG practices yield causally incoherent paired counterfactual comparisons even when the mechanistic specification is otherwise sound. We show that one remedy is to combine counter-based random number generators (e.g., Philox/Threefry) with event identifiers. This decouples random number generation from simulation execution order by making each random draw an explicit function of the particular modeled event that requested it, restoring the stable event-indexed exogenous structure assumed by SCMs.
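
The event-keyed idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it swaps in SHA-256 from the standard library as a stand-in for a counter-based generator such as Philox or Threefry, and the names `event_uniform` and `simulate_keyed`, the event labels, and the probabilities are all hypothetical choices made for illustration. The point it shows is only that a draw computed from `(seed, event_id)` does not depend on how many other draws happened first.

```python
import hashlib
import struct


def event_uniform(seed: int, event_id: tuple) -> float:
    """Return a uniform draw in [0, 1) that depends only on (seed, event_id).

    Assumption: SHA-256 is used here as a simple stateless stand-in for a
    counter-based RNG (e.g., Philox/Threefry); it is not what the paper uses.
    """
    payload = repr((seed, event_id)).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # Interpret the first 8 bytes as an unsigned 64-bit integer and keep
    # the top 53 bits, which fit exactly in a double-precision mantissa.
    (word,) = struct.unpack(">Q", digest[:8])
    return (word >> 11) / float(1 << 53)


def simulate_keyed(seed: int, vaccinated: bool) -> tuple:
    """Two-person toy model; all parameter values are hypothetical."""
    p1 = 0.0 if vaccinated else 0.9  # intervention changes only person 1's risk
    p2 = 0.5                         # person 2's risk is unaffected

    infected_1 = event_uniform(seed, ("infection", 1)) < p1
    if infected_1:
        # Extra draw that only happens on one execution path.
        onset_day = 1 + int(event_uniform(seed, ("incubation", 1)) * 10)

    # Person 2's draw is keyed by its own event, not by call order.
    infected_2 = event_uniform(seed, ("infection", 2)) < p2
    return infected_1, infected_2
```

Under these assumptions, `simulate_keyed(seed, True)` and `simulate_keyed(seed, False)` always agree on person 2's outcome for a given seed, because the incubation draw for person 1 cannot shift the input used for person 2.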

Paper Structure

This paper contains 18 sections, 2 theorems, 12 equations, 2 figures, 1 table.

Key Result

Proposition 1

Consider an ABM implementation using a stateful PRNG with seed-matched initialization across intervention scenarios $a$ and $a'$. If an event $e$ occurs in both scenarios and the number of PRNG calls (within the relevant random stream, if multiple streams are used) before $e$ differs between scenarios

Figures (2)

  • Figure 1: Intended scientific causal structure. Each infection event has event-indexed exogenous noise ($U_1$, $U_2$) with stable identity across counterfactuals. The vaccination intervention $A$ affects only person 1's risk $p_1$. Downstream scheduling (onset_day) does not create causal paths to unrelated infections. Colors: gray = exogenous inputs/interventions; green = deterministic/bookkeeping/scheduling; blue = model outcomes (infections).
  • Figure 2: Actual program-level causal structure induced by a stateful PRNG. The draw index $K_2$ for person 2 becomes endogenous, depending on whether person 1 was infected ($K_2 = 2 + I_1$). If person 1 gets infected, the draw_incubation call consumes $R_2$, forcing person 2 to use $R_3$; otherwise person 2 uses $R_2$. This creates a spurious causal pathway ($I_1 \to K_2 \to I_2$) absent from the scientific model, violating execution invariance. Colors: gray = exogenous inputs/interventions; green = deterministic/bookkeeping/scheduling; blue = model outcomes (infections).
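
The draw-index shift described in Figure 2 can be reproduced with a short Python sketch. It assumes Python's standard `random.Random` as the stateful PRNG, and the function name `simulate_stateful` and the probability values are hypothetical; only the structure (one infection draw for person 1, a conditional incubation draw, then one infection draw for person 2) follows the figure caption.

```python
import random


def simulate_stateful(seed: int, p1: float, p2: float = 0.5) -> tuple:
    """Toy two-person model driven by a single stateful PRNG stream.

    Returns (I_1, K_2, U_2, I_2), where K_2 is the 1-based index of the
    draw that person 2's infection event ends up consuming.
    """
    rng = random.Random(seed)
    draws_used = 0

    # R_1: person 1's infection draw.
    r1 = rng.random()
    draws_used += 1
    infected_1 = r1 < p1

    if infected_1:
        # draw_incubation: consumes R_2 only on this execution path.
        rng.random()
        draws_used += 1

    # Person 2 receives whichever draw comes next in the stream.
    k2 = draws_used + 1
    u2 = rng.random()
    infected_2 = u2 < p2
    return infected_1, k2, u2, infected_2


# Same seed, two scenarios that differ only in person 1's risk
# (hypothetical values chosen so the execution paths diverge).
untreated = simulate_stateful(seed=42, p1=1.0)
treated = simulate_stateful(seed=42, p1=0.0)
print("untreated: I_1=%s K_2=%d U_2=%.4f" % untreated[:3])
print("treated:   I_1=%s K_2=%d U_2=%.4f" % treated[:3])
```

In this sketch the seed is matched across scenarios, yet person 2's infection event reads a different position in the stream depending on person 1's outcome, which is the relation $K_2 = 2 + I_1$ from the caption.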

Theorems & Definitions (4)

  • Definition 2.1: Execution Invariance
  • Proposition 1: Violation of Execution Invariance
  • Proof: Proof sketch
  • Corollary 3.1