Quasi-Monte Carlo with one categorical variable
Valerie N. P. Ho, Art B. Owen, Zexin Pan
TL;DR
This work advances randomized quasi-Monte Carlo (RQMC) for integrals with a single categorical input by modeling the problem as a mixture over $L$ strata with weights $\alpha_\ell$. It develops rate-aware stratum allocations that oversample smaller mixture components when within-stratum errors decay at an RQMC rate, and it shows that dyadic (power-of-two) sample splits are advantageous for scrambled Sobol' points. The paper provides a minimax justification for near-equal allocations, a forward allocation algorithm under dyadic constraints, and demonstrations on a toy example and a Saint-Venant flood model that oversampling reduces variance. In practice, these results guide the design of RQMC sampling for mixture and importance-sampling problems, particularly when convergence rates are faster than the Monte Carlo rate.
Abstract
We study randomized quasi-Monte Carlo (RQMC) estimation of a multivariate integral where one of the variables takes only a finite number of values. This problem arises when the variable of integration is drawn from a mixture distribution, as is common in importance sampling, and it also arises in some recent work on transport maps. We find that when integration error decreases at an RQMC rate, it is important to oversample the smallest mixture components instead of using a proportional allocation. This can even improve the rate of convergence. The optimal allocations depend on the possibly unknown convergence rate. Designing the sample with an incorrect assumption on the rate still attains that convergence rate, but with an inferior implied constant. The penalty for using a pessimistic rate is typically higher than for using an optimistic one. We also find that for the most accurate RQMC sampling methods, it is advantageous to arrange that our $n=2^m$ randomized Sobol' points split into subsample sizes that are also powers of $2$.
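To illustrate why a faster rate favors oversampling small components, the sketch below works under an assumed variance model that is not stated in the abstract: the within-stratum variance for component $\ell$ is taken to be $\sigma_\ell^2 n_\ell^{-\rho}$, so the mixture estimate has variance $\sum_\ell \alpha_\ell^2 \sigma_\ell^2 n_\ell^{-\rho}$. Minimizing this over real-valued $n_\ell$ with $\sum_\ell n_\ell = n$ gives $n_\ell \propto (\alpha_\ell \sigma_\ell)^{2/(\rho+1)}$, which reduces to proportional allocation at the Monte Carlo rate $\rho=1$ (with equal $\sigma_\ell$) and moves toward equal allocation as $\rho$ grows. The function names, the example weights, and the model itself are hypothetical; the dyadic constraint on subsample sizes is ignored here.

```python
import numpy as np

def rate_aware_allocation(alpha, n, rho, sigma=None):
    """Real-valued n_l minimizing sum_l alpha_l^2 sigma_l^2 n_l^(-rho)
    subject to sum_l n_l = n (assumed variance model, not from the paper)."""
    alpha = np.asarray(alpha, dtype=float)
    sigma = np.ones_like(alpha) if sigma is None else np.asarray(sigma, dtype=float)
    # Lagrange condition: alpha_l^2 sigma_l^2 n_l^(-rho-1) is constant in l.
    w = (alpha * sigma) ** (2.0 / (rho + 1.0))
    return n * w / w.sum()

def model_variance(alpha, n_l, rho, sigma=None):
    """Variance of the mixture estimate under the assumed model."""
    alpha = np.asarray(alpha, dtype=float)
    n_l = np.asarray(n_l, dtype=float)
    sigma = np.ones_like(alpha) if sigma is None else np.asarray(sigma, dtype=float)
    return float(np.sum(alpha**2 * sigma**2 * n_l ** (-rho)))

# Hypothetical mixture weights and total sample size n = 2^10.
alpha = np.array([0.7, 0.2, 0.1])
n = 2**10
proportional = n * alpha

for rho in (1.0, 2.0, 3.0):
    alloc = rate_aware_allocation(alpha, n, rho)
    ratio = model_variance(alpha, proportional, rho) / model_variance(alpha, alloc, rho)
    print(f"rho={rho}: allocation={np.round(alloc, 1)}, "
          f"proportional/rate-aware variance ratio={ratio:.3f}")
```

Under this model, the share given to the smallest component exceeds its weight $\alpha_\ell$ whenever $\rho > 1$, which matches the qualitative message above. A practical design would still need to round these real-valued sizes to integers, or to powers of $2$ for scrambled Sobol' points.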
