Table of Contents
Fetching ...

Predicting the statistical error of analog particle tracing Monte Carlo

Vince Maes, Ignace Bossuyt, Hannes Vandecasteele, Wouter Dekeyser, Julian Koellermeier, Martine Baelmans, Giovanni Samaey

TL;DR

This paper develops a priori, theory-based variance predictors for QoIs estimated on histograms in analog particle tracing Monte Carlo for high-dimensional linear kinetic equations. By interpreting histogram bin visits as a (correlated) binomial experiment, the authors derive upper bounds and exact/approximate variance expressions under independent, Markov, and hidden Markov processes, and then specialize these to QoI estimators (point and analog). The main contributions are: (i) a rigorous framework connecting particle dynamics to correlated binomial variance, (ii) cheap Markov-based variance predictors that accurately capture variance magnitude for analog estimators, (iii) a detailed treatment of sources, sinks, and stationary sources, and (iv) numerical validation in a 1D neutral-plasma kinetic model with open-source code. The results enable a priori determination of particle numbers and informed method selection, with extensions to higher dimensions and potential applications to hybrid and nonlinear Monte Carlo schemes in plasma and related kinetic problems.

Abstract

Large particle systems are often described by high-dimensional (linear) kinetic equations that are simulated using Monte Carlo methods for which the asymptotic convergence rate is independent of the dimensionality. Even though the asymptotic convergence rate is known, predicting the actual value of the statistical error remains a challenging problem. In this paper, we show how the statistical error of an analog particle tracing Monte Carlo method can be calculated (expensive) and predicted a priori (cheap) when estimating quantities of interest (QoI) on a histogram. We consider two types of QoI estimators: point estimators for which each particle provides one independent contribution to the QoI estimates, and analog estimators for which each particle provides multiple correlated contributions to the QoI estimates. The developed statistical error predictors can be applied to other QoI estimators and nonanalog simulation routines as well. The error analysis is based on interpreting the number of particle visits to a histogram bin as the result of a (correlated) binomial experiment. The resulting expressions can be used to optimize (non)analog particle tracing Monte Carlo methods and hybrid simulation methods involving a Monte Carlo component, as well as to select an optimal particle tracing Monte Carlo method from several available options. Additionally, the cheap statistical error predictors can be used to determine a priori the number of particles N that is needed to reach a desired accuracy. We illustrate the theory using a linear kinetic equation describing neutral particles in the plasma edge of a fusion device and show numerical results. The code used to perform the numerical experiments is openly available.

Predicting the statistical error of analog particle tracing Monte Carlo

TL;DR

This paper develops a priori, theory-based variance predictors for QoIs estimated on histograms in analog particle tracing Monte Carlo for high-dimensional linear kinetic equations. By interpreting histogram bin visits as a (correlated) binomial experiment, the authors derive upper bounds and exact/approximate variance expressions under independent, Markov, and hidden Markov processes, and then specialize these to QoI estimators (point and analog). The main contributions are: (i) a rigorous framework connecting particle dynamics to correlated binomial variance, (ii) cheap Markov-based variance predictors that accurately capture variance magnitude for analog estimators, (iii) a detailed treatment of sources, sinks, and stationary sources, and (iv) numerical validation in a 1D neutral-plasma kinetic model with open-source code. The results enable a priori determination of particle numbers and informed method selection, with extensions to higher dimensions and potential applications to hybrid and nonlinear Monte Carlo schemes in plasma and related kinetic problems.

Abstract

Large particle systems are often described by high-dimensional (linear) kinetic equations that are simulated using Monte Carlo methods for which the asymptotic convergence rate is independent of the dimensionality. Even though the asymptotic convergence rate is known, predicting the actual value of the statistical error remains a challenging problem. In this paper, we show how the statistical error of an analog particle tracing Monte Carlo method can be calculated (expensive) and predicted a priori (cheap) when estimating quantities of interest (QoI) on a histogram. We consider two types of QoI estimators: point estimators for which each particle provides one independent contribution to the QoI estimates, and analog estimators for which each particle provides multiple correlated contributions to the QoI estimates. The developed statistical error predictors can be applied to other QoI estimators and nonanalog simulation routines as well. The error analysis is based on interpreting the number of particle visits to a histogram bin as the result of a (correlated) binomial experiment. The resulting expressions can be used to optimize (non)analog particle tracing Monte Carlo methods and hybrid simulation methods involving a Monte Carlo component, as well as to select an optimal particle tracing Monte Carlo method from several available options. Additionally, the cheap statistical error predictors can be used to determine a priori the number of particles N that is needed to reach a desired accuracy. We illustrate the theory using a linear kinetic equation describing neutral particles in the plasma edge of a fusion device and show numerical results. The code used to perform the numerical experiments is openly available.
Paper Structure (37 sections, 105 equations, 7 figures)

This paper contains 37 sections, 105 equations, 7 figures.

Figures (7)

  • Figure 1: Particle paths (bottom) and corresponding histograms counting the number of collisions happening inside each bin (top). Left: three realizations with $u_p = 0$, $\sigma_p^2 = 1$, $R_i = 0$, $R_{cx} = 1$. Right: three realizations with $u_p = 0$, $\sigma_p^2 = 1$, $R_i = 0$, $R_{cx} = 25$. The red dots indicate the initial particle positions. All simulations are terminated after 100 collisions.
  • Figure 2: Markov dependence based variance expression for large $L$\ref{['eq:binomial_variance_markovdependence_klotz_largeN']} divided by $L$ for $0 \leq p_j \leq 1$ and $0 \leq \lambda_j \leq 0.9$. For large values of $\lambda_j$, transitions between success and failure become rare events, inflating the variance. The blue line represents the variance of a binomial experiment with $\lambda_j = p_j$, i.e., independent Bernoulli trials \ref{['eq:binomial_variance_independent_stationary']} divided by $L$. The red line represents the feasibility border \ref{['eq:Klotz_feasibility_border']}. Left: top view. Right: 3D view.
  • Figure 3: Variance divided by $L$ for $L=10^3$ Bernoulli trials. Left: hydrodynamic scaling. Middle: temperature scaling. Right: diffusive scaling. The empirical variance (blue) is calculated based on $10^4$ realizations. The upper bound (red) is given by \ref{['eq:binomial_variance_simple_upperbound']}. The expression for independent Bernoulli trials (yellow) is given by \ref{['eq:binomial_variance_independent_stationary']}. The used Markov dependence approximation (green) is given by \ref{['eq:binomial_variance_markovdependence_klotz']}. The variance expression based on a discretized hidden Markov process (purple) is explained in Section \ref{['sec:HMP']}.
  • Figure 4: Analog particle tracing Monte Carlo with point estimator. Particles are generated from a random source that is uniform in time, there is no ionization sink ($R_i=0$). Top to bottom: variance on density, momentum and energy. Left to right: hydrodynamic, temperature and diffusive scaling.
  • Figure 5: Analog particle tracing Monte Carlo with point estimator. Particles are generated from a random source that is uniform in time, there is an ionization sink with $R_i = 1$. Top to bottom: variance on density, momentum and energy. Left to right: hydrodynamic, temperature and diffusive scaling.
  • ...and 2 more figures