Active learning-based variance reduction for Monte Carlo simulations: A feasibility study for the nanodosimetry around a gold nanoparticle
Leo Thomas, Miriam Schwarze, Hans Rabus
TL;DR
This paper tackles the high computational cost of nanodosimetric Monte Carlo simulations around a gold nanoparticle (NP) by introducing a data-driven variance-reduction strategy. It develops an active-learning framework that uses a Gaussian Process Sampler to iteratively optimize an importance distribution $q(b)$ over impact parameters $b$, guided by a loss that combines the Wasserstein-1 distance with a regularization term. The method is coupled to Geant4 via a TCP interface to obtain $F_4$ cluster dose tallies and their shell-mean values, enabling efficient estimation of the $F_4$ cluster dose as a function of radius. Results show substantial efficiency gains and reasonable agreement with reference data near the NP, demonstrating proof-of-principle viability for ill-posed sampling problems in nanodosimetry, with clear paths for generalization and automation in AI-assisted MC workflows.
Abstract
Objective: This work presents a data-driven, importance-sampling-based variance reduction (VR) scheme inspired by active learning. The method is applied to the estimation of an optimal impact-parameter distribution in the calculation of ionization clusters around a gold nanoparticle (NP). In this setting, such an optimal importance distribution cannot be inferred from first principles. Approach: An iterative optimization procedure is set up that uses a Gaussian Process Sampler to propose sampling distributions based on a loss function. The loss is constructed from appropriate heuristics. The optimization code obtains estimates of the number of ionization clusters in shells around the NP by interfacing with a Geant4 simulation via a dedicated Transmission Control Protocol (TCP) interface. Main results: It is shown that the impact-parameter distribution derived in this way clearly outperforms sampling according to the actual, uniform irradiation. The results resemble those obtained with other VR schemes but still slightly overestimate background contributions. Significance: Although the work is a proof of principle, it provides a novel way of estimating importance distributions in ill-posed scenarios. The TCP interface described here is a simple and efficient way to expose compiled Geant4 code to other scripts, written, for example, in Python.
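To illustrate the importance-sampling idea behind the scheme, the following minimal sketch draws impact parameters $b$ from a piecewise-constant distribution $q(b)$ instead of the uniform-irradiation density $p(b) = 2b/R^2$ and corrects each history with the weight $p(b)/q(b)$. Everything specific here is an assumption made for illustration: the toy tally `toy_tally`, the disk radius `R`, the decay length `R0`, and the bin edges and probabilities are hypothetical and stand in for the Geant4 tallies and the Gaussian-process-proposed distribution of the actual work.

```python
import numpy as np

R = 100.0   # hypothetical beam radius (arbitrary units)
R0 = 5.0    # hypothetical decay length of the toy tally

def p_uniform_disk(b, radius):
    """Density of the impact parameter b for uniform irradiation of a disk."""
    return 2.0 * b / radius**2

def toy_tally(b):
    """Hypothetical stand-in for a simulated tally: large only at small b."""
    return np.exp(-b / R0)

def sample_piecewise_constant(rng, edges, probs, n):
    """Draw n samples from a histogram-shaped q(b); return samples and q(b)."""
    idx = rng.choice(len(probs), size=n, p=probs)
    lo, hi = edges[idx], edges[idx + 1]
    b = rng.uniform(lo, hi)
    q = probs[idx] / (hi - lo)
    return b, q

def mean_and_stderr(values):
    """Sample mean and its standard error."""
    return values.mean(), values.std(ddof=1) / np.sqrt(len(values))

rng = np.random.default_rng(0)
n = 20_000

# Analog case: b distributed according to p(b), all weights equal to one.
b_analog = R * np.sqrt(rng.uniform(size=n))
est_analog, se_analog = mean_and_stderr(toy_tally(b_analog))

# Importance-sampled case: most histories start close to the axis.
edges = np.array([0.0, 10.0, 30.0, R])   # hypothetical bin edges
probs = np.array([0.6, 0.3, 0.1])        # hypothetical bin probabilities
b_is, q_is = sample_piecewise_constant(rng, edges, probs, n)
weights = p_uniform_disk(b_is, R) / q_is
est_is, se_is = mean_and_stderr(weights * toy_tally(b_is))

# Closed-form expectation of the toy tally under p(b), used as a check.
exact = 2.0 / R**2 * (R0**2 - R0 * (R0 + R) * np.exp(-R / R0))

print(f"exact      : {exact:.6f}")
print(f"analog     : {est_analog:.6f} +/- {se_analog:.6f}")
print(f"importance : {est_is:.6f} +/- {se_is:.6f}")
```

Both estimators target the same expectation value; the weighted one reaches a smaller standard error for the same number of histories because $q(b)$ places more samples where the toy tally is large. In the work itself, the shape of $q(b)$ is not fixed by hand as above but proposed iteratively by the optimizer.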
