LeStrat-Net: Lebesgue style stratification for Monte Carlo simulations powered by machine learning

Kayoung Ban, Myeonghun Park, Raymundo Ramos

TL;DR

This work introduces a Lebesgue-style stratification for Monte Carlo integration that partitions the domain according to isocontours of the integrand, enabling regions of arbitrary shape and size. A neural network acts as the central divider, learning region boundaries and estimating region volumes to drive variance reduction and efficient sampling, supplemented by an iterative training loop that refines divisions as more data become available. The approach is demonstrated on a suite of test functions and extended to scattering-event generation, showing performance comparable to established methods while enabling targeted sampling of high-contribution regions and complex cancellation patterns. The framework promises practical gains for computationally intensive high-energy physics simulations by moving the most challenging tasks (region boundary discovery and volume estimation) to fast, learned models, with clear pathways for extension to more regions and broader event-generation workflows.

Abstract

We develop a machine learning algorithm that rethinks stratification in Monte Carlo sampling. We divide the domain of the integrand according to the height of the function being sampled, similar to what is done in Lebesgue integration. The isocontours of the function therefore define regions that can take any shape, depending on the behavior of the function. We take advantage of the capacity of neural networks to learn complicated functions in order to predict these divisions and preclassify large samples of the domain space. From this preclassification we can select the required number of points for tasks such as variance reduction, integration, and even event selection. The trained network ultimately defines the regions and is also used to calculate the multi-dimensional volume of each region.
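The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a unit hypercube domain and, in place of the trained network, classifies points by evaluating the integrand directly; the function name `lebesgue_stratified_integral` and its parameters (`limits`, `n_pool`, `n_per_region`) are hypothetical. Region volumes are estimated from the fraction of a large uniform pool that falls in each region, and a fixed number of points is then selected per region.

```python
import numpy as np

def lebesgue_stratified_integral(f, dim, limits, n_pool=200_000,
                                 n_per_region=2_000, seed=0):
    """Stratify the unit hypercube by the height of f and integrate.

    `limits` are increasing function values that separate the regions.
    Assumption: the true f stands in for the classifier network.
    """
    rng = np.random.default_rng(seed)
    pool = rng.random((n_pool, dim))        # large uniform sample of the domain
    region = np.digitize(f(pool), limits)   # region index of each pool point
    total, variance = 0.0, 0.0
    for j in range(len(limits) + 1):
        members = pool[region == j]
        if len(members) == 0:
            continue
        # Volume of region j as a fraction of the unit hypercube
        volume = len(members) / n_pool
        # Select a fixed number of points from the preclassified pool
        size = min(n_per_region, len(members))
        chosen = members[rng.choice(len(members), size=size, replace=False)]
        values = f(chosen)
        total += volume * values.mean()
        if size > 1:
            variance += volume**2 * values.var(ddof=1) / size
    # The quoted error ignores the uncertainty of the volume estimates
    return total, np.sqrt(variance)

def peak(x):
    """Narrow Gaussian peak centered in the unit square (test integrand)."""
    return np.exp(-50.0 * np.sum((x - 0.5) ** 2, axis=1))

estimate, error = lebesgue_stratified_integral(peak, 2, [0.01, 0.1, 0.3, 0.6])
print(estimate, error)
```

Because each region spans a narrow range of function values, the within-region variance is small even when the region has an irregular shape. In the paper's setting the classification step would be done by the network rather than by evaluating the integrand on the whole pool.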

Paper Structure

This paper contains 24 sections, 40 equations, 17 figures.

Figures (17)

  • Figure 1: The main purpose of the neural network is to help us classify the points into their corresponding regions. Here, $\mathcal{F}$ is the activation function. In the multilabel approach, for the point $\{\vec{x}\}$, if $f(\vec{x})$ lies above the isocontour assigned to a particular value of $j$, it is assigned a positive label, and if it is below, it receives a negative label (tanh activation function). If a sigmoid activation function is chosen, a label of zero is assigned instead of a negative label.
  • Figure 2: Example of a two-cones function with a 2-dimensional base divided into 6 regions labeled from 0 to 5. The bases of the cones are centered at (2.5, 2.5) and (7.5, 7.5) with a maximum height of 2.5. Regions labeled 1 to 5 correspond to regions limited by $\{0, 0.5, 1.0, 1.5, 2.0, 2.5\}$ while the region labeled 0 is for $f_\text{cones} = 0$. The panel on the left shows color-coded regions and limits projected on the space of the two-dimensional base. The panel on the right shows the cones in 3-dimensional space, with darker lines to visualize how the limits are applied on $f_\text{cones}$. Note that this example is for visualization. See the text for a detailed description of a test with a 6-dimensional base divided into 12 regions.
  • Figure 3: Evolution of two metrics, accuracy (top row) and average jumping between regions (bottom row), for the combination of softmax output activation function and categorical cross-entropy (CCE) loss. In the text this is described as the multiclass approach. The training is done for 2000 epochs. The panels on the left show metrics on the training set while panels on the right show metrics on the validation set. For CCE without modification only the best out of 10 trainings is shown (black), while 10 trainings with several modifications (described in the text) are shown in gray. None of the proposed modifications brings a considerable improvement of the metrics.
  • Figure 4: Evolution of two metrics, accuracy (top row) and average jumping between regions (bottom row), for the combination of sigmoid output activation function and binary cross-entropy (BCE) loss. In the text this is described as the multilabel approach. The training is done for 2000 epochs. The panels on the left show metrics on the training set while panels on the right show metrics on the validation set. For BCE without modification only the best out of 10 trainings is shown (gray), while 10 trainings for the two best performing modifications are shown in red and blue. None of the proposed modifications brings a considerable improvement of the metrics.
  • Figure 5: Evolution of two metrics, accuracy (top row) and average jumping between regions (bottom row), for the combination of tanh output activation function and squared hinge (SH) loss. In the text this is described as the multilabel approach. The training is done for 2000 epochs. The panels on the left show metrics on the training set while panels on the right show metrics on the validation set. For SH without modification only the best out of 10 trainings is shown (gray), while 10 trainings for the two best performing modifications are shown in red and blue. None of the proposed modifications brings a considerable improvement of the metrics.
  • ...and 12 more figures
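
The region assignment of Figure 2 and the multilabel encoding of Figure 1 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the captions give the cone centers, the maximum height, and the limits, but not the base radius, so `RADIUS = 2.5` is an assumption, as is the choice to include the upper limit in each region. The helper names `f_cones`, `region_index`, and `multilabel_targets` are hypothetical, and the encoding reads "above the isocontour assigned to $j$" as "region index greater than $j$".

```python
import numpy as np

CENTERS = np.array([[2.5, 2.5], [7.5, 7.5]])   # cone centers from Figure 2
HEIGHT = 2.5                                   # maximum height from Figure 2
RADIUS = 2.5                                   # assumption: not given in the caption
LIMITS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])

def f_cones(x):
    """Two-cones function on a 2-dimensional base."""
    x = np.atleast_2d(x)
    # Distance of every point to each cone center, shape (n_points, 2)
    dist = np.linalg.norm(x[:, None, :] - CENTERS[None, :, :], axis=2)
    heights = HEIGHT * (1.0 - dist / RADIUS)
    return np.clip(heights, 0.0, None).max(axis=1)

def region_index(values):
    """Region 0 where f = 0; region j for LIMITS[j-1] < f <= LIMITS[j]."""
    return np.digitize(values, LIMITS, right=True)

def multilabel_targets(regions, n_regions=6, negative=-1.0):
    """One label per isocontour: positive above it, `negative` below it.

    Use negative=-1.0 for a tanh output and negative=0.0 for a sigmoid output.
    """
    thresholds = np.arange(n_regions - 1)
    return np.where(regions[:, None] > thresholds[None, :], 1.0, negative)

points = np.array([[2.5, 2.5], [0.0, 0.0], [7.5, 6.25]])
regions = region_index(f_cones(points))
print(regions)
print(multilabel_targets(regions))                 # tanh-style labels
print(multilabel_targets(regions, negative=0.0))   # sigmoid-style labels
```

With this encoding the region index of a point equals the number of positive labels, so a prediction can be turned back into a region by counting outputs above the decision threshold.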