MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning

Sofya Dymchenko, Abhishek Purandare, Bruno Raffin

TL;DR

This paper introduces Breed, a new active learning method that improves the data efficiency of on-line surrogate training. It uses Adaptive Multiple Importance Sampling, guided by training loss statistics, to focus NN training on the difficult areas of the parameter space.

Abstract

Artificial intelligence is transforming scientific computing with deep neural network surrogates that approximate solutions to partial differential equations (PDEs). Traditional off-line training methods face issues with storage and I/O efficiency, as the training dataset has to be computed with numerical solvers up-front. Our previous work, the Melissa framework, addresses these problems by enabling data to be created "on-the-fly" and streamed directly into the training process. In this paper we introduce a new active learning method to enhance data efficiency for on-line surrogate training. The surrogate is direct and multi-parametric, i.e., it is trained to predict a given timestep directly for different initial and boundary condition parameters. Our approach uses Adaptive Multiple Importance Sampling, guided by training loss statistics, to focus NN training on the difficult areas of the parameter space. Preliminary results for the 2D heat PDE demonstrate the potential of this method, called Breed, to improve the generalization capabilities of surrogates while reducing computational overhead.

Paper Structure

This paper contains 18 sections, 11 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: A visual presentation of the sampling algorithm. Starting from $N=7$ initial locations, the locations are weighted to build the Gaussian mixture proposal, and $K=10$ new samples are drawn from this distribution. Finally, 30% of these points are discarded and replaced with uniformly sampled points to maintain exploration capabilities.
  • Figure 2: The server's communication with the launcher for the input parameters update mechanism. The number of simulations to run is defined by the budget $n$.
  • Figure 3: Experimental study over hyperparameters. The changing parameter is indicated in each legend box. The training curve is averaged with a moving window of 40 iterations (dotted line) for visibility. The Y-axis is a logarithmic scale, and it is shared across all plots. Values presented near the curves are the last validation loss values.
  • Figure 4: Input parameter deviation histogram obtained from one run of 800 input parameters. On the left (a), comparison per source of point (whether uniform or proposal) for one Breed run; on the right (b), comparison of two runs (Random and Breed). The mean is plotted to better see the distribution shift.
  • Figure 5: The Melissa framework architecture, consisting of a launcher (orchestrates the process through the cluster's batch scheduler), client jobs (where simulations are executed), and a server (trains the NN and manages reservoirs).
  • ...and 1 more figure
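
The sampling step described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation and it omits the adaptive reweighting of full Adaptive Multiple Importance Sampling. It assumes that mixture weights are proportional to a per-location loss value, that each Gaussian component is isotropic with a fixed width `sigma` (relative to the parameter range), and that out-of-range samples are clipped to the bounds. The function name `sample_breed_like`, its parameters, and the 2D unit-square parameter space are all hypothetical choices made for illustration.

```python
import numpy as np

def sample_breed_like(locations, losses, k, bounds, sigma=0.05,
                      uniform_frac=0.3, rng=None):
    """Hypothetical sketch: draw k points from a loss-weighted Gaussian
    mixture, then replace a fraction of them with uniform draws."""
    rng = np.random.default_rng() if rng is None else rng
    locations = np.asarray(locations, dtype=float)  # shape (N, d)
    losses = np.asarray(losses, dtype=float)        # shape (N,), non-negative
    low = np.asarray(bounds[0], dtype=float)        # shape (d,)
    high = np.asarray(bounds[1], dtype=float)       # shape (d,)
    dim = locations.shape[1]

    # Assumption: component weights are proportional to the observed loss,
    # so high-loss locations are proposed more often.
    weights = losses / losses.sum()

    # Sample from the mixture: choose a component, then add Gaussian noise.
    component = rng.choice(len(locations), size=k, p=weights)
    noise = sigma * (high - low) * rng.standard_normal((k, dim))
    samples = np.clip(locations[component] + noise, low, high)

    # Replace a fraction of the proposals with uniform points (exploration).
    n_uniform = int(round(uniform_frac * k))
    replaced = rng.choice(k, size=n_uniform, replace=False)
    samples[replaced] = rng.uniform(low, high, size=(n_uniform, dim))

    is_uniform = np.zeros(k, dtype=bool)
    is_uniform[replaced] = True
    return samples, is_uniform

# Usage with the sizes from the Figure 1 caption: N=7 locations, K=10 samples.
rng = np.random.default_rng(0)
locs = rng.uniform(0.0, 1.0, size=(7, 2))   # made-up 2D parameter locations
losses = rng.uniform(0.1, 1.0, size=7)      # made-up loss statistics
new_points, is_uniform = sample_breed_like(
    locs, losses, k=10, bounds=([0.0, 0.0], [1.0, 1.0]), rng=rng
)
```

The returned boolean mask records which points came from the uniform replacement, which mirrors the per-source comparison (uniform or proposal) mentioned in the Figure 4 caption.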