PDE foundation model-accelerated inverse estimation of system parameters in inertial confinement fusion

Mahindra Rautela, Alexander Scheinker, Bradley Love, Diane Oyen, Nathan DeBardeleben, Earl Lawrence, Ayan Biswas

TL;DR

Fine-tuning from pretrained MORPH weights outperforms training the same architecture from scratch, demonstrating that foundation-model initialization improves sample efficiency for data-limited inverse problems in ICF.

Abstract

PDE foundation models are typically pretrained on large, diverse corpora of PDE datasets and can be adapted to new settings with limited task-specific data. However, most downstream evaluations focus on forward problems, such as autoregressive rollout prediction. In this work, we study an inverse problem in inertial confinement fusion (ICF): estimating system parameters (inputs) from multi-modal, snapshot-style observations (outputs). Using the open JAG benchmark, which provides hyperspectral X-ray images and scalar observables per simulation, we fine-tune the MORPH PDE foundation model and train a lightweight task-specific head to jointly reconstruct hyperspectral images and regress system parameters. The fine-tuned model achieves accurate hyperspectral reconstruction (test MSE 1.2e-3) and strong parameter-estimation performance (up to $R^2=0.995$). Data-scaling experiments (5% to 100% of the training set) show consistent improvements in both reconstruction and regression losses as the amount of training data increases, with the largest marginal gains in the low-data regime. Finally, fine-tuning from pretrained MORPH weights outperforms training the same architecture from scratch, demonstrating that foundation-model initialization improves sample efficiency for data-limited inverse problems in ICF.
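
The joint setup described in the abstract (a backbone trained for reconstruction plus a lightweight head trained for parameter regression) can be sketched roughly as follows. This is a minimal illustration in PyTorch, not the authors' implementation: `StandInBackbone` is a hypothetical toy autoencoder standing in for MORPH, `TaskHead` is a hypothetical dense head, and the 16x16 image size, layer widths, learning rates, and random tensors are all assumptions. It also assumes the regression gradient flows back into the backbone, which the text does not specify.

```python
import torch
from torch import nn

torch.manual_seed(0)


class StandInBackbone(nn.Module):
    """Hypothetical toy autoencoder; a placeholder for the pretrained model."""

    def __init__(self, channels=4, size=16, latent=32):
        super().__init__()
        self.image_shape = (channels, size, size)
        n_pix = channels * size * size
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(n_pix, latent), nn.GELU())
        self.decoder = nn.Linear(latent, n_pix)

    def forward(self, images):
        latent = self.encoder(images)
        recon = self.decoder(latent).view(-1, *self.image_shape)
        return recon, latent


class TaskHead(nn.Module):
    """Hypothetical dense head: latent features + 15 scalars -> 5 parameters."""

    def __init__(self, latent=32, n_scalars=15, n_params=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent + n_scalars, 64), nn.GELU(), nn.Linear(64, n_params)
        )

    def forward(self, latent, scalars):
        return self.net(torch.cat([latent, scalars], dim=1))


backbone, head = StandInBackbone(), TaskHead()

# Independent optimizers and learning-rate schedulers, one pair per module.
opt_backbone = torch.optim.AdamW(backbone.parameters(), lr=1e-4)
opt_head = torch.optim.AdamW(head.parameters(), lr=1e-3)
sched_backbone = torch.optim.lr_scheduler.CosineAnnealingLR(opt_backbone, T_max=5)
sched_head = torch.optim.lr_scheduler.CosineAnnealingLR(opt_head, T_max=5)
mse = nn.MSELoss()

# Random stand-in batch (not JAG data): images, scalar diagnostics, parameters.
images = torch.randn(8, 4, 16, 16)
scalars = torch.randn(8, 15)
params = torch.randn(8, 5)

history = []
for epoch in range(5):
    recon, latent = backbone(images)
    pred = head(latent, scalars)
    recon_loss = mse(recon, images)   # reconstruction objective
    reg_loss = mse(pred, params)      # parameter-regression objective

    opt_backbone.zero_grad()
    opt_head.zero_grad()
    # Assumption: both losses are backpropagated together through the shared graph.
    (recon_loss + reg_loss).backward()
    opt_backbone.step()
    opt_head.step()
    sched_backbone.step()
    sched_head.step()
    history.append((recon_loss.item(), reg_loss.item()))
```

To mimic the from-scratch baseline in this sketch, one would simply skip loading pretrained weights into the backbone; loading real MORPH weights is outside what this toy example covers.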

Paper Structure (12 sections, 6 figures)

Figures (6)

  • Figure 1: Schematic illustrating how the latent representation of the PDE foundation model (MORPH) is coupled with a lightweight task-specific head (TSH). Hyperspectral images are provided as input to the foundation model, which is fine-tuned for reconstruction. The transformer-block outputs are passed to the TSH, a dense neural network. In parallel, the TSH also ingests 15 scalar observables/diagnostics and is trained to predict a 5D parameter output. The foundation model and TSH are trained jointly end-to-end, using separate loss functions with independent optimizers and learning-rate schedulers.
  • Figure 2: Ridge-regression sensitivity map for predicting the five parameters from multi-modal features. Columns show standardized image PCA scores (PC1–PC32) and scalar diagnostics (scalar0–scalar14), separated by the vertical line; colors denote signed coefficient magnitude. The strongest dependencies are scalar-driven for param1 and param2, while param0 and param3 exhibit near-zero coefficients, indicating poor identifiability.
  • Figure 3: Reconstruction results on four held-out test samples (a-d). For each example, the top row shows the ground-truth four-channel hyperspectral image, and the bottom row shows the corresponding reconstructed (predicted) image. Across the full test set, the recorded MSE between true and reconstructed images is 0.0012.
  • Figure 4: Parameter estimation results on the test set. Scatter plots compare ground-truth and predicted values for the three parameters shown. Agreement is strong for all three: Param1 attains $R^2=0.975$ with $L_2=0.035$, Param2 attains $R^2=0.995$ with $L_2=0.013$, and Param4 attains $R^2=0.990$ with $L_2=0.022$.
  • Figure 5: Data-scaling study showing training and validation loss curves as the fraction of available training data is varied (5%, 10%, 25%, 50%, 75%, and 100%). (a) MORPH reconstruction training loss, (b) task-specific head (TSH) parameter-regression training loss, (c) MORPH reconstruction validation loss, and (d) TSH parameter-regression validation loss, all plotted versus epochs on log-log axes.
  • ...and 1 more figures
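
The kind of ridge-regression sensitivity map described in the Figure 2 caption can be sketched as below. This is a rough illustration under stated assumptions, not the authors' analysis: the data are synthetic stand-ins (not JAG), the dependence of parameters 1 and 2 on particular scalars is planted by hand so the map has visible structure, and the regularization strength `alpha` is a hypothetical choice. Only NumPy is used; PCA scores come from an SVD of the centered images and the ridge coefficients from the closed-form solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumed shapes): flattened images, 15 scalars, 5 parameters.
n, n_pix, n_scalars, n_params, n_pcs = 200, 4 * 8 * 8, 15, 5, 32
images = rng.normal(size=(n, n_pix))
scalars = rng.normal(size=(n, n_scalars))
params = 0.05 * rng.normal(size=(n, n_params))  # mostly noise by construction
params[:, 1] += 2.0 * scalars[:, 0]             # planted dependence on scalar0
params[:, 2] += -1.5 * scalars[:, 3]            # planted dependence on scalar3


def standardize(a):
    """Zero-mean, unit-variance scaling per column."""
    return (a - a.mean(axis=0)) / (a.std(axis=0) + 1e-12)


# Image PCA scores: project centered images onto the leading right singular vectors.
centered = images - images.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
pc_scores = centered @ vt[:n_pcs].T  # shape (n, 32)

# Multi-modal feature matrix: 32 standardized PC scores followed by 15 scalars.
features = np.hstack([standardize(pc_scores), standardize(scalars)])
targets = standardize(params)

# Closed-form ridge solution: (X^T X + alpha I)^{-1} X^T Y, one row per parameter.
alpha = 1.0  # hypothetical regularization strength
gram = features.T @ features + alpha * np.eye(features.shape[1])
coef = np.linalg.solve(gram, features.T @ targets).T  # shape (5, 47)

# Column index of the strongest feature for each parameter (>= 32 means a scalar).
strongest = np.abs(coef).argmax(axis=1)
```

Plotting `coef` as a heatmap with a diverging colormap, and a vertical line after column 32, would give a map in the same layout as the caption describes. Rows whose coefficients all stay near zero correspond to parameters that the linear model cannot identify from these features.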