Stabilizing Test-Time Adaptation of High-Dimensional Simulation Surrogates via D-Optimal Statistics

Anna Zimmel; Paul Setinek; Gianluca Galletti; Johannes Brandstetter; Werner Zellinger

TL;DR

The paper tackles distribution shifts in neural surrogates for high-dimensional PDE simulations by proposing SATTS, a test-time adaptation framework that uses D-optimal latent statistics to stabilize adaptation across regression and generation tasks. SATTS integrates three pillars (feature alignment, source knowledge preservation, and unsupervised parameter tuning via density-ratio-based importance-weighted validation, IWV) and achieves consistent zero-shot improvements of up to about 7% on SIMSHIFT and EngiBench with minimal overhead. The approach remains stable where prior TTA methods can fail, and it highlights practical potential for physics-driven, zero-shot adaptation in industrial design and analysis. This work lays groundwork for physics-informed TTA and uncertainty-aware strategies in high-dimensional simulation contexts.

Abstract

Machine learning surrogates are increasingly used in engineering to accelerate costly simulations, yet distribution shifts between training and deployment often cause severe performance degradation (e.g., unseen geometries or configurations). Test-Time Adaptation (TTA) can mitigate such shifts, but existing methods are largely developed for lower-dimensional classification with structured outputs and visually aligned input-output relationships, making them unstable for the high-dimensional, unstructured and regression problems common in simulation. We address this challenge by proposing a TTA framework based on storing maximally informative (D-optimal) statistics, which jointly enables stable adaptation and principled parameter selection at test time. When applied to pretrained simulation surrogates, our method yields up to 7% out-of-distribution improvements at negligible computational cost. To the best of our knowledge, this is the first systematic demonstration of effective TTA for high-dimensional simulation regression and generative design optimization, validated on the SIMSHIFT and EngiBench benchmarks.
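
The Figure 1 caption describes the adaptation step as KL-based feature alignment against statistics stored from the source domain. The sketch below illustrates that general idea under strong simplifying assumptions: features are summarized by a per-dimension mean and variance (a diagonal Gaussian), which is not the paper's D-optimal statistic, and the function names `feature_stats` and `diag_gaussian_kl` are hypothetical. It is meant only to show how a KL term can be computed from stored summaries without access to source data.

```python
import math
from statistics import fmean, pvariance


def feature_stats(features):
    """Per-dimension mean and population variance of a batch of feature vectors.

    Assumption: a diagonal-Gaussian summary stands in for the stored
    source statistics; the paper's D-optimal statistics may differ.
    """
    columns = list(zip(*features))
    means = [fmean(c) for c in columns]
    variances = [pvariance(c) for c in columns]
    return means, variances


def diag_gaussian_kl(mu_t, var_t, mu_s, var_s, eps=1e-8):
    """KL( N(mu_t, var_t) || N(mu_s, var_s) ) for diagonal Gaussians.

    eps is added to every variance to avoid division by zero.
    """
    kl = 0.0
    for mt, vt, ms, vs in zip(mu_t, var_t, mu_s, var_s):
        vt, vs = vt + eps, vs + eps
        kl += 0.5 * (math.log(vs / vt) + (vt + (mt - ms) ** 2) / vs - 1.0)
    return kl


# Hypothetical usage: source statistics are computed once and stored,
# then compared against statistics of a test-time feature batch.
source_mu, source_var = feature_stats([[0.0, 0.0], [2.0, 4.0]])
target_mu, target_var = feature_stats([[1.0, 1.0], [3.0, 5.0]])
alignment_loss = diag_gaussian_kl(target_mu, target_var, source_mu, source_var)
```

In a test-time adaptation loop, a quantity like `alignment_loss` would be minimized with respect to the parameters of the representation learner; the sketch only evaluates it for fixed features.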

Paper Structure (17 sections, 10 equations, 6 figures, 8 tables, 1 algorithm)

Introduction
Related Work
Problem
Method
Maximally informative statistics
SATTS
Experiments
Datasets
Neural Surrogates for Simulation: SIMSHIFT
Generative Design Optimization: EngiBench
Ablations
Conclusion and Future Work
Supplementary Approach Information
TTA Training
Experimental Setup
...and 2 more sections

Figures (6)

Figure 1: Our method applied to the hot rolling task from setinek2025simshift. (a) Pre-training on the source domain with fixed input parameters: thickness ($\tau$), post-rolling reduction ($r$), and temperature coefficients ($\lambda_a$, $\lambda_b$). The representation learner $\phi$ and the predictor $g$ are optimized, and maximally informative (D-optimal) statistics are computed. (b) Test-time adaptation of $\phi$ without source data, using the D-optimal statistics to realize three TTA pillars: adaptation (KL-based feature alignment), source knowledge preservation (statistics-based regularization), and parameter tuning (importance-weighted validation).
Figure 2: Comparison of Equivalent Plastic Strain (PEEQ) predictions on a hot rolling sample. The top row shows the Ground Truth (GT), the unadapted Source model, and the SATTS results; the bottom row shows the absolute residuals $|\text{GT} - \text{Source}|$ and $|\text{GT} - \text{SATTS}|$.
Figure 3: Relative performance improvements of SATTS and the Oracle (lower bound for model selection) compared to the Source model, measured by RMSE.
Figure 4: Comparison of 2D beam topology results based on ground truth, source prediction, and SATTS. The heatmaps illustrate material density $\rho \in [0, 1]$. SATTS shows stronger alignment with the original design, resulting in more robust design outcomes.
Figure 5: t-SNE visualization of the conditioner's latent space on the structural beam bending dataset. While overhang_constraint and forcedist are either constant or almost uniformly distributed, volfrac and rmin exhibit a clear structure.
...and 1 more figure
