A Three-Stage Bayesian Transfer Learning Framework to Improve Predictions in Data-Scarce Domains

Aidan Furlong, Robert Salko, Xingang Zhao, Xu Wu

TL;DR

The paper tackles data scarcity and domain shift in engineering predictions by proposing a three-stage transfer framework (staged B-DANN) that fuses parameter transfer, domain-adversarial alignment, and Bayesian fine-tuning to yield calibrated uncertainties. Stage 1 pretrains a deterministic feature extractor on the source domain, Stage 2 performs domain alignment via a DANN-like adversarial objective with a frozen regression head to stabilize learning, and Stage 3 replaces the deterministic parts with a Bayesian neural network for target-domain fine-tuning and uncertainty quantification. Across a synthetic benchmark and a critical heat flux (CHF) prediction task in rectangular channels, staged B-DANN consistently outperforms From-Scratch and Direct Transfer, while providing well-calibrated uncertainty estimates and improved run-to-run stability. The method offers a practical path for reliable predictions in data-scarce, safety-critical engineering settings and can extend to other domains facing covariate and conditional shifts.

Abstract

The use of ML in engineering has grown steadily to support a wide array of applications. Among these methods, deep neural networks have been widely adopted due to their performance and accessibility, but they require large, high-quality datasets. Experimental data are often sparse, noisy, or insufficient to build resilient data-driven models. Transfer learning, which leverages relevant data-abundant source domains to assist learning in data-scarce target domains, has shown efficacy. Parameter transfer, where pretrained weights are reused, is common but degrades under large domain shifts. Domain-adversarial neural networks (DANNs) help address this issue by learning domain-invariant representations, thereby improving transfer under greater domain shifts in a semi-supervised setting. However, DANNs can be unstable during training and lack a native means for uncertainty quantification. This study introduces a fully-supervised three-stage framework, the staged Bayesian domain-adversarial neural network (staged B-DANN), that combines parameter transfer and shared latent space adaptation. In Stage 1, a deterministic feature extractor is trained on the source domain. This feature extractor is then adversarially refined using a DANN in Stage 2. In Stage 3, a Bayesian neural network is built on the adapted feature extractor for fine-tuning on the target domain to handle conditional shifts and yield calibrated uncertainty estimates. This staged B-DANN approach was first validated on a synthetic benchmark, where it was shown to significantly outperform standard transfer techniques. It was then applied to the task of predicting critical heat flux in rectangular channels, leveraging data from tube experiments as the source domain. The results of this study show that the staged B-DANN method can improve predictive accuracy and generalization, potentially assisting other domains in nuclear engineering.
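The adversarial refinement in Stage 2 can be illustrated with a deliberately tiny sketch of the standard DANN gradient-reversal idea. This is not the paper's architecture: it assumes a scalar feature extractor `f = w * x` and a logistic domain classifier with weight `v`, and the names `stage2_step`, `lam`, and `lr` are hypothetical. The frozen regression head is omitted because, per the TL;DR and the Figure 1 note, it contributes no training gradient in this stage.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss_and_grads(w, v, x, d):
    """Binary cross-entropy of a logistic domain classifier on f = w * x.

    d is the domain label (assumed convention: 1.0 = source, 0.0 = target).
    Returns (loss, dloss/dw, dloss/dv).
    """
    f = w * x                      # toy "feature extractor"
    p = sigmoid(v * f)             # toy "domain classifier"
    loss = -(d * math.log(p) + (1.0 - d) * math.log(1.0 - p))
    dlogit = p - d                 # derivative of BCE w.r.t. the logit
    return loss, dlogit * v * x, dlogit * f


def stage2_step(w, v, x, d, lr=0.1, lam=1.0):
    """One adversarial update with gradient reversal (toy, assumed form)."""
    loss, grad_w, grad_v = domain_loss_and_grads(w, v, x, d)
    v_new = v - lr * grad_v            # classifier descends the domain loss
    w_new = w - lr * (-lam * grad_w)   # extractor receives the reversed gradient
    return w_new, v_new, loss


w, v = 1.0, 1.0
w_new, v_new, loss = stage2_step(w, v, x=1.0, d=1.0)

# With the classifier held fixed, the extractor's move makes the domain
# harder to identify, i.e. the domain loss goes up.
loss_after_extractor_move, _, _ = domain_loss_and_grads(w_new, v, 1.0, 1.0)
```

The sign flip on `grad_w` is the whole mechanism: the classifier is trained to tell the domains apart while the extractor is pushed toward features on which it cannot. In a real network the same effect is usually obtained with a layer that is the identity in the forward pass and negates the gradient in the backward pass.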

Paper Structure

This paper contains 22 sections, 14 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Training workflow for the three-stage TL approach. Note that the stage two $\hat{y}_S$ output is only used for diagnostic purposes and does not affect training gradients.
  • Figure 2: Comparison of prediction parity in the 75-point and 500-point training data groups.
  • Figure 3: Direct comparison of each ML-based method's $\mu_{\text{error}}$ values with $\pm$95% CIs, both from the aggregated 20-seed ensemble.
  • Figure 4: Analysis of uncertainty estimates produced by the 500-point staged B-DANN approach via uncertainty calibration and distribution.
  • Figure 5: Overview of the hybrid model workflow in training configuration.
  • ...and 4 more figures
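
The uncertainty calibration mentioned in the Figure 4 caption can be sketched as an empirical coverage check. The summary does not say how the paper computes calibration, so the following is only one common approach under stated assumptions: each prediction is treated as Gaussian with a predictive mean and standard deviation, and the function name `empirical_coverage` and the toy data are hypothetical.

```python
from statistics import NormalDist


def empirical_coverage(y_true, mean, std, levels=(0.5, 0.8, 0.95)):
    """Fraction of targets inside the central Gaussian interval at each level.

    For a well-calibrated model, the observed fraction at nominal level p
    should be close to p (given enough test points).
    """
    result = {}
    for p in levels:
        # Half-width of the central p-interval in units of std.
        z = NormalDist().inv_cdf(0.5 + p / 2.0)
        hits = sum(abs(y - m) <= z * s for y, m, s in zip(y_true, mean, std))
        result[p] = hits / len(y_true)
    return result


# Toy data (assumed, not from the paper): four predictions, all N(0, 1).
y_true = [0.5, -0.5, 1.5, 3.0]
mean = [0.0, 0.0, 0.0, 0.0]
std = [1.0, 1.0, 1.0, 1.0]

coverage = empirical_coverage(y_true, mean, std, levels=(0.5, 0.95))
```

Plotting observed coverage against the nominal level gives a calibration curve; points below the diagonal indicate overconfident intervals, points above indicate overly wide ones.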