Towards Overcoming Data Scarcity in Nuclear Energy: A Study on Critical Heat Flux with Physics-consistent Conditional Diffusion Model

Farah Alsafadi, Alexandra Akins, Xu Wu

TL;DR

This work addresses critical heat flux (CHF) data scarcity in nuclear energy by training a diffusion model (DM) to augment data and a physics-aware conditional diffusion model (CDM) to generate CHF under specified thermal-hydraulic (TH) conditions. Both models are trained on the public CHF dataset, with $P$, $G$, $D$, $L$, and $x$ as inputs and CHF as the target. The vanilla DM learns the joint distribution of the six variables and reproduces both the marginal distributions and the pairwise correlations of the real data. The CDM enables targeted CHF generation over $T$ denoising steps, with uncertainty quantified from multiple noise realizations, and achieves $R^2\approx0.98$ and a mean absolute relative error of about $6.0$–$6.9\%$ on held-out CHF data. Physical consistency is validated by comparing the outlet quality $x$ derived from energy-balance equations across measured, calculated, and generated data, which shows close agreement and small absolute errors. Overall, the work provides a data-augmentation framework that preserves physical relationships in CHF and offers a pathway to more reliable ML in nuclear safety analyses. Future directions include transfer learning to extend beyond the training domain.
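The two accuracy metrics quoted above, and the idea of summarizing several noise realizations per condition, can be illustrated with a minimal sketch. It assumes the standard definitions of $R^2$ and mean absolute relative error; the function names and the array layout (one row per noise realization) are hypothetical and not taken from the paper, and the numbers in the demo are made up for illustration only.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def mean_abs_rel_error(y_true, y_pred):
    """Mean of |y_pred - y_true| / |y_true| (multiply by 100 for percent)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_pred - y_true) / np.abs(y_true))


def summarize_realizations(samples):
    """Collapse generated CHF samples of shape (n_realizations, n_points)
    into a per-point mean (the prediction) and a per-point sample
    standard deviation (a simple uncertainty estimate)."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), samples.std(axis=0, ddof=1)


if __name__ == "__main__":
    # Illustrative values only (NOT from the paper): three held-out CHF
    # points and four hypothetical generated realizations per point.
    chf_true = np.array([1500.0, 2400.0, 3100.0])
    chf_generated = np.array([
        [1450.0, 2500.0, 3000.0],
        [1550.0, 2350.0, 3150.0],
        [1480.0, 2450.0, 3200.0],
        [1520.0, 2380.0, 3050.0],
    ])
    chf_mean, chf_std = summarize_realizations(chf_generated)
    print("R^2:", r2_score(chf_true, chf_mean))
    print("MARE [%]:", 100.0 * mean_abs_rel_error(chf_true, chf_mean))
    print("per-point std:", chf_std)
```

Averaging over realizations before scoring is one possible design choice; the metrics could equally be computed per realization and then averaged.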

Abstract

Deep generative modeling provides a powerful pathway to overcome data scarcity in energy-related applications where experimental data are often limited, costly, or difficult to obtain. By learning the underlying probability distribution of the training dataset, deep generative models, such as the diffusion model (DM), can generate high-fidelity synthetic samples that statistically resemble the training data. Such synthetic data generation can significantly enrich the size and diversity of the available training data and, more importantly, improve the robustness of downstream machine learning models in predictive tasks. The objective of this paper is to investigate the effectiveness of DMs for overcoming data scarcity in nuclear energy applications. By leveraging a public dataset on critical heat flux (CHF) that covers a wide range of commercial nuclear reactor operational conditions, we developed a DM that can generate an arbitrary number of synthetic samples for augmenting the CHF dataset. Since a vanilla DM can only generate samples randomly, we also developed a conditional DM capable of generating targeted CHF data under user-specified thermal-hydraulic conditions. The performance of the DMs was evaluated based on their ability to capture empirical feature distributions and pairwise correlations, as well as to maintain physical consistency. The results showed that both the DM and the conditional DM can successfully generate realistic and physics-consistent CHF data. Furthermore, uncertainty quantification was performed to establish confidence in the generated data. The results demonstrated that the conditional DM is highly effective in augmenting CHF data while maintaining acceptable levels of uncertainty.
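To make the diffusion process mentioned in the abstract more concrete, the sketch below shows the forward (noising) step of a standard denoising diffusion probabilistic model, $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$. This is the generic textbook formulation, not necessarily the exact configuration used in the paper: the linear noise schedule, its endpoints, the choice $T = 1000$, and all function names are assumptions made for illustration. Each row of `x0` stands for one standardized six-variable sample ($P$, $G$, $D$, $L$, $x$, CHF), filled here with random placeholder values.

```python
import numpy as np


def linear_beta_schedule(T, beta_start=1e-4, beta_end=2e-2):
    """Assumed linear noise schedule; the endpoints are common defaults,
    not values reported in the paper."""
    return np.linspace(beta_start, beta_end, T)


def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form.

    Returns the noised sample and the Gaussian noise that was added,
    which is the regression target for a noise-prediction network."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps


T = 1000  # assumed number of diffusion steps
betas = linear_beta_schedule(T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

rng = np.random.default_rng(0)
# Placeholder batch: 4 samples x 6 standardized variables (P, G, D, L, x, CHF).
x0 = rng.standard_normal((4, 6))

x_early, _ = forward_noise(x0, 0, alpha_bar, rng)      # almost unchanged
x_late, _ = forward_noise(x0, T - 1, alpha_bar, rng)   # close to pure noise
```

A conditional variant would additionally feed the TH conditions to the noise-prediction network at every step; that part is omitted here because the network architecture is not described in this section.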

Paper Structure

This paper contains 14 sections, 9 equations, 14 figures, 3 tables.

Figures (14)

  • Figure 1: Illustration of the diffusion and generation processes of a DM.
  • Figure 2: The distributions and correlations of the TH parameters and CHF values in the NRC CHF dataset.
  • Figure 3: Flowchart outlining the process of CDM training.
  • Figure 4: Comparison of the TH parameters and CHF distributions between the real and DM-generated data.
  • Figure 5: Comparison of the CHF-TH-parameter pairwise correlations between the real and DM-generated data.
  • ...and 9 more figures