Machine Learning in Proton Exchange Membrane Water Electrolysis -- Part I: A Knowledge-Integrated Framework

Xia Chen, Alexander Rex, Janis Woelke, Christoph Eckert, Boris Bensmann, Richard Hanke-Rauschenbach, Philipp Geyer

TL;DR

This work proposes a Knowledge-integrated Machine Learning framework for PEMWE, introducing the Ladder of Knowledge-integrated ML to fuse data-driven methods with domain knowledge across three levels: interpolation, extrapolation, and representation. Through Level 1 data augmentation and STL decomposition, Level 2 physics-informed modeling (including PINNs) and partial knowledge integration, and Level 3 symbolic knowledge discovery via phi-SO, the authors demonstrate improved predictive accuracy and interpretability in degradation forecasting of PEMWE cells. The approach addresses data and modeling uncertainties, offers generalizable insights for engineering, and highlights the potential of symbolic discovery to reveal physics-based relationships from data. Collectively, this framework aims to accelerate PEMWE development and broader AI-for-science applications in complex engineering systems.

Abstract

In this study, we propose to adopt a novel framework, Knowledge-integrated Machine Learning, for advancing Proton Exchange Membrane Water Electrolysis (PEMWE) development. Given the significance of PEMWE in green hydrogen production and the inherent challenges in optimizing its performance, our framework aims to meld data-driven models with domain-specific insights systematically to address the domain challenges. We first identify the uncertainties originating from data acquisition conditions, data-driven model mechanisms, and domain expertise, highlighting their complementary characteristics in carrying information from different perspectives. Building upon this foundation, we showcase how to adeptly decompose knowledge and extract unique information to contribute to the data augmentation, modeling process, and knowledge discovery. We demonstrate a hierarchical three-level framework, termed the "Ladder of Knowledge-integrated Machine Learning", in the PEMWE context, applying it to three case studies within a context of cell degradation analysis to affirm its efficacy in interpolation, extrapolation, and information representation. This research lays the groundwork for more knowledge-informed enhancements in ML applications in engineering.

Paper Structure (20 sections, 4 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: The manuscript skeleton of Knowledge-integrated Machine Learning in PEMWE development
  • Figure 2: Three perspectives of knowledge-based decomposition in an example of degradation forecasting in PEMWE. Three kinds of knowledge are embedded: domain know-how, scientific/mathematical methodologies, and different system complexities/scales, each with exemplary subtopics such as the degradation type or the single cell behavior. This knowledge helps decompose the energy demand time series and yields more information for understanding degradation.
  • Figure 3: The Ladder of Knowledge-integrated Machine Learning [chen2023pathway]. The three levels in the pathway state their difference and core ability, linking to their typical methods and characteristic descriptions. Level 1 - Interpolation: domain knowledge is embedded in data augmentation and feature engineering to achieve better performance for ML methods; Level 2 - Extrapolation: domain knowledge is incorporated into the data-driven modeling process to enable informed predictions beyond the observation range of the training data; Level 3 - Representation: a knowledge discovery or learning mechanism is incorporated into the model to transform effective information concisely. Each higher level retains the abilities of the levels below it.
  • Figure 4: Level 1 case: STL Decomposition of the activation loss curve for the test set prediction. (a) Standard ML prediction: ML model directly predicts the activation losses using time duration, cell voltage and current density as input features; (b) Knowledge-integrated ML prediction (STL without residual): activation losses are decomposed into multiplicative trend, seasonal, and residual components using prior knowledge. The ML model trains on and predicts the trend, which is then multiplied by the known seasonal pattern.
  • Figure 5: Level 2 case: Physics-Informed Neural Network for activation loss forecasting with partial knowledge. (a) Vanilla ANN organization. (b) Physics-informed network organization: integrates embedded prior knowledge and consists of two networks, the PINN and the Correction Model. The physics-informed network is trained with a loss function that compares prior-knowledge results with prediction outcomes. The outcomes of both models are combined using weighted sums to produce the final training output, where $\alpha$ adjusts the output weight between the PINN and Correction Model in the loss function.
  • ...and 2 more figures
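
The Level 1 idea in the Figure 4 caption (train on the trend only, then multiply by the known seasonal pattern) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the data are synthetic and noise-free, the seasonal pattern is a hypothetical sinusoid passed in as `seasonal_fn`, the known pattern is divided out directly instead of running an actual STL decomposition, and a least-squares line stands in for the ML model.

```python
import numpy as np


def fit_trend(t, trend):
    """Fit a least-squares line; a stand-in for the ML trend model (assumption)."""
    slope, intercept = np.polyfit(t, trend, deg=1)
    return slope, intercept


def predict_with_known_seasonal(t_train, y_train, t_test, seasonal_fn):
    """Multiplicative scheme y = trend * seasonal, with the residual ignored."""
    # Divide out the known seasonal pattern so the model only sees the trend.
    trend_train = y_train / seasonal_fn(t_train)
    slope, intercept = fit_trend(t_train, trend_train)
    # Extrapolate the trend, then re-apply the known seasonal pattern.
    trend_pred = slope * t_test + intercept
    return trend_pred * seasonal_fn(t_test)


def seasonal(t, period=24.0, amplitude=0.05):
    """Hypothetical known seasonal factor oscillating around 1."""
    return 1.0 + amplitude * np.sin(2.0 * np.pi * t / period)


# Synthetic "activation loss" series: slow linear drift times the seasonal factor.
t = np.arange(0.0, 240.0)
y = (0.30 + 1e-4 * t) * seasonal(t)

t_train, t_test = t[:192], t[192:]
y_train, y_test = y[:192], y[192:]

y_pred = predict_with_known_seasonal(t_train, y_train, t_test, seasonal)
max_abs_error = float(np.max(np.abs(y_pred - y_test)))
print(f"max abs error on held-out points: {max_abs_error:.2e}")
```

Because the synthetic series follows the multiplicative form exactly, the held-out error is at the level of floating-point rounding. With measured data the residual component and the quality of the trend model would determine the error.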