Physics-Informed Neural Networks with Hard Linear Equality Constraints

Hao Chen, Gonzalo E. Constante Flores, Can Li

TL;DR

The paper introduces KKT-hPINN, a physics-informed neural network that guarantees hard linear equality constraints by embedding two non-trainable projection layers derived from Karush-Kuhn-Tucker conditions. This approach solves a small quadratic program to project predictions onto the feasible constraint set, ensuring exact feasibility during both training and inference without extra hyperparameters or post-processing. Across three Aspen Plus–based case studies (CSTR unit, DME-DEE plant, extractive distillation subsystem), KKT-hPINN consistently achieves lower RMSE and near-zero constraint violations compared to unconstrained neural networks and soft-constraint PINNs, demonstrating improved accuracy and robustness, even with reduced training data. The work highlights the method’s applicability as a high-fidelity, physically consistent surrogate modeling tool for process systems engineering, with potential to mitigate error propagation from constraint violations in large-scale integrations.
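
The projection step described above can be written out as a worked equation. This summary does not reproduce the paper's own formulas, so what follows is a sketch of the standard closed form for orthogonal projection onto linear equality constraints, not a quotation. It assumes $\mathbf{B}$ has full row rank, so that $\mathbf{B}\mathbf{B}^\top$ is invertible. The multiplier $\boldsymbol{\lambda}$ is introduced here for illustration; $\mathbf{A}^*$, $\mathbf{B}^*$ and $\mathbf{b}^*$ follow the naming in the Figure 3 caption.

$$
\begin{aligned}
\mathbf{\tilde{y}} &= \arg\min_{\mathbf{y}} \; \tfrac{1}{2}\,\lVert \mathbf{y} - \mathbf{\hat{y}} \rVert_2^2
\quad \text{s.t.} \quad \mathbf{A}\mathbf{\hat{x}} + \mathbf{B}\mathbf{y} = \mathbf{b} \\[4pt]
\text{KKT conditions:}\quad & \mathbf{\tilde{y}} - \mathbf{\hat{y}} + \mathbf{B}^\top \boldsymbol{\lambda} = \mathbf{0},
\qquad \mathbf{A}\mathbf{\hat{x}} + \mathbf{B}\mathbf{\tilde{y}} = \mathbf{b} \\[4pt]
\boldsymbol{\lambda} &= (\mathbf{B}\mathbf{B}^\top)^{-1}\,(\mathbf{A}\mathbf{\hat{x}} + \mathbf{B}\mathbf{\hat{y}} - \mathbf{b}) \\[4pt]
\mathbf{\tilde{y}} &= \mathbf{A}^*\mathbf{\hat{x}} + \mathbf{B}^*\mathbf{\hat{y}} + \mathbf{b}^* \\[4pt]
\mathbf{A}^* &= -\mathbf{B}^\top(\mathbf{B}\mathbf{B}^\top)^{-1}\mathbf{A}, \qquad
\mathbf{B}^* = \mathbf{I} - \mathbf{B}^\top(\mathbf{B}\mathbf{B}^\top)^{-1}\mathbf{B}, \qquad
\mathbf{b}^* = \mathbf{B}^\top(\mathbf{B}\mathbf{B}^\top)^{-1}\mathbf{b}
\end{aligned}
$$

Substituting $\mathbf{\tilde{y}}$ back into the constraint gives $\mathbf{B}\mathbf{\tilde{y}} = \mathbf{B}\mathbf{\hat{y}} - (\mathbf{A}\mathbf{\hat{x}} + \mathbf{B}\mathbf{\hat{y}} - \mathbf{b}) = \mathbf{b} - \mathbf{A}\mathbf{\hat{x}}$, so the constraint holds exactly. Because $\mathbf{A}^*$, $\mathbf{B}^*$ and $\mathbf{b}^*$ depend only on the known constraint data, they can be computed once and kept fixed, which is consistent with the description of the projection layers as non-trainable.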

Abstract

Surrogate modeling is used to replace computationally expensive simulations. Neural networks have been widely applied as surrogate models that enable efficient evaluations over complex physical systems. Despite this, neural networks are data-driven models and devoid of any physics. The incorporation of physics into neural networks can improve generalization and data efficiency. The physics-informed neural network (PINN) is an approach to leverage known physical constraints present in the data, but it cannot strictly satisfy them in the predictions. This work proposes a novel physics-informed neural network, KKT-hPINN, which rigorously guarantees hard linear equality constraints through projection layers derived from KKT conditions. Numerical experiments on Aspen models of a continuous stirred-tank reactor (CSTR) unit, an extractive distillation subsystem, and a chemical plant demonstrate that this model can further enhance the prediction accuracy.

Paper Structure

This paper contains 19 sections, 1 theorem, 7 equations, 11 figures, 6 tables.

Key Result

Theorem 1

Given any input $\mathbf{\hat{x}} \in \mathcal{R}^{N_0}$, a neural network model $\mathbf{\hat{y}} = \mathrm{NN} (\mathbf{\Theta}, \mathbf{\hat{x}}): \mathcal{R}^{N_0} \rightarrow \mathcal{R}^{N_L}$, and prior knowledge about $m$ equality constraints for the input $\mathbf{\hat{x}}$ and the ground t where

Figures (11)

  • Figure 1: Fully connected feed-forward neural network architectures. The neuron $i$ at the $(l)$th layer is the linear combination of neurons at the $(l-1)$th layer followed by a nonlinear activation $\sigma$, which is $z^{(l)}_i = \sigma(\sum_j w^{(l-1)}_{ij} z^{(l-1)}_j + b^{(l-1)}_{i0})$. The solid red lines represent the weights $w_{ij}^{(l-1)}$ and the bias $b_{i0}^{(l-1)}$.
  • Figure 2: Illustration: for a given input $\mathbf{\hat{x}} \in \mathcal{R}^{N_0}$, neural network prediction $\mathbf{\hat{y}} \in \mathcal{R}^{N_L}$ is orthogonally projected to be $\mathbf{\tilde{y}} \in \mathcal{R}^{N_L}$ that satisfies a system of linear equality constraints $\mathbf{A} \hat{\mathbf{x}} + \mathbf{B y} = \mathbf{b}$. As an illustrative example, a hyperplane is used here to represent a single constraint where $\mathbf{A} \in \mathcal{R}^{1 \times N_0}, \mathbf{B} \in \mathcal{R}^{1 \times N_L}, \mathbf{b} \in \mathcal{R}^1$.
  • Figure 3: Grey block: illustration of NN architectures; Blue block: illustration of KKT-hPINN architectures consisting of trainable layers and two additional non-trainable projection layers (blue layers). The non-trainable parameters $\mathbf{A}^*$, $\mathbf{b}^*$ and $\mathbf{B}^*$ can be explicitly calculated from the KKT solution.
  • Figure 4: Process flowsheet of the CSTR unit.
  • Figure 5: Learning curve of the CSTR surrogate models. Left: training RMSE and validation RMSE (RMSE values for PINN are outside the limits); Right: the magnitude of violation of the linear equality constraints.
  • ...and 6 more figures
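
The architecture sketched in Figures 1 to 3 can be illustrated with a small NumPy example: a feed-forward network produces $\mathbf{\hat{y}}$, and a fixed affine map projects it onto $\mathbf{A}\mathbf{\hat{x}} + \mathbf{B}\mathbf{y} = \mathbf{b}$. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`projection_matrices`, `mlp_forward`, `kkt_project`), the `tanh` activation, the layer sizes, and the random constraint data are all hypothetical. The closed form used is the standard orthogonal-projection result, and it assumes $\mathbf{B}$ has full row rank.

```python
import numpy as np


def projection_matrices(A, B, b):
    """Fixed (non-trainable) projection parameters A*, B*, b*.

    Assumption: B (m x N_L) has full row rank, so B @ B.T is invertible.
    """
    # right_inv = B^T (B B^T)^{-1}, via a linear solve rather than an explicit inverse.
    right_inv = np.linalg.solve(B @ B.T, B).T
    A_star = -right_inv @ A
    B_star = np.eye(B.shape[1]) - right_inv @ B
    b_star = right_inv @ b
    return A_star, B_star, b_star


def mlp_forward(x, weights, biases):
    """Plain feed-forward pass: tanh hidden layers, linear output layer."""
    z = x
    for W, c in zip(weights[:-1], biases[:-1]):
        z = np.tanh(W @ z + c)
    return weights[-1] @ z + biases[-1]


def kkt_project(x, y_hat, A_star, B_star, b_star):
    """Project y_hat onto {y : A x + B y = b} using the precomputed matrices."""
    return A_star @ x + B_star @ y_hat + b_star


# Hypothetical sizes: N_0 = 3 inputs, N_L = 4 outputs, m = 2 constraints.
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 3))
B = rng.normal(size=(2, 4))
b = rng.normal(size=2)

# Untrained random weights stand in for the trainable layers.
weights = [rng.normal(size=(8, 3)), rng.normal(size=(4, 8))]
biases = [rng.normal(size=8), rng.normal(size=4)]

A_star, B_star, b_star = projection_matrices(A, B, b)

x = rng.normal(size=3)
y_hat = mlp_forward(x, weights, biases)                   # unconstrained prediction
y_tilde = kkt_project(x, y_hat, A_star, B_star, b_star)   # projected prediction

print("violation before projection:", np.linalg.norm(A @ x + B @ y_hat - b))
print("violation after projection: ", np.linalg.norm(A @ x + B @ y_tilde - b))
```

Because the projection is an affine map with constant coefficients, it can sit after the last trainable layer during training as well as inference, and gradients pass through it like any other linear layer. In this sketch the violation after projection should be at the level of floating-point round-off, while the unprojected prediction generally violates the constraints.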

Theorems & Definitions (6)

  • Theorem 1
  • proof
  • Remark 1: Applicability to other architectures
  • Remark 2: Compatibility to PINN
  • Remark 3: Difference with post projection
  • Remark 4: Loss functions