A Simultaneous Approach for Training Neural Differential-Algebraic Systems of Equations

Laurens R. Lueg, Victor Alves, Daniel Schicksnus, John R. Kitchin, Carl D. Laird, Lorenz T. Biegler

TL;DR

This work addresses training neural differential-algebraic systems by reframing the problem as a large, fully discretized nonlinear program via orthogonal collocation, enabling simultaneous optimization of neural network parameters and DAE state/auxiliary variables under algebraic constraints. The method introduces a structured initialization strategy and Hessian-aware optimization (notably using an L-BFGS approximation for neural terms) to tackle the computational challenges posed by dense neural components. It demonstrates improved accuracy, generalization, and computational efficiency across diverse case studies, including scenarios with sparse data and unobserved states, and shows tighter constraint satisfaction compared to sequential approaches. The results highlight the potential of neural DAEs for hybrid modeling in process systems engineering and suggest concrete future directions for scalability and robustness, such as parallel decomposition and selective Hessian approximations for the neural components.

Abstract

Scientific machine learning is an emerging field that broadly describes the combination of scientific computing and machine learning to address challenges in science and engineering. Within the context of differential equations, this has produced highly influential methods, such as neural ordinary differential equations (NODEs). Recent works extend this line of research to consider neural differential-algebraic systems of equations (DAEs), where some unknown relationships within the DAE are learned from data. Training neural DAEs, similarly to neural ODEs, is computationally expensive, as it requires the solution of a DAE for every parameter update. Further, the rigorous consideration of algebraic constraints is difficult within common deep learning training algorithms such as stochastic gradient descent. In this work, we apply the simultaneous approach to neural DAE problems, resulting in a fully discretized nonlinear optimization problem, which is solved to local optimality and simultaneously obtains the neural network parameters and the solution to the corresponding DAE. We extend recent work demonstrating the simultaneous approach for neural ODEs, by presenting a general framework to solve neural DAEs, with explicit consideration of hybrid models, where some components of the DAE are known, e.g. physics-informed constraints. Furthermore, we present a general strategy for improving the performance and convergence of the nonlinear programming solver, based on solving an auxiliary problem for initialization and approximating Hessian terms. We achieve promising results in terms of accuracy, model generalizability and computational cost, across different problem settings such as sparse data, unobserved states and multiple trajectories. Lastly, we provide several promising future directions to improve the scalability and robustness of our approach.
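The fully discretized formulation described in the abstract can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' implementation: the learned term is reduced to a single scalar parameter `theta` in a hypothetical right-hand side `rhs` (standing in for a neural network), the discretization uses two Radau collocation points per finite element (one common choice of orthogonal collocation; the scheme is an assumption here), and no nonlinear programming solver is called. The sketch only builds the equality constraints and the data-fit objective that such a solver would receive, with the states and `theta` both treated as decision variables.

```python
import math

# Radau IIA coefficients for 2 collocation points per finite element
# (points located at 1/3 and 1 of the element length).
RADAU_A = ((5.0 / 12.0, -1.0 / 12.0),
           (3.0 / 4.0, 1.0 / 4.0))


def rhs(x, theta):
    """Hypothetical learned right-hand side; stands in for a neural network."""
    return -theta * x


def collocation_residuals(x0, stage_states, theta, h):
    """Equality constraints of the discretized problem.

    stage_states[i][j] is the state at collocation point j of element i.
    In the simultaneous approach, these states and theta are all decision
    variables and every returned residual is constrained to be zero.
    """
    residuals = []
    x_start = x0
    for stages in stage_states:
        f_vals = [rhs(x, theta) for x in stages]
        for j in range(2):
            increment = h * sum(RADAU_A[j][k] * f_vals[k] for k in range(2))
            residuals.append(stages[j] - x_start - increment)
        x_start = stages[-1]  # the last Radau point is the element end point
    return residuals


def data_misfit(stage_states, observations):
    """Sum of squared errors at the element end points (the NLP objective)."""
    return sum((stages[-1] - y) ** 2
               for stages, y in zip(stage_states, observations))


def solve_element(x_start, theta, h):
    """Closed-form solve of the 2x2 collocation system.

    Valid only for the linear toy rhs above; used here to build a point
    that satisfies the constraints without calling an NLP solver.
    """
    m00 = 1.0 + h * theta * RADAU_A[0][0]
    m01 = h * theta * RADAU_A[0][1]
    m10 = h * theta * RADAU_A[1][0]
    m11 = 1.0 + h * theta * RADAU_A[1][1]
    det = m00 * m11 - m01 * m10
    return [x_start * (m11 - m01) / det, x_start * (m00 - m10) / det]


# Build a consistent trajectory for theta_true and use it as synthetic data.
theta_true, h, n_elements = 2.0, 0.1, 10
trajectory, x_start = [], 1.0
for _ in range(n_elements):
    stages = solve_element(x_start, theta_true, h)
    trajectory.append(stages)
    x_start = stages[-1]
observations = [stages[-1] for stages in trajectory]

res_true = collocation_residuals(1.0, trajectory, theta_true, h)
res_wrong = collocation_residuals(1.0, trajectory, 3.0, h)
max_true = max(abs(r) for r in res_true)
max_wrong = max(abs(r) for r in res_wrong)
misfit = data_misfit(trajectory, observations)
final_error = abs(trajectory[-1][-1] - math.exp(-theta_true * h * n_elements))
```

At the consistent point, all residuals vanish to rounding error and the misfit is zero. Keeping the same states but changing `theta` violates the constraints, which is the coupling a solver resolves when it updates states and parameters together. In a DAE setting, the algebraic equations would enter as additional equality residuals at each collocation point.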

Paper Structure

This paper contains 21 sections, 40 equations, 8 figures, 1 table, 1 algorithm.

Figures (8)

  • Figure 1: Tank-Manifold system adapted from koch2024neural.
  • Figure 2: State trajectories of the true system, smooth initialization and hybrid model, with observed data indicated in red (a). Trajectories of the unknown terms (b).
  • Figure 3: State trajectories of the true system and hybrid model on previously unseen height-area profiles (a). Trajectories of the corresponding unknown terms (b). The accuracy of the hybrid model is lower than in the training results of Fig. \ref{fig:4_tank}. However, the approach still produces trajectories that are consistent with the known algebraic constraints (\ref{eq:tank_known_alg}).
  • Figure 4: Smoothed trajectories of the unknown term $z(t)$ (a) and the states $x(t)$ (b) for varying smoothing penalties $\lambda_s$. The observations are shown in red.
  • Figure 5: State trajectories of the hybrid systems learned with and without the path constraint enforcing Lyapunov descent (a). Including the constraint leads to a more accurate model, especially beyond the point where data is observed. The output of the neural map (b) shows a similar improvement when compared with the true model.
  • ...and 3 more figures