A Simultaneous Approach for Training Neural Differential-Algebraic Systems of Equations
Laurens R. Lueg, Victor Alves, Daniel Schicksnus, John R. Kitchin, Carl D. Laird, Lorenz T. Biegler
TL;DR
This work addresses training neural differential-algebraic systems by reframing the problem as a large, fully discretized nonlinear program via orthogonal collocation, enabling simultaneous optimization of neural network parameters and DAE state/auxiliary variables under algebraic constraints. The method introduces a structured initialization strategy and Hessian-aware optimization (notably using an L-BFGS approximation for neural terms) to tackle the computational challenges posed by dense neural components. It demonstrates improved accuracy, generalization, and computational efficiency across diverse case studies, including scenarios with sparse data and unobserved states, and shows tighter constraint satisfaction compared to sequential approaches. The results highlight the potential of neural DAEs for hybrid modeling in process systems engineering and suggest concrete future directions for scalability and robustness, such as parallel decomposition and selective Hessian approximations for the neural components.
Abstract
Scientific machine learning is an emerging field that broadly describes the combination of scientific computing and machine learning to address challenges in science and engineering. Within the context of differential equations, this has produced highly influential methods, such as neural ordinary differential equations (NODEs). Recent works extend this line of research to consider neural differential-algebraic systems of equations (DAEs), where some unknown relationships within the DAE are learned from data. Training neural DAEs, like training neural ODEs, is computationally expensive, as it requires the solution of a DAE for every parameter update. Further, the rigorous consideration of algebraic constraints is difficult within common deep learning training algorithms such as stochastic gradient descent. In this work, we apply the simultaneous approach to neural DAE problems, resulting in a fully discretized nonlinear optimization problem. Solving this problem to local optimality yields the neural network parameters and the solution to the corresponding DAE at the same time. We extend recent work demonstrating the simultaneous approach for neural ODEs by presenting a general framework to solve neural DAEs, with explicit consideration of hybrid models, where some components of the DAE are known, e.g., physics-informed constraints. Furthermore, we present a general strategy for improving the performance and convergence of the nonlinear programming solver, based on solving an auxiliary problem for initialization and approximating Hessian terms. We achieve promising results in terms of accuracy, model generalizability and computational cost across different problem settings, such as sparse data, unobserved states and multiple trajectories. Lastly, we provide several promising future directions to improve the scalability and robustness of our approach.
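To make the idea of full discretization concrete, the following minimal sketch assembles the equality constraints and the data-fitting objective for a toy neural DAE. Everything in it is an illustrative assumption rather than the authors' implementation: the toy system, the one-neuron `neural_term`, and the function names are hypothetical. It uses an implicit Euler step, which coincides with one-point Radau collocation, in place of a higher-order orthogonal collocation scheme.

```python
import math


def neural_term(x, w):
    """Hypothetical one-neuron stand-in for the learned relationship."""
    return w[1] * math.tanh(w[0] * x)


def constraint_residuals(xs, ys, w, x0, h):
    """Residuals of the discretized DAE; all must be zero at a feasible point.

    Toy semi-explicit DAE (an assumption, not taken from the paper):
        dx/dt = -y            (known differential equation)
        0     = y - NN(x; w)  (algebraic equation with a learned term)
    xs[k] and ys[k] are the states at grid point k + 1 on a uniform grid
    with step size h; x0 is the known initial condition.
    """
    residuals = []
    x_prev = x0
    for x_k, y_k in zip(xs, ys):
        # Implicit Euler step, i.e. one-point Radau collocation on this element.
        residuals.append(x_k - x_prev + h * y_k)
        # Algebraic constraint enforced at the same collocation point.
        residuals.append(y_k - neural_term(x_k, w))
        x_prev = x_k
    return residuals


def data_loss(xs, measurements):
    """Sum of squared errors; measurements maps an index into xs to an observed value."""
    return sum((xs[k] - x_obs) ** 2 for k, x_obs in measurements.items())
```

In a simultaneous formulation, a nonlinear programming solver would minimize `data_loss` over `xs`, `ys`, and `w` together, subject to every entry of `constraint_residuals` being zero. The DAE solution and the network weights are then obtained from a single optimization problem rather than from a DAE solve nested inside each parameter update.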
