Training Stiff Neural Ordinary Differential Equations with Explicit Exponential Integration Methods

Colby Fronk, Linda Petzold

TL;DR

The paper addresses the difficulty of training stiff neural ODEs by introducing explicit exponential integration, specifically the integrating factor Euler (IF Euler) method, as a computationally efficient alternative to implicit solvers. It analyzes discretize-then-optimize training, leverages the matrix exponential to stabilize linear stiff dynamics, and demonstrates that IF Euler provides strong stability and efficiency, especially for linear stiff systems and extremely stiff Van der Pol dynamics, albeit at the cost of first-order accuracy. Across four examples, IF Euler outperforms many implicit methods in stability and data-efficiency for linear stiff problems, while showing mixed results for nonlinear cases where higher-order improvements are needed. The findings suggest a promising direction for scalable, differentiable solvers in neural ODEs and potential extensions to stiff PDEs and physics-informed frameworks, with future work focusing on higher-order explicit exponential schemes.

Abstract

Stiff ordinary differential equations (ODEs) are common in many science and engineering fields, but standard neural ODE approaches struggle to accurately learn these stiff systems, posing a significant barrier to widespread adoption of neural ODEs. In our earlier work, we addressed this challenge by utilizing single-step implicit methods for solving stiff neural ODEs. While effective, these implicit methods are computationally costly and can be complex to implement. This paper expands on our earlier work by exploring explicit exponential integration methods as a more efficient alternative. We evaluate the potential of these explicit methods to handle stiff dynamics in neural ODEs, aiming to enhance their applicability to a broader range of scientific and engineering problems. We found the integrating factor Euler (IF Euler) method to excel in stability and efficiency. While implicit schemes failed to train the stiff Van der Pol oscillator, the IF Euler method succeeded, even with large step sizes. However, IF Euler's first-order accuracy limits its use, leaving the development of higher-order methods for stiff neural ODEs an open research problem.
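As a rough illustration of why an integrating factor scheme remains stable where a standard explicit method does not, the sketch below applies one common textbook form of the IF Euler update, $y_{n+1} = e^{\lambda h}\,(y_n + h\,N(y_n))$, to the scalar split $y' = \lambda y + N(y)$. This is a minimal scalar sketch, not the authors' implementation: the function names, the step size `h = 0.01`, and the choice $N \equiv 0$ are assumptions made for the example, and a real neural ODE would replace the scalar exponential with a matrix exponential.

```python
import math

def if_euler_step(y, h, lam, nonlinear):
    """One integrating factor Euler step for y' = lam*y + nonlinear(y).

    The stiff linear part is propagated exactly by exp(lam*h); only the
    nonlinear remainder is treated with a first-order explicit update.
    """
    return math.exp(lam * h) * (y + h * nonlinear(y))

def explicit_euler_step(y, h, lam, nonlinear):
    """One forward Euler step for the same right-hand side, for comparison."""
    return y + h * (lam * y + nonlinear(y))

def integrate(step, y0, h, n_steps, lam, nonlinear):
    """Apply a single-step method n_steps times starting from y0."""
    y = y0
    for _ in range(n_steps):
        y = step(y, h, lam, nonlinear)
    return y

lam = -10000.0          # stiff decay rate, as in the test equation y' = -10000y
h = 0.01                # assumed step size, far outside forward Euler's stability limit
n_steps = 3
no_nonlinearity = lambda y: 0.0   # assumed N(y) = 0 so the exact solution is known

y_if = integrate(if_euler_step, 1.0, h, n_steps, lam, no_nonlinearity)
y_ee = integrate(explicit_euler_step, 1.0, h, n_steps, lam, no_nonlinearity)
y_exact = math.exp(lam * h * n_steps)

print("IF Euler      :", y_if)
print("forward Euler :", y_ee)
print("exact         :", y_exact)
```

With $N \equiv 0$ the IF Euler update reproduces the exact decay for any step size, while forward Euler multiplies the state by $1 + \lambda h$ each step and grows in magnitude whenever $|1 + \lambda h| > 1$. Once a nonzero nonlinear term is present, the scheme is only first-order accurate, which is the limitation the abstract points to.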

Paper Structure

This paper contains 14 sections, 20 equations, 11 figures, 11 tables.

Figures (11)

  • Figure 1: The neural network architecture of $\pi$-net V1 from Ref. PiNetPaper is depicted on the left. On the right, a worked example illustrates how a 1-dimensional input layer with the variable $x$ propagates through the network. Circles marked with the $*$ symbol denote layers where the Hadamard product is applied to the layer's inputs. The boxes labeled $L$ represent standard linear layers without activation functions. This architecture does not utilize typical activation functions like tanh or ReLU, which enhances its interpretability.
  • Figure 2: Comparison of the integration of the deterministic stiff van der Pol oscillator with $\mu=1000$ using two different methods: (a) explicit Runge-Kutta-Fehlberg, which is slow with 422,442 time points and 2,956,574 function evaluations, and (b) implicit Radau IIA 5th order, which is faster with only 857 time points and 7,123 function evaluations.
  • Figure 3: Illustration of (a) Discretize-Optimize and (b) Optimize-Discretize methods. For Discretize-Optimize, black and red lines denote the discretized grid used to perform the optimization. For Optimize-Discretize, blue arrows indicate the forward pass of the neural network, while blue lines depict the backward pass using the adjoint method, illustrating how gradients are computed.
  • Figure 4: For the equation $y'=-10000y$, we plot the training dataset consisting of 200 data points uniformly distributed across the time interval.
  • Figure 5: The fractional relative error of the recovered parameter (expressed as a fraction, not a percentage) is plotted against the number of training data points for the equation $y'=-10000y$. For comparison, we show the explicit exponential integrating factor Euler method alongside a few implicit schemes.
  • ...and 6 more figures
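To make the setup behind Figures 4 and 5 concrete, the sketch below generates 200 uniformly spaced samples of $y'=-10000y$ and recovers the decay rate with a discretize-then-optimize style one-step loss. Several pieces are assumptions for illustration only: the time interval `t_end = 1e-3` and the initial value $y(0)=1$ are not given in the captions, the model is a single scalar parameter rather than the paper's neural network, and the one-step squared loss is minimized in closed form instead of by gradient-based training. The helper name `fit_decay_rate` is hypothetical.

```python
import math

true_lam = -10000.0     # decay rate of the test equation y' = -10000y
n_points = 200          # number of training points, as in Figure 4
t_end = 1e-3            # assumed time interval [0, t_end]; not stated in the caption
h = t_end / (n_points - 1)

# Noise-free training data from the exact solution, assuming y(0) = 1.
ts = [i * h for i in range(n_points)]
ys = [math.exp(true_lam * t) for t in ts]

def fit_decay_rate(ys, h):
    """Fit theta in the one-step model y[n+1] = exp(theta*h) * y[n].

    This is the IF Euler update with a learnable linear coefficient and no
    nonlinear term. The squared one-step loss is quadratic in
    a = exp(theta*h), so its minimizer is a = sum(y[n]*y[n+1]) / sum(y[n]**2).
    """
    num = sum(a * b for a, b in zip(ys[:-1], ys[1:]))
    den = sum(a * a for a in ys[:-1])
    return math.log(num / den) / h

est_lam = fit_decay_rate(ys, h)

# Fractional (non-percentage) relative error of the recovered parameter.
frac_err = abs(est_lam - true_lam) / abs(true_lam)
print("estimated rate:", est_lam)
print("fractional relative error:", frac_err)
```

Because the exponential update is exact for a linear equation, the recovered rate here is limited only by floating-point rounding. The errors reported in Figure 5 come from the paper's own training procedure and are not reproduced by this sketch.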