A Two-Stage Training Method for Modeling Constrained Systems With Neural Networks

C. Coelho; M. Fernanda P. Costa; L. L. Ferrás


TL;DR

The paper tackles the challenge of enforcing physical or domain constraints in Neural ODE models without the need to tune penalty parameters. It introduces a two-stage training framework that decouples feasibility from optimization: the first stage minimizes constraint violations to reach a feasible starting point, and the second stage optimizes the predictive loss within the feasible region. A formal equivalence argument shows that a global solution of the original constrained problem is also a global solution of the two sub-problems. The authors provide a complete algorithm, analyze computational cost and explainability, and report improved constraint satisfaction and predictive accuracy on the World Population Growth and Chemical Reaction datasets, especially under data-sparse conditions. The approach is architecture-agnostic and enhances interpretability by offering a transparent, constraint-guided optimization path, with a preference-point strategy to maintain feasibility during refinement.

Abstract

Real-world systems are often formulated as constrained optimization problems. Techniques to incorporate constraints into Neural Networks (NN), such as Neural Ordinary Differential Equations (Neural ODEs), have been used. However, these techniques introduce hyperparameters that require manual tuning through trial and error, raising doubts about whether the constraints are successfully incorporated into the resulting model. This paper describes in detail the two-stage training method for Neural ODEs, a simple, effective, and penalty-parameter-free approach to modeling constrained systems. In this approach, the constrained optimization problem is rewritten as two unconstrained sub-problems that are solved in two stages. The first stage aims to find feasible NN parameters by minimizing a measure of constraints violation. The second stage aims to find the optimal NN parameters by minimizing the loss function while staying inside the feasible region. We experimentally demonstrate that our method produces models that satisfy the constraints and also improves their predictive performance, thereby ensuring compliance with critical system properties and helping to reduce the amount of data required. Furthermore, we show that the proposed method improves convergence to an optimal solution and improves the explainability of Neural ODE models. Our proposed two-stage training method can be used with any NN architecture.


Paper Structure (26 sections, 2 theorems, 7 equations, 6 figures, 2 tables, 3 algorithms)


Introduction
Background and Related Work
Neural ODEs
Constrained Optimization Problem
Approaches to Model Constrained Systems with Neural ODEs
The Two-stage Method
Computational Cost
Improved Explainability
Algorithm for Neural ODEs
Numerical Experiments
Performance Analysis
World Population Growth
Chemical Reaction
Experimental Convergence Analysis
World Population Growth
...and 11 more sections

Key Result

Theorem 1

Let $\boldsymbol{\theta}^*$ be a global solution to the constrained problem (eq:constrained). Then $\boldsymbol{\theta}^*$ is also a global solution of the unconstrained sub-problems (eq:admissibilitystage) and (eq:optimizationstage).
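For orientation, the kind of problem the theorem refers to can be written in a generic form. This is a sketch under stated assumptions: the symbols $\ell$, $c_i$, $h_j$, $\phi$ and $\mathcal{F}$ are introduced here for illustration only, and the paper's own equations (eq:constrained, eq:admissibilitystage, eq:optimizationstage) are not reproduced in this summary and may be stated differently.

```latex
% Assumed generic constrained problem (notation is illustrative, not the paper's):
\begin{align}
  \min_{\boldsymbol{\theta}} \; & \ell(\boldsymbol{\theta})
  \quad \text{s.t.} \quad
  c_i(\boldsymbol{\theta}) \le 0, \;\; h_j(\boldsymbol{\theta}) = 0 .
\end{align}

% One possible measure of constraints violation:
\begin{align}
  \phi(\boldsymbol{\theta})
  = \sum_i \max\bigl(0,\, c_i(\boldsymbol{\theta})\bigr)^2
  + \sum_j h_j(\boldsymbol{\theta})^2 \;\ge\; 0,
  \qquad
  \mathcal{F} = \{\boldsymbol{\theta} : \phi(\boldsymbol{\theta}) = 0\}.
\end{align}

% Stage 1: minimize \phi.  Stage 2: minimize \ell while remaining in \mathcal{F}.
```

Under these assumptions the intuition behind the statement is short. A global solution $\boldsymbol{\theta}^*$ of the constrained problem is feasible, so $\phi(\boldsymbol{\theta}^*) = 0$, which is the smallest value $\phi$ can take; hence $\boldsymbol{\theta}^*$ globally minimizes the violation measure. It also attains the smallest loss among all points of $\mathcal{F}$ by definition, which is what the second stage seeks.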

Figures (6)

Figure 1: Plots of loss (left) and constraints violation (right), during admissibility stage, for the various tolerance values during training of the models used in experiments 1.0, 2.0 and 3.0.
Figure 2: Plots of loss (left) and constraints violation (right), during admissibility stage, for the various tolerance values during training of the models used in experiments 2.1 and 3.1.
Figure 3: Plots of loss (left) and constraints violation (right), during admissibility stage, for the various tolerance values during training of the models used in experiments 2.2 and 3.2.
Figure 4: Plots of loss (left) and constraints violation (right), during admissibility stage, for the various tolerance values during training of the models used in experiments 1.0, 2.0 and 3.0.
Figure 5: Plots of loss (left) and constraints violation (right), during admissibility stage, for the various tolerance values during training of the models used in experiments 2.1 and 3.1.
...and 1 more figures

Theorems & Definitions (4)

Theorem 1
proof
Theorem 2
proof
