A minimax optimal control approach for robust neural ODEs

Cristina Cipriani, Alessandro Scagliotti, Tobias Wöhrer

TL;DR

The paper addresses adversarial robustness of neural ODEs by formulating training as a minimax optimal control problem and deriving first-order optimality conditions, in the form of Pontryagin's Maximum Principle (PMP), for a finite data batch. It then interprets the PMP conditions as those of an extremal of a smooth, weighted surrogate problem, which motivates a weighted shooting method that adaptively updates the weights assigned to the perturbations. The authors propose and test three weighting schemes (uniform, Gibbs-like, and worst-case) on a 2D classification task; the results indicate that weighting improves stability, while worst-case optimization minimizes the objective but can be volatile. The work provides a control-theoretic foundation for robust training of neural ODEs and suggests weight-update strategies that could scale to higher-dimensional problems, offering a principled alternative to empirical risk minimization in adversarial settings.
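To make the three weighting schemes concrete, here is a minimal Python sketch for a single data point with several perturbed copies. It is an illustration under stated assumptions, not the authors' implementation: the function names, the parameter `beta`, and the softmax form of the Gibbs-like weight are all hypothetical choices. The only property taken from the source is that the weights are non-negative and sum to one over the perturbations.

```python
import math
from typing import List


def uniform_weights(losses: List[float]) -> List[float]:
    """Give every perturbation the same weight 1/N."""
    n = len(losses)
    return [1.0 / n] * n


def gibbs_weights(losses: List[float], beta: float = 1.0) -> List[float]:
    """Assumed Gibbs-like form: weights proportional to exp(beta * loss).

    Subtracting the maximum loss before exponentiating avoids overflow
    and does not change the normalized weights.
    """
    top = max(losses)
    exps = [math.exp(beta * (loss - top)) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


def worst_case_weights(losses: List[float]) -> List[float]:
    """Put all weight on the perturbation with the largest loss."""
    worst = max(range(len(losses)), key=lambda j: losses[j])
    return [1.0 if j == worst else 0.0 for j in range(len(losses))]


def weighted_loss(losses: List[float], weights: List[float]) -> float:
    """Weighted surrogate for one data point: sum_j weights[j] * losses[j]."""
    return sum(w * loss for w, loss in zip(weights, losses))
```

In this sketch, `beta = 0` reduces the Gibbs-like weights to the uniform ones, and a large `beta` concentrates them on the worst perturbation, so the Gibbs-like scheme interpolates between the other two.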

Abstract

In this paper, we address the adversarial training of neural ODEs from a robust control perspective. Adversarial training is an alternative to classical training via empirical risk minimization, and it is widely used to enforce reliable outcomes under input perturbations. Neural ODEs allow the interpretation of deep neural networks as discretizations of control systems, unlocking powerful tools from control theory for the development and the understanding of machine learning. In this specific case, we formulate adversarial training with perturbed data as a minimax optimal control problem, for which we derive first-order optimality conditions in the form of Pontryagin's Maximum Principle. We provide a novel interpretation of robust training leading to an alternative weighted technique, which we test on a low-dimensional classification task.
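
As a reading aid, one plausible way to write a minimax training problem of this kind over a finite batch is sketched below. Every symbol other than $u$, $T$, $M$ and $N$ is an assumption introduced here for illustration: $F$ denotes the controlled vector field of the neural ODE, $\ell$ a terminal loss, $(x_i^0, y_i)$ the data points with their labels, and $\alpha^j$ the admissible perturbations. The paper's own functional may differ in its details.

$$
\min_{u} \; \frac{1}{M} \sum_{i=1}^{M} \max_{j=1,\ldots,N} \ell\big(x_i^j(T), y_i\big)
\quad \text{subject to} \quad
\dot{x}_i^j(t) = F\big(x_i^j(t), u(t)\big), \qquad x_i^j(0) = x_i^0 + \alpha^j .
$$

Under these assumptions, replacing the inner maximum by a convex combination $\sum_{j=1}^{N} \gamma_i^j \, \ell\big(x_i^j(T), y_i\big)$, with $\gamma_i^j \geq 0$ and $\sum_{j=1}^{N} \gamma_i^j = 1$, gives a smooth weighted surrogate of the kind mentioned in the summary.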

Paper Structure (12 sections, 6 theorems, 33 equations, 2 figures, 1 table, 1 algorithm)

Key Result

Theorem 3

Let $(\bar{u}, \bar{X})$ be a strong local minimizer for the minimax optimal control problem related to eq:def_funct. Then, there exist coefficients $(\gamma_i^j)_{i=1,\ldots,M}^{j=1,\ldots,N}$ satisfying $\gamma_i^j \geq 0$ and $\sum_{j=1}^N \gamma_i^j = 1$ for all $i=1,\ldots,M$, and there ex... such that, for a.e. $t \in [0,T]$, it holds that ... Moreover, for every $i=1,\ldots,M$, if ...

Figures (2)

  • Figure 1: Classification level-sets on $[0,1]^2$: the color bar indicates the confidence of prediction of one class (red above the yellow margin) or the other class (blue below the yellow margin).
  • Figure 2: Robustness measure displayed on a semilogarithmic scale.

Theorems & Definitions (15)

  • Definition 1
  • Theorem 3: PMP for minimax
  • Remark 4
  • Remark 5
  • Remark 6
  • Lemma 7: Result from book_convex, Lemma 2.1.1 (Part D)
  • Lemma 8
  • proof
  • Definition 9
  • Proposition 10
  • ...and 5 more