DeNOTS: Stable Deep Neural ODEs for Time Series

Ilya Kuleshov; Evgenia Romanenkova; Vladislav Zhuzhel; Galina Boeva; Evgeni Vorsin; Alexey Zaytsev


TL;DR

DeNOTS addresses the limited expressiveness of Neural CDEs by introducing time scaling to increase the effective depth without inflating weight norms. It further stabilizes long integrations via a novel Anti-NF mechanism, while preserving memory and avoiding the forgetting issues seen in Sync-NF. The authors provide ISS stability and interpolation-error bounds, including GP-based tight bounds for cubic-spline interpolation, and demonstrate up to 20% improvements on four open time-series datasets compared to Neural CDEs, Neural RDEs, and State-Space Models. The approach offers a practical way to harness continuous-time models for irregular time series, combining expressiveness, stability, and robustness in a single framework. Reproducibility is facilitated by publicly available code and thorough experimental documentation.
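The link between time scaling and depth can be illustrated with a small sketch. It is a toy illustration under stated assumptions, not the authors' implementation: it uses a fixed-step RK4 solver (the paper may rely on adaptive solvers), and `rk4_integrate`, `toy_field`, the base horizon `T`, and the step size are hypothetical names and values chosen for the example. `D` plays the role of the time scale mentioned in the figure captions.

```python
import numpy as np

def rk4_integrate(f, h0, t_end, step=0.1):
    """Fixed-step RK4 from t=0 to t_end; returns the final state and the NFE count."""
    n_steps = int(round(t_end / step))
    h = np.asarray(h0, dtype=float)
    t, nfe = 0.0, 0
    for _ in range(n_steps):
        k1 = f(t, h)
        k2 = f(t + step / 2, h + step / 2 * k1)
        k3 = f(t + step / 2, h + step / 2 * k2)
        k4 = f(t + step, h + step * k3)
        h = h + step / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += step
        nfe += 4  # RK4 evaluates the vector field four times per step
    return h, nfe

def toy_field(t, h):
    # Hypothetical bounded vector field standing in for a learned network.
    return np.tanh(-h)

T = 1.0  # assumed base integration horizon
for D in (1, 5, 20):
    _, nfe = rk4_integrate(toy_field, [1.0, -0.5], t_end=D * T)
    print(D, nfe)  # prints: 1 40, then 5 200, then 20 800
```

With a fixed step, the number of function evaluations grows linearly with the horizon `D * T`, which is the sense in which a longer integration interval makes the model "deeper" without touching the weights.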

Abstract

Neural CDEs provide a natural way to process the temporal evolution of irregular time series. The number of function evaluations (NFE) is these systems' natural analog of depth (the number of layers in traditional neural networks). It is usually regulated via solver error tolerance: lower tolerance means higher numerical precision, requiring more integration steps. However, lowering tolerances does not adequately increase the models' expressiveness. We propose a simple yet effective alternative: scaling the integration time horizon to increase NFEs and "deepen" the model. Increasing the integration interval causes uncontrollable growth in conventional vector fields, so we also propose a way to stabilize the dynamics via Negative Feedback (NF). It ensures provable stability without constraining flexibility. It also implies robustness: we provide theoretical bounds for Neural ODE risk using Gaussian process theory. Experiments on four open datasets demonstrate that our method, DeNOTS, outperforms existing approaches, including recent Neural RDEs and state space models, achieving up to $20\%$ improvement in metrics. DeNOTS combines expressiveness, stability, and robustness, enabling reliable modelling in continuous-time domains.
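The growth problem on long horizons, and the effect of a feedback term, can be seen in a toy example. This is a minimal sketch under assumptions: it uses a generic damping term `-h` added to a tanh vector field, not the paper's Anti-NF formulation (which this summary does not spell out), and `W`, `b`, `field_no_nf`, `field_with_nf`, and `euler_final_state` are hypothetical names introduced for illustration.

```python
import numpy as np

# Hypothetical fixed parameters standing in for learned weights.
W = 0.5 * np.eye(4)
b = np.ones(4)

def field_no_nf(h):
    # Bounded vector field without feedback: the state can drift indefinitely.
    return np.tanh(W @ h + b)

def field_with_nf(h):
    # Same field with a generic negative-feedback (damping) term -h.
    return np.tanh(W @ h + b) - h

def euler_final_state(field, horizon, step=0.05, dim=4):
    """Explicit Euler integration from h(0) = 0 up to t = horizon."""
    h = np.zeros(dim)
    for _ in range(int(round(horizon / step))):
        h = h + step * field(h)
    return h

for horizon in (1.0, 10.0, 100.0):
    no_nf = np.abs(euler_final_state(field_no_nf, horizon)).max()
    with_nf = np.abs(euler_final_state(field_with_nf, horizon)).max()
    print(f"horizon={horizon:6.1f}  max|h| without NF={no_nf:8.2f}  with NF={with_nf:5.2f}")
```

In this toy setting the state without feedback keeps growing roughly in proportion to the horizon, while the damped version settles near a fixed point with magnitude below 1. How the actual method shapes its feedback so that it stabilizes the dynamics without causing forgetting is specific to the paper and is not reproduced here.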


Paper Structure (70 sections, 18 theorems, 154 equations, 17 figures, 14 tables, 1 algorithm)


Introduction
Method
Representation learning for non-uniform time series.
Neural CDEs.
Our DeNOTS approach.
Time Scaling
Differentially definable mappings from trajectories $\mathcal{F}$.
Start-corrected differentially-definable mappings.
Lipschitz constraints.
Negative Feedback
No NF.
Sync-NF.
Anti-NF (DeNOTS, ours).
Stability
Error Bounds
...and 55 more sections

Key Result

Theorem 3.1

The classes $\mathring{\mathcal{F}}_g(M_x, M_h)$ and $\mathring{\mathcal{F}}_F(L_x(\cdot), L_h(\cdot))$ are equal, given the following relations between their arguments:

Figures (17)

Figure 1: $R^2$ vs. log-NFEs on the Pendulum dataset for two ways of increasing NFEs (-T: lowering tolerance; -S: increasing time scale) and for several vector fields (VF): Tanh, an MLP with $\tanh$ activation; No NF, the vanilla GRU VF; Sync NF, the GRU-ODE VF; Anti NF, our version. The curves were drawn via Radial Basis Function interpolation.
Figure 2: Scheme of the proposed DeNOTS method. The large red minus sign ($\boldsymbol{\textcolor{red}{-}}$) represents our NF.
Figure 3: AUROC on the synthetic Bump dataset for two vector fields: Tanh-activated and ReLU-activated MLPs. The columns represent different ways to increase NFE: "Default" (no modifications), "Tolerance" (lowered tolerances), and "Scale" (increased time scale).
Figure 4: $l_2$ norms of weights vs time scale $D$ for the $\tanh$ vector field on the Pendulum dataset.
Figure 5: Trajectories for various vector fields on the Pendulum dataset, with $D = 20$. The corresponding test-set $R^2$ values are given in the title above each plot.
...and 12 more figures

Theorems & Definitions (30)

Theorem 3.1: Expressiveness
Definition 4.3
Theorem 4.4
Theorem 4.5
Lemma 4.6
Corollary 4.7
Theorem 4.8
Theorem \ref{th:time_constraints}
Proof
Definition B.1
...and 20 more
