Dissecting Neural ODEs
Stefano Massaroli, Michael Poli, Jinkyoo Park, Atsushi Yamashita, Hajime Asama
TL;DR
This work provides a system-theoretic formulation of Neural ODEs, clarifying how depth-variance, augmentation, and training interact in continuous-depth models. It introduces two parameter-efficient depth-variant architectures (GalNODE and Stacked NODEs) and extends augmentation with input-layer and higher-order schemes, showing improved performance and efficiency. Moving beyond augmentation, the authors present data control and adaptive depth as paradigms for learning complex maps and task-specific computation budgets, supported by theoretical results and practical experiments. Together, these contributions deepen the understanding of continuous-depth models and extend their applicability to tasks that require flexible depth and conditioning on the input data.
Abstract
Continuous deep learning architectures have recently re-emerged as Neural Ordinary Differential Equations (Neural ODEs). This infinite-depth approach theoretically bridges the gap between deep learning and dynamical systems, offering a novel perspective. However, deciphering the inner workings of these models is still an open challenge, as most applications use them as generic black-box modules. In this work we "open the box", further developing the continuous-depth formulation with the aim of clarifying the influence of several design choices on the underlying dynamics.
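To make the continuous-depth idea concrete, the sketch below integrates a small vector field with forward Euler steps. It is an illustration under stated assumptions, not the authors' implementation: the names `vector_field` and `neural_ode_forward`, the single tanh layer, the parameter layout `(W, U, b, c)`, and the fixed-step Euler solver are all hypothetical choices made for this example. The depth term `c * s` stands in for depth-variance, the appended zeros for augmentation, and the dependence on `x` for data control.

```python
import math

def vector_field(s, z, x, params):
    """Hypothetical vector field f(s, z, x) built from one tanh layer.

    It depends on the depth variable s (depth-variance) and on the
    raw input x (data control). The parameter layout is an assumption.
    """
    W, U, b, c = params
    return [
        math.tanh(
            sum(W[i][j] * z[j] for j in range(len(z)))
            + sum(U[i][j] * x[j] for j in range(len(x)))
            + b[i]
            + c[i] * s
        )
        for i in range(len(z))
    ]

def neural_ode_forward(x, params, n_aug=1, depth=1.0, steps=100):
    """Integrate dz/ds = f(s, z, x) from s = 0 to s = depth with forward Euler."""
    z = list(x) + [0.0] * n_aug  # augmentation: pad the state with zeros
    h = depth / steps            # fixed step size (assumed solver choice)
    for k in range(steps):
        s = k * h
        dz = vector_field(s, z, x, params)
        z = [zi + h * dzi for zi, dzi in zip(z, dz)]
    return z
```

With all parameters set to zero the vector field vanishes and the output is just the augmented input, which is a convenient sanity check before any training code is added. In practice an adaptive solver and a learned vector field would replace the fixed-step loop and hand-set parameters used here.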
